This project has retired. For details please refer to its Attic page.
Index (Apache Crunch 0.15.0 API)
A B C D E F G H I J K L M N O P Q R S T U V W X Z 

A

AbstractCompositeExtractor<T> - Class in org.apache.crunch.contrib.text
Base class for Extractor instances that delegates the parsing of fields to other Extractor instances, primarily used for constructing composite records that implement the Tuple interface.
AbstractCompositeExtractor(TokenizerFactory, List<Extractor<?>>) - Constructor for class org.apache.crunch.contrib.text.AbstractCompositeExtractor
 
AbstractOffsetReader - Class in org.apache.crunch.kafka.offset
Base implementation of OffsetReader.
AbstractOffsetReader() - Constructor for class org.apache.crunch.kafka.offset.AbstractOffsetReader
 
AbstractOffsetWriter - Class in org.apache.crunch.kafka.offset
Base implementation of OffsetWriter.
AbstractOffsetWriter() - Constructor for class org.apache.crunch.kafka.offset.AbstractOffsetWriter
 
AbstractSimpleExtractor<T> - Class in org.apache.crunch.contrib.text
Base class for the common case Extractor instances that construct a single object from a block of text stored in a String, with support for error handling and reporting.
accept(T) - Method in class org.apache.crunch.FilterFn
If true, emit the given record.
accept(PCollectionImpl.Visitor) - Method in class org.apache.crunch.impl.dist.collect.PCollectionImpl
 
accept(OutputHandler, PType<?>) - Method in interface org.apache.crunch.Target
Checks to see if this Target instance is compatible with the given PType.
ACCEPT_ALL() - Static method in class org.apache.crunch.fn.FilterFns
Accept everything.
addAccumulator(Map<String, Map<String, Long>>, Map<String, Map<String, Long>>) - Method in class org.apache.crunch.impl.spark.CounterAccumulatorParam
 
addCacheFile(Path, Configuration) - Static method in class org.apache.crunch.util.DistCache
 
addCompletionHook(CrunchControlledJob.Hook) - Method in class org.apache.crunch.impl.mr.MRPipeline
 
addInPlace(Map<String, Map<String, Long>>, Map<String, Map<String, Long>>) - Method in class org.apache.crunch.impl.spark.CounterAccumulatorParam
 
addInputPath(Job, Path, FormatBundle, int) - Static method in class org.apache.crunch.io.CrunchInputs
 
addInputPaths(Job, Collection<Path>, FormatBundle, int) - Static method in class org.apache.crunch.io.CrunchInputs
 
addJarDirToDistributedCache(Configuration, File) - Static method in class org.apache.crunch.util.DistCache
Adds all jars under the specified directory to the distributed cache of jobs using the provided configuration.
addJarDirToDistributedCache(Configuration, String) - Static method in class org.apache.crunch.util.DistCache
Adds all jars under the directory at the specified path to the distributed cache of jobs using the provided configuration.
addJarToDistributedCache(Configuration, File) - Static method in class org.apache.crunch.util.DistCache
Adds the specified jar to the distributed cache of jobs using the provided configuration.
addJarToDistributedCache(Configuration, String) - Static method in class org.apache.crunch.util.DistCache
Adds the jar at the specified path to the distributed cache of jobs using the provided configuration.
addKafkaConnectionProperties(Properties, Configuration) - Static method in class org.apache.crunch.kafka.KafkaUtils
Adds the properties to the provided config instance.
addNamedOutput(Job, String, Class<? extends OutputFormat>, Class, Class) - Static method in class org.apache.crunch.io.CrunchOutputs
 
addNamedOutput(Job, String, FormatBundle<? extends OutputFormat>, Class, Class) - Static method in class org.apache.crunch.io.CrunchOutputs
 
addPrepareHook(CrunchControlledJob.Hook) - Method in class org.apache.crunch.impl.mr.MRPipeline
 
age - Variable in class org.apache.crunch.test.Person
Deprecated.
aggregate(Aggregator<S>) - Method in class org.apache.crunch.impl.dist.collect.PCollectionImpl
 
Aggregate - Class in org.apache.crunch.lib
Methods for performing various types of aggregations over PCollection instances.
Aggregate() - Constructor for class org.apache.crunch.lib.Aggregate
 
aggregate(PCollection<S>, Aggregator<S>) - Static method in class org.apache.crunch.lib.Aggregate
 
aggregate(Aggregator<S>) - Method in interface org.apache.crunch.PCollection
Returns a PCollection that contains the result of aggregating all values in this instance.
Aggregate.PairValueComparator<K,V> - Class in org.apache.crunch.lib
 
Aggregate.TopKCombineFn<K,V> - Class in org.apache.crunch.lib
 
Aggregate.TopKFn<K,V> - Class in org.apache.crunch.lib
 
Aggregator<T> - Interface in org.apache.crunch
Aggregate a sequence of values into a possibly smaller sequence of the same type.
Aggregators - Class in org.apache.crunch.fn
A collection of pre-defined Aggregators.
Aggregators.SimpleAggregator<T> - Class in org.apache.crunch.fn
Base class for aggregators that do not require any initialization.
and(FilterFn<S>, FilterFn<S>) - Static method in class org.apache.crunch.fn.FilterFns
Accept an entry if all of the given filters accept it, using short-circuit evaluation.
and(FilterFn<S>...) - Static method in class org.apache.crunch.fn.FilterFns
Accept an entry if all of the given filters accept it, using short-circuit evaluation.
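The short-circuit behaviour described for FilterFns.and can be illustrated without Crunch on the classpath. The following is a plain-JDK sketch, not Crunch's implementation: ShortCircuitAndSketch and its and helper are hypothetical names, and java.util.function.Predicate stands in for FilterFn.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Predicate;

// Hypothetical stand-in for FilterFns.and; Predicate plays the role of FilterFn.
final class ShortCircuitAndSketch {

    // Accepts a value only if every filter accepts it, stopping at the first rejection.
    @SafeVarargs
    static <S> Predicate<S> and(Predicate<S>... filters) {
        return value -> {
            for (Predicate<S> filter : filters) {
                if (!filter.test(value)) {
                    return false; // later filters are never evaluated
                }
            }
            return true;
        };
    }

    public static void main(String[] args) {
        AtomicInteger evenChecks = new AtomicInteger();
        Predicate<Integer> positive = n -> n > 0;
        Predicate<Integer> even = n -> {
            evenChecks.incrementAndGet();
            return n % 2 == 0;
        };

        Predicate<Integer> positiveAndEven = ShortCircuitAndSketch.<Integer>and(positive, even);
        System.out.println(positiveAndEven.test(4));  // true
        System.out.println(positiveAndEven.test(-2)); // false, rejected by the first filter
        System.out.println(evenChecks.get());         // 1: the second filter ran only for 4
    }
}
```

Ordering the cheapest or most selective filter first lets the short circuit skip the more expensive checks for most records.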
apply(Statement, Description) - Method in class org.apache.crunch.test.TemporaryPath
 
applyPTypeTransforms() - Method in interface org.apache.crunch.types.Converter
If true, convert the inputs or outputs from this Converter instance before (for outputs) or after (for inputs) using the associated PType#getInputMapFn and PType#getOutputMapFn calls.
as(PType<T>) - Method in class org.apache.crunch.types.avro.AvroTypeFamily
 
as(PType<T>) - Method in interface org.apache.crunch.types.PTypeFamily
Returns the equivalent of the given ptype for this family, if it exists.
as(PType<T>) - Method in class org.apache.crunch.types.writable.WritableTypeFamily
 
asCollection() - Method in class org.apache.crunch.impl.dist.collect.PCollectionImpl
asCollection() - Method in interface org.apache.crunch.PCollection
 
asMap() - Method in class org.apache.crunch.impl.dist.collect.PTableBase
Returns a PObject encapsulating a Map made up of the keys and values in this PTable.
asMap() - Method in interface org.apache.crunch.PTable
Returns a PObject encapsulating a Map made up of the keys and values in this PTable.
asPTable(PCollection<Pair<K, V>>) - Static method in class org.apache.crunch.lib.PTables
Convert the given PCollection<Pair<K, V>> to a PTable<K, V>.
asReadable(boolean) - Method in class org.apache.crunch.impl.dist.collect.PCollectionImpl
 
asReadable() - Method in interface org.apache.crunch.io.ReadableSource
 
asReadable() - Method in class org.apache.crunch.kafka.KafkaSource
 
asReadable(boolean) - Method in interface org.apache.crunch.PCollection
 
asSourceTarget(PType<T>) - Method in interface org.apache.crunch.Target
Attempt to create the SourceTarget type that corresponds to this Target for the given PType, if possible.
At - Class in org.apache.crunch.io
Static factory methods for creating common SourceTarget types, which may be treated as both a Source and a Target.
At() - Constructor for class org.apache.crunch.io.At
 
Average - Class in org.apache.crunch.lib
 
Average() - Constructor for class org.apache.crunch.lib.Average
 
AverageBytesByIP - Class in org.apache.crunch.examples
 
AverageBytesByIP() - Constructor for class org.apache.crunch.examples.AverageBytesByIP
 
AVRO_MODE_PROPERTY - Static variable in class org.apache.crunch.types.avro.AvroMode
 
AVRO_SHUFFLE_MODE_PROPERTY - Static variable in class org.apache.crunch.types.avro.AvroMode
 
AvroDerivedValueDeepCopier<T,S> - Class in org.apache.crunch.types.avro
A DeepCopier specific to Avro derived types.
AvroDerivedValueDeepCopier(MapFn<T, S>, MapFn<S, T>, AvroType<S>) - Constructor for class org.apache.crunch.types.avro.AvroDerivedValueDeepCopier
 
avroFile(String, Class<T>) - Static method in class org.apache.crunch.io.At
Creates a SourceTarget<T> instance from the Avro file(s) at the given path name.
avroFile(Path, Class<T>) - Static method in class org.apache.crunch.io.At
Creates a SourceTarget<T> instance from the Avro file(s) at the given Path.
avroFile(String) - Static method in class org.apache.crunch.io.At
Creates a SourceTarget<GenericData.Record> by reading the schema of the Avro file at the given path.
avroFile(Path) - Static method in class org.apache.crunch.io.At
Creates a SourceTarget<GenericData.Record> by reading the schema of the Avro file at the given path.
avroFile(Path, Configuration) - Static method in class org.apache.crunch.io.At
Creates a SourceTarget<GenericData.Record> by reading the schema of the Avro file at the given path using the FileSystem information contained in the given Configuration instance.
avroFile(String, PType<T>) - Static method in class org.apache.crunch.io.At
Creates a SourceTarget<T> instance from the Avro file(s) at the given path name.
avroFile(Path, PType<T>) - Static method in class org.apache.crunch.io.At
Creates a SourceTarget<T> instance from the Avro file(s) at the given Path.
avroFile(String, Class<T>) - Static method in class org.apache.crunch.io.From
Creates a Source<T> instance from the Avro file(s) at the given path name.
avroFile(Path, Class<T>) - Static method in class org.apache.crunch.io.From
Creates a Source<T> instance from the Avro file(s) at the given Path.
avroFile(List<Path>, Class<T>) - Static method in class org.apache.crunch.io.From
Creates a Source<T> instance from the Avro file(s) at the given Paths.
avroFile(String, PType<T>) - Static method in class org.apache.crunch.io.From
Creates a Source<T> instance from the Avro file(s) at the given path name.
avroFile(Path, PType<T>) - Static method in class org.apache.crunch.io.From
Creates a Source<T> instance from the Avro file(s) at the given Path.
avroFile(List<Path>, PType<T>) - Static method in class org.apache.crunch.io.From
Creates a Source<T> instance from the Avro file(s) at the given Paths.
avroFile(String) - Static method in class org.apache.crunch.io.From
Creates a Source<GenericData.Record> by reading the schema of the Avro file at the given path.
avroFile(Path) - Static method in class org.apache.crunch.io.From
Creates a Source<GenericData.Record> by reading the schema of the Avro file at the given path.
avroFile(List<Path>) - Static method in class org.apache.crunch.io.From
Creates a Source<GenericData.Record> by reading the schema of the Avro file at the given paths.
avroFile(Path, Configuration) - Static method in class org.apache.crunch.io.From
Creates a Source<GenericData.Record> by reading the schema of the Avro file at the given path using the FileSystem information contained in the given Configuration instance.
avroFile(List<Path>, Configuration) - Static method in class org.apache.crunch.io.From
Creates a Source<GenericData.Record> by reading the schema of the Avro file at the given paths using the FileSystem information contained in the given Configuration instance.
avroFile(String) - Static method in class org.apache.crunch.io.To
Creates a Target at the given path name that writes data to Avro files.
avroFile(Path) - Static method in class org.apache.crunch.io.To
Creates a Target at the given Path that writes data to Avro files.
AvroGenericFn(int[], Schema) - Constructor for class org.apache.crunch.lib.sort.SortFns.AvroGenericFn
 
AvroIndexedRecordPartitioner() - Constructor for class org.apache.crunch.lib.join.JoinUtils.AvroIndexedRecordPartitioner
 
AvroInputFormat<T> - Class in org.apache.crunch.types.avro
An InputFormat for Avro data files.
AvroInputFormat() - Constructor for class org.apache.crunch.types.avro.AvroInputFormat
 
AvroMode - Class in org.apache.crunch.types.avro
AvroMode is an immutable object used for configuring the reading and writing of Avro types.
AvroMode.ModeType - Enum in org.apache.crunch.types.avro
Internal enum which represents the various Avro data types.
AvroOutputFormat<T> - Class in org.apache.crunch.types.avro
An OutputFormat for Avro data files.
AvroOutputFormat() - Constructor for class org.apache.crunch.types.avro.AvroOutputFormat
 
AvroPairGroupingComparator() - Constructor for class org.apache.crunch.lib.join.JoinUtils.AvroPairGroupingComparator
 
AvroPathPerKeyOutputFormat<T> - Class in org.apache.crunch.types.avro
A FileOutputFormat that takes in a Utf8 and an Avro record and writes the Avro records to a sub-directory of the output path whose name is equal to the string-form of the Utf8.
AvroPathPerKeyOutputFormat() - Constructor for class org.apache.crunch.types.avro.AvroPathPerKeyOutputFormat
 
Avros - Class in org.apache.crunch.types.avro
Defines static methods that are analogous to the methods defined in AvroTypeFamily for convenient static importing.
AvroSerDe<T> - Class in org.apache.crunch.impl.spark.serde
 
AvroSerDe(AvroType<T>, Map<String, String>) - Constructor for class org.apache.crunch.impl.spark.serde.AvroSerDe
 
avroTableFile(Path, PTableType<K, V>) - Static method in class org.apache.crunch.io.From
Creates a TableSource<K,V> for reading an Avro key/value file at the given path.
avroTableFile(List<Path>, PTableType<K, V>) - Static method in class org.apache.crunch.io.From
Creates a TableSource<K,V> for reading an Avro key/value file at the given paths.
AvroTextOutputFormat<K,V> - Class in org.apache.crunch.types.avro
 
AvroTextOutputFormat() - Constructor for class org.apache.crunch.types.avro.AvroTextOutputFormat
 
AvroType<T> - Class in org.apache.crunch.types.avro
The implementation of the PType interface for Avro-based serialization.
AvroType(Class<T>, Schema, DeepCopier<T>, PType...) - Constructor for class org.apache.crunch.types.avro.AvroType
 
AvroType(Class<T>, Schema, MapFn, MapFn, DeepCopier<T>, AvroType.AvroRecordType, PType...) - Constructor for class org.apache.crunch.types.avro.AvroType
 
AvroType.AvroRecordType - Enum in org.apache.crunch.types.avro
 
AvroTypeFamily - Class in org.apache.crunch.types.avro
 
AvroUtf8InputFormat - Class in org.apache.crunch.types.avro
An InputFormat for text files.
AvroUtf8InputFormat() - Constructor for class org.apache.crunch.types.avro.AvroUtf8InputFormat
 

B

BaseDoCollection<S> - Class in org.apache.crunch.impl.dist.collect
 
BaseDoTable<K,V> - Class in org.apache.crunch.impl.dist.collect
 
BaseGroupedTable<K,V> - Class in org.apache.crunch.impl.dist.collect
 
BaseInputCollection<S> - Class in org.apache.crunch.impl.dist.collect
 
BaseInputCollection(Source<S>, DistributedPipeline) - Constructor for class org.apache.crunch.impl.dist.collect.BaseInputCollection
 
BaseInputCollection(Source<S>, String, DistributedPipeline, ParallelDoOptions) - Constructor for class org.apache.crunch.impl.dist.collect.BaseInputCollection
 
BaseInputTable<K,V> - Class in org.apache.crunch.impl.dist.collect
 
BaseInputTable(TableSource<K, V>, DistributedPipeline) - Constructor for class org.apache.crunch.impl.dist.collect.BaseInputTable
 
BaseInputTable(TableSource<K, V>, String, DistributedPipeline, ParallelDoOptions) - Constructor for class org.apache.crunch.impl.dist.collect.BaseInputTable
 
BaseUnionCollection<S> - Class in org.apache.crunch.impl.dist.collect
 
BaseUnionTable<K,V> - Class in org.apache.crunch.impl.dist.collect
 
bigDecimal(PTypeFamily) - Static method in class org.apache.crunch.types.PTypes
A PType for Java's BigDecimal type.
BIGDECIMAL_TO_BYTE - Static variable in class org.apache.crunch.types.PTypes
 
bigInt(PTypeFamily) - Static method in class org.apache.crunch.types.PTypes
A PType for Java's BigInteger type.
BIGINT_TO_BYTE - Static variable in class org.apache.crunch.types.PTypes
 
BinarySearchNode(K[], RawComparator<K>) - Constructor for class org.apache.crunch.lib.sort.TotalOrderPartitioner.BinarySearchNode
 
BloomFilterFactory - Class in org.apache.crunch.contrib.bloomfilter
Factory class for creating BloomFilters.
BloomFilterFactory() - Constructor for class org.apache.crunch.contrib.bloomfilter.BloomFilterFactory
 
BloomFilterFn<S> - Class in org.apache.crunch.contrib.bloomfilter
This class is responsible for generating the keys that are used in a BloomFilter.
BloomFilterFn() - Constructor for class org.apache.crunch.contrib.bloomfilter.BloomFilterFn
 
BloomFilterJoinStrategy<K,U,V> - Class in org.apache.crunch.lib.join
Join strategy that uses a Bloom filter that is trained on the keys of the left-side table to filter the key/value pairs of the right-side table before sending through the shuffle and reduce phase.
BloomFilterJoinStrategy(int) - Constructor for class org.apache.crunch.lib.join.BloomFilterJoinStrategy
Instantiate with the expected number of unique keys in the left table.
BloomFilterJoinStrategy(int, float) - Constructor for class org.apache.crunch.lib.join.BloomFilterJoinStrategy
Instantiate with the expected number of unique keys in the left table, and the acceptable false positive rate for the Bloom filter.
BloomFilterJoinStrategy(int, float, JoinStrategy<K, U, V>) - Constructor for class org.apache.crunch.lib.join.BloomFilterJoinStrategy
Instantiate with the expected number of unique keys in the left table, the acceptable false positive rate for the Bloom filter, and an underlying join strategy to delegate to.
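The idea behind BloomFilterJoinStrategy (train a Bloom filter on the left-side keys, then drop right-side pairs that cannot match before the expensive join work) can be sketched with the JDK alone. Everything below is an illustrative assumption: BloomPrefilterSketch, its two toy hash functions, and the fixed bit count are hypothetical and do not reflect how Crunch sizes or hashes its filter.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.BitSet;
import java.util.List;
import java.util.Map;

// Hypothetical plain-JDK sketch; not the BloomFilterJoinStrategy implementation.
final class BloomPrefilterSketch {
    private final BitSet bits;
    private final int size;

    BloomPrefilterSketch(int size) {
        this.size = size;
        this.bits = new BitSet(size);
    }

    // Two cheap bit positions derived from hashCode(); real filters use stronger hashes.
    private int first(String key) {
        return Math.floorMod(key.hashCode(), size);
    }

    private int second(String key) {
        return Math.floorMod(key.hashCode() * 31 + 17, size);
    }

    void train(String key) {
        bits.set(first(key));
        bits.set(second(key));
    }

    // May return true for an unseen key (false positive), never false for a trained key.
    boolean mightContain(String key) {
        return bits.get(first(key)) && bits.get(second(key));
    }

    // Drops right-side pairs whose key cannot be in the left table.
    static List<Map.Entry<String, Integer>> prefilter(
            List<String> leftKeys, List<Map.Entry<String, Integer>> right, int size) {
        BloomPrefilterSketch filter = new BloomPrefilterSketch(size);
        for (String key : leftKeys) {
            filter.train(key);
        }
        List<Map.Entry<String, Integer>> kept = new ArrayList<>();
        for (Map.Entry<String, Integer> pair : right) {
            if (filter.mightContain(pair.getKey())) {
                kept.add(pair);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<String> leftKeys = Arrays.asList("a", "b");
        List<Map.Entry<String, Integer>> right = new ArrayList<>();
        right.add(new SimpleEntry<>("a", 1));
        right.add(new SimpleEntry<>("z", 2));
        right.add(new SimpleEntry<>("b", 3));
        // Pairs keyed "a" and "b" always survive; "z" could survive only on a false positive.
        System.out.println(prefilter(leftKeys, right, 1024));
    }
}
```

Because a Bloom filter has no false negatives, the prefilter never removes a pair that the join would have matched; a false positive only costs some wasted work later.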
booleans() - Static method in class org.apache.crunch.types.avro.Avros
 
booleans() - Method in class org.apache.crunch.types.avro.AvroTypeFamily
 
booleans() - Method in interface org.apache.crunch.types.PTypeFamily
 
booleans() - Static method in class org.apache.crunch.types.writable.Writables
 
booleans() - Method in class org.apache.crunch.types.writable.WritableTypeFamily
 
bottom(int) - Method in class org.apache.crunch.impl.dist.collect.PTableBase
 
bottom(int) - Method in interface org.apache.crunch.PTable
Returns a PTable made up of the pairs in this PTable with the smallest value field.
build() - Method in class org.apache.crunch.CachingOptions.Builder
 
build() - Method in class org.apache.crunch.contrib.text.TokenizerFactory.Builder
Returns a new TokenizerFactory with settings determined by this Builder instance.
build() - Method in class org.apache.crunch.GroupingOptions.Builder
 
build() - Method in class org.apache.crunch.kafka.offset.hdfs.Offsets.Builder
Builds an instance.
build() - Method in class org.apache.crunch.kafka.offset.hdfs.Offsets.PartitionOffset.Builder
Builds a PartitionOffset instance.
build() - Method in class org.apache.crunch.ParallelDoOptions.Builder
 
build() - Method in class org.apache.crunch.test.Employee.Builder
 
build() - Method in class org.apache.crunch.test.Person.Builder
 
builder() - Static method in class org.apache.crunch.CachingOptions
Creates a new CachingOptions.Builder instance to use for specifying the caching options for a particular PCollection<T>.
Builder() - Constructor for class org.apache.crunch.CachingOptions.Builder
 
Builder(Class<T>) - Constructor for class org.apache.crunch.contrib.io.jdbc.DataBaseSource.Builder
 
builder() - Static method in class org.apache.crunch.contrib.text.TokenizerFactory
Factory method for creating a TokenizerFactory.Builder instance.
Builder() - Constructor for class org.apache.crunch.contrib.text.TokenizerFactory.Builder
 
builder() - Static method in class org.apache.crunch.GroupingOptions
 
Builder() - Constructor for class org.apache.crunch.GroupingOptions.Builder
 
Builder() - Constructor for class org.apache.crunch.kafka.offset.hdfs.Offsets.Builder
 
Builder() - Constructor for class org.apache.crunch.kafka.offset.hdfs.Offsets.PartitionOffset.Builder
 
builder() - Static method in class org.apache.crunch.ParallelDoOptions
 
Builder() - Constructor for class org.apache.crunch.ParallelDoOptions.Builder
 
bundle - Variable in class org.apache.crunch.io.CrunchOutputs.OutputConfig
 
by(MapFn<S, K>, PType<K>) - Method in class org.apache.crunch.impl.dist.collect.PCollectionImpl
 
by(String, MapFn<S, K>, PType<K>) - Method in class org.apache.crunch.impl.dist.collect.PCollectionImpl
 
by(SFunction<S, K>, PType<K>) - Method in interface org.apache.crunch.lambda.LCollection
Key this LCollection by a key extracted from the element to yield a LTable mapping the key to the whole element.
by(int, Sort.Order) - Static method in class org.apache.crunch.lib.Sort.ColumnOrder
 
by(MapFn<S, K>, PType<K>) - Method in interface org.apache.crunch.PCollection
Apply the given map function to each element of this instance in order to create a PTable.
by(String, MapFn<S, K>, PType<K>) - Method in interface org.apache.crunch.PCollection
Apply the given map function to each element of this instance in order to create a PTable.
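As a rough picture of what keying a collection with by produces, the sketch below pairs every element with a key extracted from it. It uses only JDK types: KeyBySketch is a hypothetical name, java.util.function.Function stands in for MapFn, and a list of Map.Entry stands in for a PTable, so this shows the shape of the result rather than the Crunch API.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical plain-JDK sketch of keying elements; not the PCollection.by implementation.
final class KeyBySketch {

    // Pairs each element with its extracted key; elements sharing a key are all kept.
    static <S, K> List<Map.Entry<K, S>> by(List<S> elements, Function<S, K> extractKey) {
        List<Map.Entry<K, S>> table = new ArrayList<>();
        for (S element : elements) {
            table.add(new SimpleEntry<>(extractKey.apply(element), element));
        }
        return table;
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("apple", "avocado", "banana");
        List<Map.Entry<Character, String>> byInitial =
                KeyBySketch.<String, Character>by(words, word -> word.charAt(0));
        System.out.println(byInitial); // [a=apple, a=avocado, b=banana]
    }
}
```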
BYTE_TO_BIGDECIMAL - Static variable in class org.apache.crunch.types.PTypes
 
BYTE_TO_BIGINT - Static variable in class org.apache.crunch.types.PTypes
 
ByteArray - Class in org.apache.crunch.impl.spark
 
ByteArray(byte[], ByteArrayHelper) - Constructor for class org.apache.crunch.impl.spark.ByteArray
 
ByteArrayHelper - Class in org.apache.crunch.impl.spark
 
ByteArrayHelper() - Constructor for class org.apache.crunch.impl.spark.ByteArrayHelper
 
bytes() - Static method in class org.apache.crunch.types.avro.Avros
 
bytes() - Method in class org.apache.crunch.types.avro.AvroTypeFamily
 
bytes() - Method in interface org.apache.crunch.types.PTypeFamily
 
bytes() - Static method in class org.apache.crunch.types.writable.Writables
 
bytes() - Method in class org.apache.crunch.types.writable.WritableTypeFamily
 
BYTES_IN - Static variable in class org.apache.crunch.types.avro.Avros
 
BYTES_PER_REDUCE_TASK - Static variable in class org.apache.crunch.util.PartitionUtils
 
BytesDeserializer() - Constructor for class org.apache.crunch.kafka.KafkaSource.BytesDeserializer
 

C

cache() - Method in class org.apache.crunch.impl.dist.collect.PCollectionImpl
 
cache(CachingOptions) - Method in class org.apache.crunch.impl.dist.collect.PCollectionImpl
 
cache() - Method in class org.apache.crunch.impl.dist.collect.PTableBase
 
cache(CachingOptions) - Method in class org.apache.crunch.impl.dist.collect.PTableBase
 
cache(PCollection<T>, CachingOptions) - Method in class org.apache.crunch.impl.mem.MemPipeline
 
cache(PCollection<T>, CachingOptions) - Method in class org.apache.crunch.impl.mr.MRPipeline
 
cache(PCollection<T>, CachingOptions) - Method in class org.apache.crunch.impl.spark.SparkPipeline
 
cache() - Method in interface org.apache.crunch.lambda.LCollection
Cache the underlying PCollection.
cache(CachingOptions) - Method in interface org.apache.crunch.lambda.LCollection
Cache the underlying PCollection.
cache() - Method in interface org.apache.crunch.PCollection
Marks this data as cached using the default CachingOptions.
cache(CachingOptions) - Method in interface org.apache.crunch.PCollection
Marks this data as cached using the given CachingOptions.
cache(PCollection<T>, CachingOptions) - Method in interface org.apache.crunch.Pipeline
Caches the given PCollection so that it will be processed at most once during pipeline execution.
cache() - Method in interface org.apache.crunch.PTable
 
cache(CachingOptions) - Method in interface org.apache.crunch.PTable
 
CachingOptions - Class in org.apache.crunch
Options for controlling how a PCollection<T> is cached for subsequent processing.
CachingOptions.Builder - Class in org.apache.crunch
A Builder class to use for setting the CachingOptions for a PCollection.
call(Tuple2<IntByteArray, List<byte[]>>) - Method in class org.apache.crunch.impl.spark.collect.ToByteArrayFunction
 
call(Iterator<Tuple2<K, V>>) - Method in class org.apache.crunch.impl.spark.fn.CombineMapsideFunction
 
call(Iterator<Pair<K, V>>) - Method in class org.apache.crunch.impl.spark.fn.CrunchPairTuple2
 
call(Integer, Iterator) - Method in class org.apache.crunch.impl.spark.fn.FlatMapIndexFn
 
call(Iterator<Tuple2<K, V>>) - Method in class org.apache.crunch.impl.spark.fn.FlatMapPairDoFn
 
call(Tuple2<K, V>) - Method in class org.apache.crunch.impl.spark.fn.InputConverterFunction
 
call(Object) - Method in class org.apache.crunch.impl.spark.fn.MapFunction
 
call(Pair<K, V>) - Method in class org.apache.crunch.impl.spark.fn.MapOutputFunction
 
call(S) - Method in class org.apache.crunch.impl.spark.fn.OutputConverterFunction
 
call(Iterator<T>) - Method in class org.apache.crunch.impl.spark.fn.PairFlatMapDoFn
 
call(Tuple2<K, V>) - Method in class org.apache.crunch.impl.spark.fn.PairMapFunction
 
call(Pair<K, List<V>>) - Method in class org.apache.crunch.impl.spark.fn.PairMapIterableFunction
 
call(Pair<K, V>) - Method in class org.apache.crunch.impl.spark.fn.PartitionedMapOutputFunction
 
call(Iterator<Tuple2<ByteArray, List<byte[]>>>) - Method in class org.apache.crunch.impl.spark.fn.ReduceGroupingFunction
 
call(Tuple2<ByteArray, Iterable<byte[]>>) - Method in class org.apache.crunch.impl.spark.fn.ReduceInputFunction
 
call(Pair<K, V>) - Method in class org.apache.crunch.impl.spark.fn.Tuple2MapFunction
 
CAN_COMBINE_SPECIFIC_AND_REFLECT_SCHEMAS - Static variable in class org.apache.crunch.types.avro.Avros
Older versions of Avro (i.e., before 1.7.0) do not support schemas that are composed of a mix of specific and reflection-based schemas.
Cartesian - Class in org.apache.crunch.lib
Utilities for Cartesian products of two PTable or PCollection instances.
Cartesian() - Constructor for class org.apache.crunch.lib.Cartesian
 
Channels - Class in org.apache.crunch.lib
Utilities for splitting Pair instances emitted by DoFn into separate PCollection instances.
Channels() - Constructor for class org.apache.crunch.lib.Channels
 
checkCombiningSpecificAndReflectionSchemas() - Static method in class org.apache.crunch.types.avro.Avros
 
checkOutputSpecs(JobContext) - Static method in class org.apache.crunch.io.CrunchOutputs
 
ClassloaderFallbackObjectInputStream - Class in org.apache.crunch.util
A custom ObjectInputStream that falls back to the thread context classloader if the class can't be found with the usual classloader that ObjectInputStream uses.
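The fallback described above can be approximated by overriding ObjectInputStream.resolveClass. This is a minimal sketch under the assumption that the fallback is a single retry against the thread context classloader; FallbackObjectInputStreamSketch and roundTrip are hypothetical names, and the actual Crunch class may differ in detail.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.ObjectStreamClass;
import java.io.Serializable;
import java.io.UncheckedIOException;

// Hypothetical sketch; not the source of ClassloaderFallbackObjectInputStream.
final class FallbackObjectInputStreamSketch extends ObjectInputStream {

    FallbackObjectInputStreamSketch(InputStream in) throws IOException {
        super(in);
    }

    @Override
    protected Class<?> resolveClass(ObjectStreamClass desc)
            throws IOException, ClassNotFoundException {
        try {
            return super.resolveClass(desc); // the usual lookup
        } catch (ClassNotFoundException notFound) {
            // Fallback: retry with the thread context classloader.
            ClassLoader contextLoader = Thread.currentThread().getContextClassLoader();
            return Class.forName(desc.getName(), false, contextLoader);
        }
    }

    // Serializes a value to bytes and reads it back through the fallback stream.
    static Object roundTrip(Serializable value) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
                out.writeObject(value);
            }
            try (ObjectInputStream in = new FallbackObjectInputStreamSketch(
                    new ByteArrayInputStream(buffer.toByteArray()))) {
                return in.readObject();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip("hello")); // hello
    }
}
```

A fallback of this kind matters mainly when the deserializing code and the serialized classes are loaded by different classloaders, as can happen with job jars shipped to a cluster.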
ClassloaderFallbackObjectInputStream(InputStream) - Constructor for class org.apache.crunch.util.ClassloaderFallbackObjectInputStream
 
cleanup(Emitter<Pair<String, BloomFilter>>) - Method in class org.apache.crunch.contrib.bloomfilter.BloomFilterFn
 
cleanup(Emitter<T>) - Method in class org.apache.crunch.DoFn
Called during the cleanup of the MapReduce job this DoFn is associated with.
cleanup(Emitter<T>) - Method in class org.apache.crunch.FilterFn
 
cleanup() - Method in class org.apache.crunch.FilterFn
Called during the cleanup of the MapReduce job this FilterFn is associated with.
cleanup(Emitter<T>) - Method in class org.apache.crunch.fn.CompositeMapFn
 
cleanup(Emitter<Pair<S, T>>) - Method in class org.apache.crunch.fn.PairMapFn
 
cleanup(boolean) - Method in class org.apache.crunch.impl.dist.DistributedPipeline
 
cleanup(boolean) - Method in class org.apache.crunch.impl.mem.MemPipeline
 
cleanup(Emitter<Pair<Integer, Pair<K, V>>>) - Method in class org.apache.crunch.lib.Aggregate.TopKFn
 
cleanup(Emitter<Pair<K, Pair<U, V>>>) - Method in class org.apache.crunch.lib.join.FullOuterJoinFn
Called during the cleanup of the MapReduce job this DoFn is associated with.
cleanup(Emitter<Pair<K, Pair<U, V>>>) - Method in class org.apache.crunch.lib.join.LeftOuterJoinFn
Called during the cleanup of the MapReduce job this DoFn is associated with.
cleanup(boolean) - Method in interface org.apache.crunch.Pipeline
Cleans up any artifacts created as a result of running the pipeline.
clear() - Method in class org.apache.crunch.types.writable.TupleWritable
 
clearAge() - Method in class org.apache.crunch.test.Person.Builder
Clears the value of the 'age' field.
clearCounters() - Static method in class org.apache.crunch.impl.mem.MemPipeline
 
clearCounters() - Static method in class org.apache.crunch.test.TestCounters
 
clearDepartment() - Method in class org.apache.crunch.test.Employee.Builder
Clears the value of the 'department' field.
clearName() - Method in class org.apache.crunch.test.Employee.Builder
Clears the value of the 'name' field.
clearName() - Method in class org.apache.crunch.test.Person.Builder
Clears the value of the 'name' field.
clearSalary() - Method in class org.apache.crunch.test.Employee.Builder
Clears the value of the 'salary' field.
clearSiblingnames() - Method in class org.apache.crunch.test.Person.Builder
Clears the value of the 'siblingnames' field.
close() - Method in class org.apache.crunch.io.CrunchOutputs
 
close() - Method in class org.apache.crunch.kafka.inputformat.KafkaRecordReader
 
close() - Method in class org.apache.crunch.kafka.KafkaSource.BytesDeserializer
 
close() - Method in class org.apache.crunch.kafka.offset.hdfs.HDFSOffsetReader
 
close() - Method in class org.apache.crunch.kafka.offset.hdfs.HDFSOffsetWriter
 
cogroup(PTable<K, U>) - Method in class org.apache.crunch.impl.dist.collect.PTableBase
 
cogroup(LTable<K, U>) - Method in interface org.apache.crunch.lambda.LTable
Cogroup this table with another LTable with the same key type, collecting the set of values from each side.
Cogroup - Class in org.apache.crunch.lib
 
Cogroup() - Constructor for class org.apache.crunch.lib.Cogroup
 
cogroup(PTable<K, U>, PTable<K, V>) - Static method in class org.apache.crunch.lib.Cogroup
Co-groups the two PTable arguments.
cogroup(int, PTable<K, U>, PTable<K, V>) - Static method in class org.apache.crunch.lib.Cogroup
Co-groups the two PTable arguments with a user-specified degree of parallelism (a.k.a. the number of reducers).
cogroup(PTable<K, V1>, PTable<K, V2>, PTable<K, V3>) - Static method in class org.apache.crunch.lib.Cogroup
Co-groups the three PTable arguments.
cogroup(int, PTable<K, V1>, PTable<K, V2>, PTable<K, V3>) - Static method in class org.apache.crunch.lib.Cogroup
Co-groups the three PTable arguments with a user-specified degree of parallelism (a.k.a. the number of reducers).
cogroup(PTable<K, V1>, PTable<K, V2>, PTable<K, V3>, PTable<K, V4>) - Static method in class org.apache.crunch.lib.Cogroup
Co-groups the four PTable arguments.
cogroup(int, PTable<K, V1>, PTable<K, V2>, PTable<K, V3>, PTable<K, V4>) - Static method in class org.apache.crunch.lib.Cogroup
Co-groups the four PTable arguments with a user-specified degree of parallelism (a.k.a. the number of reducers).
cogroup(PTable<K, ?>, PTable<K, ?>...) - Static method in class org.apache.crunch.lib.Cogroup
Co-groups an arbitrary number of PTable arguments.
cogroup(int, PTable<K, ?>, PTable<K, ?>...) - Static method in class org.apache.crunch.lib.Cogroup
Co-groups an arbitrary number of PTable arguments with a user-specified degree of parallelism (a.k.a. the number of reducers). The largest table should come last in the ordering.
cogroup(PTable<K, U>) - Method in interface org.apache.crunch.PTable
Co-group operation with the given table.
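For readers unfamiliar with co-grouping, the two-table case can be pictured as follows: for every key that appears on either side, collect the values from each side separately. The sketch below uses only JDK collections. CogroupSketch, Sides, and pair are hypothetical names, and an in-memory TreeMap stands in for the distributed PTable result.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical in-memory sketch of a two-way cogroup; not the Crunch implementation.
final class CogroupSketch {

    // Values collected from each side for a single key.
    static final class Sides<U, V> {
        final List<U> left = new ArrayList<>();
        final List<V> right = new ArrayList<>();

        @Override
        public String toString() {
            return "(" + left + ", " + right + ")";
        }
    }

    static <K, V> Map.Entry<K, V> pair(K key, V value) {
        return new SimpleEntry<>(key, value);
    }

    // A key present on only one side still appears, with an empty list for the other side.
    static <K extends Comparable<K>, U, V> Map<K, Sides<U, V>> cogroup(
            List<Map.Entry<K, U>> left, List<Map.Entry<K, V>> right) {
        Map<K, Sides<U, V>> grouped = new TreeMap<>();
        for (Map.Entry<K, U> entry : left) {
            grouped.computeIfAbsent(entry.getKey(), k -> new Sides<U, V>())
                    .left.add(entry.getValue());
        }
        for (Map.Entry<K, V> entry : right) {
            grouped.computeIfAbsent(entry.getKey(), k -> new Sides<U, V>())
                    .right.add(entry.getValue());
        }
        return grouped;
    }

    public static void main(String[] args) {
        List<Map.Entry<String, Integer>> left =
                Arrays.asList(pair("a", 1), pair("b", 2), pair("a", 3));
        List<Map.Entry<String, String>> right =
                Arrays.asList(pair("a", "x"), pair("c", "y"));
        System.out.println(cogroup(left, right));
        // {a=([1, 3], [x]), b=([2], []), c=([], [y])}
    }
}
```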
Collect(Collection<V1>, Collection<V2>, Collection<V3>) - Constructor for class org.apache.crunch.Tuple3.Collect
 
Collect(Collection<V1>, Collection<V2>, Collection<V3>, Collection<V4>) - Constructor for class org.apache.crunch.Tuple4.Collect
 
collectAllValues() - Method in interface org.apache.crunch.lambda.LGroupedTable
Collect all values for each key into a Collection.
CollectionDeepCopier<T> - Class in org.apache.crunch.types
Performs deep copies (based on underlying PType deep copying) of Collections.
CollectionDeepCopier(PType<T>) - Constructor for class org.apache.crunch.types.CollectionDeepCopier
 
collectionOf(T...) - Static method in class org.apache.crunch.impl.mem.MemPipeline
 
collectionOf(Iterable<T>) - Static method in class org.apache.crunch.impl.mem.MemPipeline
 
collections(PType<T>) - Static method in class org.apache.crunch.types.avro.Avros
 
collections(PType<T>) - Method in class org.apache.crunch.types.avro.AvroTypeFamily
 
collections(PType<T>) - Method in interface org.apache.crunch.types.PTypeFamily
 
collections(PType<T>) - Static method in class org.apache.crunch.types.writable.Writables
 
collections(PType<T>) - Method in class org.apache.crunch.types.writable.WritableTypeFamily
 
collectUniqueValues() - Method in interface org.apache.crunch.lambda.LGroupedTable
Collect all unique values for each key into a Collection (note that the value type must have correctly defined equals() and hashCode() methods).
collectValues() - Method in class org.apache.crunch.impl.dist.collect.PTableBase
 
collectValues(SSupplier<C>, SBiConsumer<C, V>, PType<C>) - Method in interface org.apache.crunch.lambda.LGroupedTable
Collect the values into an aggregate type.
collectValues(PTable<K, V>) - Static method in class org.apache.crunch.lib.Aggregate
 
collectValues() - Method in interface org.apache.crunch.PTable
Aggregate all of the values with the same key into a single key-value pair in the returned PTable.
column() - Method in class org.apache.crunch.lib.Sort.ColumnOrder
 
ColumnOrder(int, Sort.Order) - Constructor for class org.apache.crunch.lib.Sort.ColumnOrder
 
CombineFn<S,T> - Class in org.apache.crunch
A special DoFn implementation that converts an Iterable of values into a single value.
CombineFn() - Constructor for class org.apache.crunch.CombineFn
 
CombineMapsideFunction<K,V> - Class in org.apache.crunch.impl.spark.fn
 
CombineMapsideFunction(CombineFn<K, V>, SparkRuntimeContext) - Constructor for class org.apache.crunch.impl.spark.fn.CombineMapsideFunction
 
combineValues(CombineFn<K, V>, CombineFn<K, V>) - Method in class org.apache.crunch.impl.dist.collect.BaseGroupedTable
 
combineValues(CombineFn<K, V>) - Method in class org.apache.crunch.impl.dist.collect.BaseGroupedTable
 
combineValues(Aggregator<V>) - Method in class org.apache.crunch.impl.dist.collect.BaseGroupedTable
 
combineValues(Aggregator<V>, Aggregator<V>) - Method in class org.apache.crunch.impl.dist.collect.BaseGroupedTable
 
combineValues(Aggregator<V>) - Method in interface org.apache.crunch.lambda.LGroupedTable
Combine the value part of the table using the provided Crunch Aggregator.
combineValues(SSupplier<A>, SBiFunction<A, V, A>, SFunction<A, Iterable<V>>) - Method in interface org.apache.crunch.lambda.LGroupedTable
Combine the value part of the table using the given functions.
combineValues(CombineFn<K, V>) - Method in interface org.apache.crunch.PGroupedTable
Combines the values of this grouping using the given CombineFn.
combineValues(CombineFn<K, V>, CombineFn<K, V>) - Method in interface org.apache.crunch.PGroupedTable
Combines and reduces the values of this grouping using the given CombineFn instances.
combineValues(Aggregator<V>) - Method in interface org.apache.crunch.PGroupedTable
Combine the values in each group using the given Aggregator.
combineValues(Aggregator<V>, Aggregator<V>) - Method in interface org.apache.crunch.PGroupedTable
Combines and reduces the values in each group using the given Aggregator instances.
comm(PCollection<T>, PCollection<T>) - Static method in class org.apache.crunch.lib.Set
Find the elements that are common to two sets, like the Unix comm utility.
Comparator() - Constructor for class org.apache.crunch.types.writable.TupleWritable.Comparator
 
compare(ByteArray, ByteArray) - Method in class org.apache.crunch.impl.spark.SparkComparator
 
compare(Pair<K, V>, Pair<K, V>) - Method in class org.apache.crunch.lib.Aggregate.PairValueComparator
 
compare(AvroWrapper<T>, AvroWrapper<T>) - Method in class org.apache.crunch.lib.join.JoinUtils.AvroPairGroupingComparator
 
compare(byte[], int, int, byte[], int, int) - Method in class org.apache.crunch.lib.join.JoinUtils.AvroPairGroupingComparator
 
compare(TupleWritable, TupleWritable) - Method in class org.apache.crunch.lib.join.JoinUtils.TupleWritableComparator
 
compare(byte[], int, int, byte[], int, int) - Method in class org.apache.crunch.lib.join.JoinUtils.TupleWritableComparator
 
compare(AvroKey<T>, AvroKey<T>) - Method in class org.apache.crunch.lib.sort.ReverseAvroComparator
 
compare(byte[], int, int, byte[], int, int) - Method in class org.apache.crunch.lib.sort.ReverseAvroComparator
 
compare(byte[], int, int, byte[], int, int) - Method in class org.apache.crunch.lib.sort.ReverseWritableComparator
 
compare(T, T) - Method in class org.apache.crunch.lib.sort.ReverseWritableComparator
 
compare(WritableComparable, WritableComparable) - Method in class org.apache.crunch.lib.sort.TupleWritableComparator
 
compare(byte[], int, int, byte[], int, int) - Method in class org.apache.crunch.types.writable.TupleWritable.Comparator
 
compare(WritableComparable, WritableComparable) - Method in class org.apache.crunch.types.writable.TupleWritable.Comparator
 
compareTo(ByteArray) - Method in class org.apache.crunch.impl.spark.ByteArray
 
compareTo(Offsets.PartitionOffset) - Method in class org.apache.crunch.kafka.offset.hdfs.Offsets.PartitionOffset
 
compareTo(Pair<K, V>) - Method in class org.apache.crunch.Pair
 
compareTo(TupleWritable) - Method in class org.apache.crunch.types.writable.TupleWritable
 
compareTo(UnionWritable) - Method in class org.apache.crunch.types.writable.UnionWritable
 
CompositeMapFn<R,S,T> - Class in org.apache.crunch.fn
 
CompositeMapFn(MapFn<R, S>, MapFn<S, T>) - Constructor for class org.apache.crunch.fn.CompositeMapFn
 
CompositePathIterable<T> - Class in org.apache.crunch.io
 
Compress - Class in org.apache.crunch.io
Helper functions for compressing output data.
Compress() - Constructor for class org.apache.crunch.io.Compress
 
compress(T, Class<? extends CompressionCodec>) - Static method in class org.apache.crunch.io.Compress
Configure the given output target to be compressed using the given codec.
conf(String, String) - Method in class org.apache.crunch.GroupingOptions.Builder
 
conf(String, String) - Method in class org.apache.crunch.ParallelDoOptions.Builder
Specifies key-value pairs that should be added to the Configuration object associated with the Job that includes these options.
conf(String, String) - Method in interface org.apache.crunch.SourceTarget
Adds the given key-value pair to the Configuration instance(s) that are used to read and write this SourceTarget<T>.
configure(Configuration) - Method in class org.apache.crunch.DoFn
Configure this DoFn.
configure(Configuration) - Method in class org.apache.crunch.fn.CompositeMapFn
 
configure(Configuration) - Method in class org.apache.crunch.fn.ExtractKeyFn
 
configure(Configuration) - Method in class org.apache.crunch.fn.PairMapFn
 
configure(Job) - Method in class org.apache.crunch.GroupingOptions
 
configure(Configuration) - Method in class org.apache.crunch.io.FormatBundle
 
configure(Target, PType<?>) - Method in interface org.apache.crunch.io.OutputHandler
 
configure(Map<String, ?>, boolean) - Method in class org.apache.crunch.kafka.KafkaSource.BytesDeserializer
 
configure(Configuration) - Method in class org.apache.crunch.ParallelDoOptions
Applies the key-value pairs that were associated with this instance to the given Configuration object.
configure(Configuration) - Method in interface org.apache.crunch.ReadableData
Allows this instance to specify any additional configuration settings that may be needed by the job that it is launched in.
configure(FormatBundle) - Method in class org.apache.crunch.types.avro.AvroMode
Populates the bundle with mode specific settings for the specific FormatBundle.
configure(Configuration) - Method in class org.apache.crunch.types.avro.AvroMode
Populates the conf with mode specific settings.
configure(Configuration) - Method in class org.apache.crunch.types.avro.AvroUtf8InputFormat
 
configure(Configuration) - Method in class org.apache.crunch.types.PGroupedTableType.PairIterableMapFn
 
configure(Configuration) - Method in class org.apache.crunch.util.DelegatingReadableData
 
configure(Configuration) - Method in class org.apache.crunch.util.UnionReadableData
 
configureFactory(Configuration) - Method in class org.apache.crunch.types.avro.AvroMode
configureForMapReduce(Job, PType<?>, Path, String) - Method in interface org.apache.crunch.io.MapReduceTarget
 
configureOrdering(Configuration, WritableType[], Sort.ColumnOrder[]) - Static method in class org.apache.crunch.lib.sort.TupleWritableComparator
 
configureReflectDataFactory(Configuration) - Static method in class org.apache.crunch.types.avro.Avros
Deprecated. As of 0.9.0; use AvroMode.REFLECT.configure(Configuration).
configureShuffle(Configuration) - Method in class org.apache.crunch.types.avro.AvroMode
Populates the conf with mode specific settings for use during the shuffle phase.
configureShuffle(Job, GroupingOptions) - Method in class org.apache.crunch.types.PGroupedTableType
 
configureSource(Job, int) - Method in class org.apache.crunch.kafka.KafkaSource
 
configureSource(Job, int) - Method in interface org.apache.crunch.Source
Configure the given job to use this source as an input.
CONSUMER_POLL_TIMEOUT_DEFAULT - Static variable in class org.apache.crunch.kafka.KafkaSource
Default timeout value for KafkaSource.CONSUMER_POLL_TIMEOUT_KEY of 1 second.
CONSUMER_POLL_TIMEOUT_KEY - Static variable in class org.apache.crunch.kafka.KafkaSource
Constant to indicate how long the reader waits before timing out when retrieving data from Kafka.
containers(Class<T>) - Static method in class org.apache.crunch.types.avro.Avros
 
containers(Class<T>) - Method in class org.apache.crunch.types.avro.AvroTypeFamily
 
convert(Object, ObjectInspector, ObjectInspector) - Static method in class org.apache.crunch.types.orc.OrcUtils
Convert an object from / to OrcStruct
convert(PType<T>, PTypeFamily) - Static method in class org.apache.crunch.types.PTypeUtils
 
Converter<K,V,S,T> - Interface in org.apache.crunch.types
Converts the input key/value from a MapReduce task into the input to a DoFn, or takes the output of a DoFn and writes it to the output key/values.
convertInput(K, V) - Method in interface org.apache.crunch.types.Converter
 
convertIterableInput(K, Iterable<V>) - Method in interface org.apache.crunch.types.Converter
 
copyResourceFile(String) - Method in class org.apache.crunch.test.TemporaryPath
Copy a classpath resource to a File.
copyResourceFileName(String) - Method in class org.apache.crunch.test.TemporaryPath
Copy a classpath resource returning its absolute file name.
copyResourcePath(String) - Method in class org.apache.crunch.test.TemporaryPath
Copy a classpath resource to a Path.
count() - Method in class org.apache.crunch.impl.dist.collect.PCollectionImpl
 
count() - Method in interface org.apache.crunch.lambda.LCollection
Count distinct values in this LCollection, yielding an LTable mapping each value to the number of occurrences in the collection.
count(PCollection<S>) - Static method in class org.apache.crunch.lib.Aggregate
Returns a PTable that contains the unique elements of this collection mapped to a count of their occurrences.
count(PCollection<S>, int) - Static method in class org.apache.crunch.lib.Aggregate
Returns a PTable that contains the unique elements of this collection mapped to a count of their occurrences.
count - Variable in class org.apache.crunch.lib.Quantiles.Result
 
count() - Method in interface org.apache.crunch.PCollection
Returns a PTable instance that contains the counts of each unique element of this PCollection.
countClause - Variable in class org.apache.crunch.contrib.io.jdbc.DataBaseSource.Builder
 
CounterAccumulatorParam - Class in org.apache.crunch.impl.spark
 
CounterAccumulatorParam() - Constructor for class org.apache.crunch.impl.spark.CounterAccumulatorParam
 
create(String) - Method in class org.apache.crunch.contrib.text.TokenizerFactory
Return a Scanner instance that wraps the input string and uses the delimiter, skip, and locale settings for this TokenizerFactory instance.
create(Iterable<S>, PType<S>) - Method in class org.apache.crunch.impl.dist.DistributedPipeline
 
create(Iterable<S>, PType<S>, CreateOptions) - Method in class org.apache.crunch.impl.dist.DistributedPipeline
 
create(Iterable<Pair<K, V>>, PTableType<K, V>) - Method in class org.apache.crunch.impl.dist.DistributedPipeline
 
create(Iterable<Pair<K, V>>, PTableType<K, V>, CreateOptions) - Method in class org.apache.crunch.impl.dist.DistributedPipeline
 
create(Iterable<T>, PType<T>) - Method in class org.apache.crunch.impl.mem.MemPipeline
 
create(Iterable<T>, PType<T>, CreateOptions) - Method in class org.apache.crunch.impl.mem.MemPipeline
 
create(Iterable<Pair<K, V>>, PTableType<K, V>) - Method in class org.apache.crunch.impl.mem.MemPipeline
 
create(Iterable<Pair<K, V>>, PTableType<K, V>, CreateOptions) - Method in class org.apache.crunch.impl.mem.MemPipeline
 
create(PType<?>, Configuration) - Static method in class org.apache.crunch.impl.spark.serde.SerDeFactory
 
create(Iterable<S>, PType<S>, CreateOptions) - Method in class org.apache.crunch.impl.spark.SparkPipeline
 
create(Iterable<Pair<K, V>>, PTableType<K, V>, CreateOptions) - Method in class org.apache.crunch.impl.spark.SparkPipeline
 
create(FileSystem, Path, FileReaderFactory<S>) - Static method in class org.apache.crunch.io.CompositePathIterable
 
create() - Static method in class org.apache.crunch.lib.join.MapsideJoinStrategy
Create a new MapsideJoinStrategy instance that will load its left-side table into memory, and will materialize the contents of the left-side table to disk before running the in-memory join.
create(boolean) - Static method in class org.apache.crunch.lib.join.MapsideJoinStrategy
Create a new MapsideJoinStrategy instance that will load its left-side table into memory.
create(Iterable<T>, PType<T>) - Method in interface org.apache.crunch.Pipeline
Creates a PCollection containing the values found in the given Iterable using an implementation-specific distribution mechanism.
create(Iterable<T>, PType<T>, CreateOptions) - Method in interface org.apache.crunch.Pipeline
Creates a PCollection containing the values found in the given Iterable using an implementation-specific distribution mechanism.
create(Iterable<Pair<K, V>>, PTableType<K, V>) - Method in interface org.apache.crunch.Pipeline
Creates a PTable containing the values found in the given Iterable using an implementation-specific distribution mechanism.
create(Iterable<Pair<K, V>>, PTableType<K, V>, CreateOptions) - Method in interface org.apache.crunch.Pipeline
Creates a PTable containing the values found in the given Iterable using an implementation-specific distribution mechanism.
create() - Method in class org.apache.crunch.test.TemporaryPath
 
create() - Static method in class org.apache.crunch.types.NoOpDeepCopier
Static factory method.
create(Object...) - Method in class org.apache.crunch.types.orc.TupleObjectInspector
 
create(Class<T>, Class...) - Static method in class org.apache.crunch.types.TupleFactory
 
createBinarySerde(TypeInfo) - Static method in class org.apache.crunch.types.orc.OrcUtils
Create a binary serde for OrcStruct serialization/deserialization
CreatedCollection<T> - Class in org.apache.crunch.impl.spark.collect
Represents a Spark-based PCollection that was created from a Java Iterable of values.
CreatedCollection(SparkPipeline, Iterable<T>, PType<T>, CreateOptions) - Constructor for class org.apache.crunch.impl.spark.collect.CreatedCollection
 
createDoCollection(String, PCollectionImpl<S>, DoFn<S, T>, PType<T>, ParallelDoOptions) - Method in interface org.apache.crunch.impl.dist.collect.PCollectionFactory
 
createDoCollection(String, PCollectionImpl<S>, DoFn<S, T>, PType<T>, ParallelDoOptions) - Method in class org.apache.crunch.impl.spark.collect.SparkCollectFactory
 
createDoNode() - Method in interface org.apache.crunch.impl.dist.collect.MRCollection
 
createDoTable(String, PCollectionImpl<S>, DoFn<S, Pair<K, V>>, PTableType<K, V>, ParallelDoOptions) - Method in interface org.apache.crunch.impl.dist.collect.PCollectionFactory
 
createDoTable(String, PCollectionImpl<S>, CombineFn<K, V>, DoFn<S, Pair<K, V>>, PTableType<K, V>) - Method in interface org.apache.crunch.impl.dist.collect.PCollectionFactory
 
createDoTable(String, PCollectionImpl<S>, DoFn<S, Pair<K, V>>, PTableType<K, V>, ParallelDoOptions) - Method in class org.apache.crunch.impl.spark.collect.SparkCollectFactory
 
createDoTable(String, PCollectionImpl<S>, CombineFn<K, V>, DoFn<S, Pair<K, V>>, PTableType<K, V>) - Method in class org.apache.crunch.impl.spark.collect.SparkCollectFactory
 
CreatedTable<K,V> - Class in org.apache.crunch.impl.spark.collect
Represents a Spark-based PTable that was created from a Java Iterable of key-value pairs.
CreatedTable(SparkPipeline, Iterable<Pair<K, V>>, PTableType<K, V>, CreateOptions) - Constructor for class org.apache.crunch.impl.spark.collect.CreatedTable
 
createFilter(Path, BloomFilterFn<String>) - Static method in class org.apache.crunch.contrib.bloomfilter.BloomFilterFactory
Takes an input path and generates BloomFilters for all text files in that path.
createFilter(PCollection<T>, BloomFilterFn<T>) - Static method in class org.apache.crunch.contrib.bloomfilter.BloomFilterFactory
 
createGroupedTable(PTableBase<K, V>, GroupingOptions) - Method in interface org.apache.crunch.impl.dist.collect.PCollectionFactory
 
createGroupedTable(PTableBase<K, V>, GroupingOptions) - Method in class org.apache.crunch.impl.spark.collect.SparkCollectFactory
 
createInputCollection(Source<S>, String, DistributedPipeline, ParallelDoOptions) - Method in interface org.apache.crunch.impl.dist.collect.PCollectionFactory
 
createInputCollection(Source<S>, String, DistributedPipeline, ParallelDoOptions) - Method in class org.apache.crunch.impl.spark.collect.SparkCollectFactory
 
createInputTable(TableSource<K, V>, String, DistributedPipeline, ParallelDoOptions) - Method in interface org.apache.crunch.impl.dist.collect.PCollectionFactory
 
createInputTable(TableSource<K, V>, String, DistributedPipeline, ParallelDoOptions) - Method in class org.apache.crunch.impl.spark.collect.SparkCollectFactory
 
createIntermediateOutput(PType<T>) - Method in class org.apache.crunch.impl.dist.DistributedPipeline
 
CreateOptions - Class in org.apache.crunch
Additional options that can be specified when creating a new PCollection using Pipeline.create(java.lang.Iterable<T>, org.apache.crunch.types.PType<T>).
createOrcStruct(TypeInfo, Object...) - Static method in class org.apache.crunch.types.orc.OrcUtils
Create an object of OrcStruct given a type string and a list of objects
createOrderedTupleSchema(PType<S>, Sort.ColumnOrder[]) - Static method in class org.apache.crunch.lib.sort.SortFns
Constructs an Avro schema for the given PType<S> that respects the given column orderings.
createPut(PTable<String, String>) - Method in class org.apache.crunch.examples.WordAggregationHBase
Create puts in order to insert them in HBase.
createRecordReader(InputSplit, TaskAttemptContext) - Method in class org.apache.crunch.kafka.inputformat.KafkaInputFormat
 
createRecordReader(InputSplit, TaskAttemptContext) - Method in class org.apache.crunch.types.avro.AvroInputFormat
 
createRecordReader(InputSplit, TaskAttemptContext) - Method in class org.apache.crunch.types.avro.AvroUtf8InputFormat
 
createSourceTarget(Configuration, Path, Iterable<T>, int) - Method in class org.apache.crunch.types.avro.AvroType
 
createSourceTarget(Configuration, Path, Iterable<T>, int) - Method in interface org.apache.crunch.types.PType
Returns a ReadableSource that contains the data in the given Iterable.
createSourceTarget(Configuration, Path, Iterable<T>, int) - Method in class org.apache.crunch.types.writable.WritableType
 
createTempPath() - Method in class org.apache.crunch.impl.dist.DistributedPipeline
 
createUnionCollection(List<? extends PCollectionImpl<S>>) - Method in interface org.apache.crunch.impl.dist.collect.PCollectionFactory
 
createUnionCollection(List<? extends PCollectionImpl<S>>) - Method in class org.apache.crunch.impl.spark.collect.SparkCollectFactory
 
createUnionTable(List<PTableBase<K, V>>) - Method in interface org.apache.crunch.impl.dist.collect.PCollectionFactory
 
createUnionTable(List<PTableBase<K, V>>) - Method in class org.apache.crunch.impl.spark.collect.SparkCollectFactory
 
cross(PTable<K1, U>, PTable<K2, V>) - Static method in class org.apache.crunch.lib.Cartesian
Performs a full cross join on the specified PTables (using the same strategy as Pig's CROSS operator).
cross(PTable<K1, U>, PTable<K2, V>, int) - Static method in class org.apache.crunch.lib.Cartesian
Performs a full cross join on the specified PTables (using the same strategy as Pig's CROSS operator).
cross(PCollection<U>, PCollection<V>) - Static method in class org.apache.crunch.lib.Cartesian
Performs a full cross join on the specified PCollections (using the same strategy as Pig's CROSS operator).
cross(PCollection<U>, PCollection<V>, int) - Static method in class org.apache.crunch.lib.Cartesian
Performs a full cross join on the specified PCollections (using the same strategy as Pig's CROSS operator).
CRUNCH_DISABLE_OUTPUT_COUNTERS - Static variable in class org.apache.crunch.io.CrunchOutputs
 
CRUNCH_FILTER_NAME - Static variable in class org.apache.crunch.contrib.bloomfilter.BloomFilterFn
 
CRUNCH_FILTER_SIZE - Static variable in class org.apache.crunch.contrib.bloomfilter.BloomFilterFn
 
CRUNCH_INPUTS - Static variable in class org.apache.crunch.io.CrunchInputs
 
CRUNCH_OUTPUTS - Static variable in class org.apache.crunch.io.CrunchOutputs
 
CrunchInputs - Class in org.apache.crunch.io
Helper functions for configuring multiple InputFormat instances within a single Crunch MapReduce job.
CrunchInputs() - Constructor for class org.apache.crunch.io.CrunchInputs
 
CrunchIterable<S,T> - Class in org.apache.crunch.impl.spark.fn
 
CrunchIterable(DoFn<S, T>, Iterator<S>) - Constructor for class org.apache.crunch.impl.spark.fn.CrunchIterable
 
CrunchOutputs<K,V> - Class in org.apache.crunch.io
An analogue of CrunchInputs for handling multiple OutputFormat instances writing to multiple files within a single MapReduce job.
CrunchOutputs(TaskInputOutputContext<?, ?, K, V>) - Constructor for class org.apache.crunch.io.CrunchOutputs
Creates and initializes multiple outputs support; it should be instantiated in the Mapper/Reducer setup method.
CrunchOutputs(Configuration) - Constructor for class org.apache.crunch.io.CrunchOutputs
 
CrunchOutputs.OutputConfig<K,V> - Class in org.apache.crunch.io
 
CrunchPairTuple2<K,V> - Class in org.apache.crunch.impl.spark.fn
 
CrunchPairTuple2() - Constructor for class org.apache.crunch.impl.spark.fn.CrunchPairTuple2
 
CrunchRuntimeException - Exception in org.apache.crunch
A RuntimeException implementation that includes some additional options for the Crunch execution engine to track reporting status.
CrunchRuntimeException(String) - Constructor for exception org.apache.crunch.CrunchRuntimeException
 
CrunchRuntimeException(Exception) - Constructor for exception org.apache.crunch.CrunchRuntimeException
 
CrunchRuntimeException(String, Exception) - Constructor for exception org.apache.crunch.CrunchRuntimeException
 
CrunchTestSupport - Class in org.apache.crunch.test
A temporary workaround for Scala tests to use when working with Rule annotations until it gets fixed in JUnit 4.11.
CrunchTestSupport() - Constructor for class org.apache.crunch.test.CrunchTestSupport
 
CrunchTool - Class in org.apache.crunch.util
An extension of the Tool interface that creates a Pipeline instance and provides methods for working with the Pipeline from inside of the Tool's run method.
CrunchTool() - Constructor for class org.apache.crunch.util.CrunchTool
 
CrunchTool(boolean) - Constructor for class org.apache.crunch.util.CrunchTool
 

D

DataBaseSource<T extends org.apache.hadoop.mapreduce.lib.db.DBWritable & org.apache.hadoop.io.Writable> - Class in org.apache.crunch.contrib.io.jdbc
Source for reading from a database via a JDBC connection.
DataBaseSource.Builder<T extends org.apache.hadoop.mapreduce.lib.db.DBWritable & org.apache.hadoop.io.Writable> - Class in org.apache.crunch.contrib.io.jdbc
 
DebugLogging - Class in org.apache.crunch.test
Allows direct manipulation of the Hadoop log4j settings to aid in unit testing.
DeepCopier<T> - Interface in org.apache.crunch.types
Performs deep copies of values.
deepCopy(Object) - Method in class org.apache.crunch.types.avro.AvroDerivedValueDeepCopier
 
deepCopy(Collection<T>) - Method in class org.apache.crunch.types.CollectionDeepCopier
 
deepCopy(T) - Method in interface org.apache.crunch.types.DeepCopier
Create a deep copy of a value.
deepCopy(Map<String, T>) - Method in class org.apache.crunch.types.MapDeepCopier
 
deepCopy(T) - Method in class org.apache.crunch.types.NoOpDeepCopier
 
deepCopy(T) - Method in class org.apache.crunch.types.TupleDeepCopier
 
deepCopy(Union) - Method in class org.apache.crunch.types.UnionDeepCopier
 
deepCopy(T) - Method in class org.apache.crunch.types.writable.WritableDeepCopier
 
DEFAULT - Static variable in class org.apache.crunch.CachingOptions
An instance of CachingOptions with the default caching settings.
DEFAULT_BYTES_PER_REDUCE_TASK - Static variable in class org.apache.crunch.util.PartitionUtils
 
DEFAULT_MAX_REDUCERS - Static variable in class org.apache.crunch.util.PartitionUtils
 
DEFAULT_PATH - Static variable in class org.apache.crunch.lib.sort.TotalOrderPartitioner
 
DefaultJoinStrategy<K,U,V> - Class in org.apache.crunch.lib.join
Default join strategy that simply sends all data through the map, shuffle, and reduce phases.
DefaultJoinStrategy() - Constructor for class org.apache.crunch.lib.join.DefaultJoinStrategy
 
DefaultJoinStrategy(int) - Constructor for class org.apache.crunch.lib.join.DefaultJoinStrategy
 
DelegatingReadableData<S,T> - Class in org.apache.crunch.util
Implements the ReadableData<T> interface by delegating to a ReadableData<S> instance and passing its contents through a DoFn<S, T>.
DelegatingReadableData(ReadableData<S>, DoFn<S, T>) - Constructor for class org.apache.crunch.util.DelegatingReadableData
 
delete() - Method in class org.apache.crunch.test.TemporaryPath
 
delimiter(String) - Method in class org.apache.crunch.contrib.text.TokenizerFactory.Builder
Sets the delimiter used by the TokenizerFactory instances constructed by this instance.
department - Variable in class org.apache.crunch.test.Employee
Deprecated.
dependsOn(String, Target) - Method in class org.apache.crunch.PipelineCallable
Requires that the given Target exists before this instance may be executed.
dependsOn(String, PCollection<?>) - Method in class org.apache.crunch.PipelineCallable
Requires that the given PCollection be materialized to disk before this instance may be executed.
derived(PType<V1>, PType<V2>, PType<V3>) - Static method in class org.apache.crunch.Tuple3.Collect
 
derived(PType<V1>, PType<V2>, PType<V3>, PType<V4>) - Static method in class org.apache.crunch.Tuple4.Collect
 
derived(Class<T>, MapFn<S, T>, MapFn<T, S>, PType<S>) - Static method in class org.apache.crunch.types.avro.Avros
 
derived(Class<T>, MapFn<S, T>, MapFn<T, S>, PType<S>) - Method in class org.apache.crunch.types.avro.AvroTypeFamily
 
derived(Class<T>, MapFn<S, T>, MapFn<T, S>, PType<S>) - Method in interface org.apache.crunch.types.PTypeFamily
 
derived(Class<T>, MapFn<S, T>, MapFn<T, S>, PType<S>) - Static method in class org.apache.crunch.types.writable.Writables
 
derived(Class<T>, MapFn<S, T>, MapFn<T, S>, PType<S>) - Method in class org.apache.crunch.types.writable.WritableTypeFamily
 
derivedImmutable(Class<T>, MapFn<S, T>, MapFn<T, S>, PType<S>) - Static method in class org.apache.crunch.types.avro.Avros
 
derivedImmutable(Class<T>, MapFn<S, T>, MapFn<T, S>, PType<S>) - Method in class org.apache.crunch.types.avro.AvroTypeFamily
 
derivedImmutable(Class<T>, MapFn<S, T>, MapFn<T, S>, PType<S>) - Method in interface org.apache.crunch.types.PTypeFamily
A derived type whose values are immutable.
derivedImmutable(Class<T>, MapFn<S, T>, MapFn<T, S>, PType<S>) - Static method in class org.apache.crunch.types.writable.Writables
 
derivedImmutable(Class<T>, MapFn<S, T>, MapFn<T, S>, PType<S>) - Method in class org.apache.crunch.types.writable.WritableTypeFamily
 
deserialize(String, byte[]) - Method in class org.apache.crunch.kafka.KafkaSource.BytesDeserializer
 
deserialized(boolean) - Method in class org.apache.crunch.CachingOptions.Builder
 
deserialized() - Method in class org.apache.crunch.CachingOptions
Whether the data should remain deserialized in the cache, which trades off CPU processing time for additional storage overhead.
detach(DoFn<Pair<K, Iterable<V>>, T>, PType<V>) - Static method in class org.apache.crunch.lib.DoFns
"Reduce" DoFn wrapper which detaches the values in the iterable, preventing the unexpected behaviour related to object reuse often observed when using Avro.
difference(PCollection<T>, PCollection<T>) - Static method in class org.apache.crunch.lib.Set
Compute the set difference between two sets of elements.
disableDeepCopy() - Method in class org.apache.crunch.DoFn
By default, Crunch will do a defensive deep copy of the outputs of a DoFn when there are multiple downstream consumers of that item, in order to prevent the downstream functions from making concurrent modifications to data objects.
DIST_CACHE_REPLICATION - Static variable in class org.apache.crunch.util.DistCache
Configuration key for setting the replication factor for files distributed using the Crunch DistCache helper class.
DistCache - Class in org.apache.crunch.util
Provides functions for working with Hadoop's distributed cache.
DistCache() - Constructor for class org.apache.crunch.util.DistCache
 
Distinct - Class in org.apache.crunch.lib
Functions for computing the distinct elements of a PCollection.
distinct(PCollection<S>) - Static method in class org.apache.crunch.lib.Distinct
Construct a new PCollection that contains the unique elements of a given input PCollection.
distinct(PTable<K, V>) - Static method in class org.apache.crunch.lib.Distinct
A PTable<K, V> analogue of the distinct function.
distinct(PCollection<S>, int) - Static method in class org.apache.crunch.lib.Distinct
A distinct operation that gives the client more control over how frequently elements are flushed to disk in order to allow control over performance or memory consumption.
distinct(PTable<K, V>, int) - Static method in class org.apache.crunch.lib.Distinct
A PTable<K, V> analogue of the distinct function.
distributed(PTable<K, V>, double, double...) - Static method in class org.apache.crunch.lib.Quantiles
Calculate a set of quantiles for each key in a numerically-valued table.
DistributedPipeline - Class in org.apache.crunch.impl.dist
 
DistributedPipeline(String, Configuration, PCollectionFactory) - Constructor for class org.apache.crunch.impl.dist.DistributedPipeline
Instantiate with a custom name and configuration.
DoCollection<S> - Class in org.apache.crunch.impl.spark.collect
 
DoFn<S,T> - Class in org.apache.crunch
Base class for all data processing functions in Crunch.
DoFn() - Constructor for class org.apache.crunch.DoFn
 
DoFnIterator<S,T> - Class in org.apache.crunch.util
An Iterator<T> that combines a delegate Iterator<S> and a DoFn<S, T>, generating data by passing the contents of the iterator through the function.
DoFnIterator(Iterator<S>, DoFn<S, T>) - Constructor for class org.apache.crunch.util.DoFnIterator
 
DoFns - Class in org.apache.crunch.lib
 
DoFns() - Constructor for class org.apache.crunch.lib.DoFns
 
done() - Method in class org.apache.crunch.impl.dist.DistributedPipeline
 
done() - Method in class org.apache.crunch.impl.mem.MemPipeline
 
done() - Method in class org.apache.crunch.impl.spark.SparkPipeline
 
done() - Method in interface org.apache.crunch.Pipeline
Run any remaining jobs required to generate outputs and then clean up any intermediate data files that were created in this run or previous calls to run.
DONE - Static variable in class org.apache.crunch.PipelineResult
 
done() - Method in class org.apache.crunch.util.CrunchTool
 
DoTable<K,V> - Class in org.apache.crunch.impl.spark.collect
 
doubles() - Static method in class org.apache.crunch.types.avro.Avros
 
doubles() - Method in class org.apache.crunch.types.avro.AvroTypeFamily
 
doubles() - Method in interface org.apache.crunch.types.PTypeFamily
 
doubles() - Static method in class org.apache.crunch.types.writable.Writables
 
doubles() - Method in class org.apache.crunch.types.writable.WritableTypeFamily
 
drop(Integer...) - Method in class org.apache.crunch.contrib.text.TokenizerFactory.Builder
Drop the specified fields found by the input scanner, counting from zero.

E

element() - Method in interface org.apache.crunch.lambda.LDoFnContext
Get the input element
emit(T) - Method in interface org.apache.crunch.Emitter
Write the emitted value to the next stage of the pipeline.
emit(T) - Method in interface org.apache.crunch.lambda.LDoFnContext
Emit t to the output
Emitter<T> - Interface in org.apache.crunch
Interface for writing outputs from a DoFn.
Employee - Class in org.apache.crunch.test
 
Employee() - Constructor for class org.apache.crunch.test.Employee
Default constructor.
Employee(CharSequence, Integer, CharSequence) - Constructor for class org.apache.crunch.test.Employee
All-args constructor.
Employee.Builder - Class in org.apache.crunch.test
RecordBuilder for Employee instances.
EMPTY - Static variable in class org.apache.crunch.PipelineResult
 
EmptyPCollection<T> - Class in org.apache.crunch.impl.dist.collect
 
EmptyPCollection(DistributedPipeline, PType<T>) - Constructor for class org.apache.crunch.impl.dist.collect.EmptyPCollection
 
emptyPCollection(PType<S>) - Method in class org.apache.crunch.impl.dist.DistributedPipeline
 
emptyPCollection(PType<T>) - Method in class org.apache.crunch.impl.mem.MemPipeline
 
EmptyPCollection<T> - Class in org.apache.crunch.impl.spark.collect
 
EmptyPCollection(DistributedPipeline, PType<T>) - Constructor for class org.apache.crunch.impl.spark.collect.EmptyPCollection
 
emptyPCollection(PType<S>) - Method in class org.apache.crunch.impl.spark.SparkPipeline
 
emptyPCollection(PType<T>) - Method in interface org.apache.crunch.Pipeline
Creates an empty PCollection of the given PType.
EmptyPTable<K,V> - Class in org.apache.crunch.impl.dist.collect
 
EmptyPTable(DistributedPipeline, PTableType<K, V>) - Constructor for class org.apache.crunch.impl.dist.collect.EmptyPTable
 
emptyPTable(PTableType<K, V>) - Method in class org.apache.crunch.impl.dist.DistributedPipeline
 
emptyPTable(PTableType<K, V>) - Method in class org.apache.crunch.impl.mem.MemPipeline
 
EmptyPTable<K,V> - Class in org.apache.crunch.impl.spark.collect
 
EmptyPTable(DistributedPipeline, PTableType<K, V>) - Constructor for class org.apache.crunch.impl.spark.collect.EmptyPTable
 
emptyPTable(PTableType<K, V>) - Method in class org.apache.crunch.impl.spark.SparkPipeline
 
emptyPTable(PTableType<K, V>) - Method in interface org.apache.crunch.Pipeline
Creates an empty PTable of the given PTableType.
enable(Level) - Static method in class org.apache.crunch.test.DebugLogging
Enables logging Hadoop output to the console using the pattern '%-4r [%t] %-5p %c %x - %m%n' at the specified Level.
enable(Level, Appender) - Static method in class org.apache.crunch.test.DebugLogging
Enables logging to the given Appender at the specified Level.
enableDebug() - Method in class org.apache.crunch.impl.dist.DistributedPipeline
 
enableDebug() - Method in class org.apache.crunch.impl.mem.MemPipeline
 
enableDebug() - Method in interface org.apache.crunch.Pipeline
Turn on debug logging for jobs that are run from this pipeline.
enableDebug() - Method in class org.apache.crunch.util.CrunchTool
 
enums(Class<T>, PTypeFamily) - Static method in class org.apache.crunch.types.PTypes
Constructs a PType for a Java Enum type.
equals(Object) - Method in class org.apache.crunch.impl.dist.collect.BaseInputCollection
 
equals(Object) - Method in class org.apache.crunch.impl.dist.collect.BaseInputTable
 
equals(Object) - Method in class org.apache.crunch.impl.spark.ByteArray
 
equals(Object) - Method in class org.apache.crunch.impl.spark.IntByteArray
 
equals(Object) - Method in class org.apache.crunch.io.FormatBundle
 
equals(Object) - Method in class org.apache.crunch.kafka.offset.hdfs.Offsets
 
equals(Object) - Method in class org.apache.crunch.kafka.offset.hdfs.Offsets.PartitionOffset
 
equals(Object) - Method in class org.apache.crunch.lib.Quantiles.Result
 
equals(Object) - Method in class org.apache.crunch.Pair
 
equals(Object) - Method in class org.apache.crunch.Tuple3
 
equals(Object) - Method in class org.apache.crunch.Tuple4
 
equals(Object) - Method in class org.apache.crunch.TupleN
 
equals(Object) - Method in class org.apache.crunch.types.avro.AvroMode
 
equals(Object) - Method in class org.apache.crunch.types.avro.AvroType
 
equals(Object) - Method in class org.apache.crunch.types.writable.TupleWritable
equals(Object) - Method in class org.apache.crunch.types.writable.WritableType
 
equals(Object) - Method in class org.apache.crunch.Union
 
errorOnLastRecord() - Method in class org.apache.crunch.contrib.text.AbstractCompositeExtractor
 
errorOnLastRecord() - Method in class org.apache.crunch.contrib.text.AbstractSimpleExtractor
 
errorOnLastRecord() - Method in interface org.apache.crunch.contrib.text.Extractor
Returns true if the last call to extract on this instance threw an exception that was handled.
execute() - Method in class org.apache.crunch.impl.spark.SparkRuntime
 
extract(String) - Method in class org.apache.crunch.contrib.text.AbstractCompositeExtractor
 
extract(String) - Method in class org.apache.crunch.contrib.text.AbstractSimpleExtractor
 
extract(String) - Method in interface org.apache.crunch.contrib.text.Extractor
Extract a value with the type of this instance.
extractKey(String) - Static method in class org.apache.crunch.types.Protos
 
ExtractKeyFn<K,V> - Class in org.apache.crunch.fn
Wrapper function for converting a key-from-value extractor MapFn<V, K> into a key-value pair extractor that is used to convert from a PCollection<V> to a PTable<K, V>.
ExtractKeyFn(MapFn<V, K>) - Constructor for class org.apache.crunch.fn.ExtractKeyFn
 
Extractor<T> - Interface in org.apache.crunch.contrib.text
An interface for extracting a specific data type from a text string that is being processed by a Scanner object.
Extractors - Class in org.apache.crunch.contrib.text
Factory methods for constructing common Extractor types.
Extractors() - Constructor for class org.apache.crunch.contrib.text.Extractors
 
ExtractorStats - Class in org.apache.crunch.contrib.text
Records the number and kind of errors that an Extractor encountered when parsing input data.
ExtractorStats(int) - Constructor for class org.apache.crunch.contrib.text.ExtractorStats
 
ExtractorStats(int, List<Integer>) - Constructor for class org.apache.crunch.contrib.text.ExtractorStats
 
extractText(PTable<ImmutableBytesWritable, Result>) - Method in class org.apache.crunch.examples.WordAggregationHBase
Extract information from HBase.

F

factory() - Method in interface org.apache.crunch.lambda.LCollection
Get the LCollectionFactory which can be used to create new Ltype instances
FILE_FORMAT_EXTENSION - Static variable in class org.apache.crunch.kafka.offset.hdfs.HDFSOffsetWriter
File extension for storing the offsets.
FILE_FORMATTER - Static variable in class org.apache.crunch.kafka.offset.hdfs.HDFSOffsetWriter
Formatter to use when creating the file names in a URI compliant format.
fileNameToPersistenceTime(String) - Static method in class org.apache.crunch.kafka.offset.hdfs.HDFSOffsetWriter
Converts a fileName into the time the offsets were persisted.
FileNamingScheme - Interface in org.apache.crunch.io
Encapsulates rules for naming output files.
FileReaderFactory<T> - Interface in org.apache.crunch.io
 
filter(FilterFn<S>) - Method in class org.apache.crunch.impl.dist.collect.PCollectionImpl
 
filter(String, FilterFn<S>) - Method in class org.apache.crunch.impl.dist.collect.PCollectionImpl
 
filter(FilterFn<Pair<K, V>>) - Method in class org.apache.crunch.impl.dist.collect.PTableBase
 
filter(String, FilterFn<Pair<K, V>>) - Method in class org.apache.crunch.impl.dist.collect.PTableBase
 
filter(SPredicate<S>) - Method in interface org.apache.crunch.lambda.LCollection
Filter the collection using the supplied predicate.
filter(SPredicate<Pair<K, V>>) - Method in interface org.apache.crunch.lambda.LTable
Filter the rows of the table using the supplied predicate.
filter(FilterFn<S>) - Method in interface org.apache.crunch.PCollection
Apply the given filter function to this instance and return the resulting PCollection.
filter(String, FilterFn<S>) - Method in interface org.apache.crunch.PCollection
Apply the given filter function to this instance and return the resulting PCollection.
filter(FilterFn<Pair<K, V>>) - Method in interface org.apache.crunch.PTable
Apply the given filter function to this instance and return the resulting PTable.
filter(String, FilterFn<Pair<K, V>>) - Method in interface org.apache.crunch.PTable
Apply the given filter function to this instance and return the resulting PTable.
filterByKey(SPredicate<K>) - Method in interface org.apache.crunch.lambda.LTable
Filter the rows of the table using the supplied predicate applied to the key part of each record.
filterByValue(SPredicate<V>) - Method in interface org.apache.crunch.lambda.LTable
Filter the rows of the table using the supplied predicate applied to the value part of each record.
filterConnectionProperties(Properties) - Static method in class org.apache.crunch.kafka.inputformat.KafkaInputFormat
Filters out Kafka connection properties that were tagged using generateConnectionPropertyKey.
FilterFn<T> - Class in org.apache.crunch
A DoFn for the common case of filtering the members of a PCollection based on a boolean condition.
FilterFn() - Constructor for class org.apache.crunch.FilterFn
 
FilterFns - Class in org.apache.crunch.fn
A collection of pre-defined FilterFn implementations.
filterMap(SFunction<S, Optional<T>>, PType<T>) - Method in interface org.apache.crunch.lambda.LCollection
Combination of a filter and map operation by using a function with Optional return type.
filterMap(SFunction<S, Optional<Pair<K, V>>>, PTableType<K, V>) - Method in interface org.apache.crunch.lambda.LCollection
Combination of a filter and map operation by using a function with Optional return type.
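The filterMap idea described above (keep an element only when the function returns a present Optional, and emit the unwrapped value) can be sketched without any Crunch dependency. The following is a minimal illustration on plain java.util collections, not the LCollection implementation; the class FilterMapSketch and its helpers filterMap and parseInt are hypothetical names introduced only for this example.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;
import java.util.function.Function;
import java.util.stream.Collectors;

// Hypothetical stand-alone sketch; it does not use or reproduce the Crunch API.
public class FilterMapSketch {

    // Applies fn to every element, drops the empty results, and unwraps the rest.
    public static <S, T> List<T> filterMap(List<S> input, Function<S, Optional<T>> fn) {
        return input.stream()
                .map(fn)
                .filter(Optional::isPresent)
                .map(Optional::get)
                .collect(Collectors.toList());
    }

    // Example function: parse an integer, returning empty on malformed input.
    public static Optional<Integer> parseInt(String s) {
        try {
            return Optional.of(Integer.parseInt(s.trim()));
        } catch (NumberFormatException e) {
            return Optional.empty();
        }
    }

    public static void main(String[] args) {
        List<String> raw = Arrays.asList("1", "x", " 3 ");
        // "x" fails to parse, so it is filtered out; the others are mapped to integers.
        System.out.println(filterMap(raw, FilterMapSketch::parseInt)); // prints [1, 3]
    }
}
```

In the real API the function would be an SFunction and the result type would be described by a PType or PTableType, as the signatures above indicate.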
findContainingJar(Class<?>) - Static method in class org.apache.crunch.util.DistCache
Finds the path to a jar that contains the class provided, if any.
findCounter(Enum<?>) - Method in class org.apache.crunch.PipelineResult.StageResult
Deprecated.
The Counter class changed incompatibly between Hadoop 1 and 2 (from a class to an interface) so user programs should avoid this method and use PipelineResult.StageResult.getCounterValue(Enum) and/or PipelineResult.StageResult.getCounterDisplayName(Enum).
findPartition(K) - Method in class org.apache.crunch.lib.sort.TotalOrderPartitioner.BinarySearchNode
 
findPartition(T) - Method in interface org.apache.crunch.lib.sort.TotalOrderPartitioner.Node
Locate partition in keyset K, such that [Ki..Ki+1) defines a partition, with implicit K0 = -inf, Kn = +inf, and |K| = #partitions - 1.
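The partition rule stated above can be illustrated with a binary search over sorted split points. This is a minimal sketch under stated assumptions, not the TotalOrderPartitioner code: keys are plain ints, the split points are sorted and distinct, and PartitionLookupSketch and its findPartition helper are hypothetical names used only here.

```java
import java.util.Arrays;

// Hypothetical stand-alone sketch; it does not use or reproduce the Crunch classes.
public class PartitionLookupSketch {

    // Returns the index of the half-open interval [K_i, K_{i+1}) containing key,
    // where the interval before the first split point starts at -inf and the
    // interval after the last split point ends at +inf.
    public static int findPartition(int[] sortedSplitPoints, int key) {
        int pos = Arrays.binarySearch(sortedSplitPoints, key);
        // A non-negative result means key equals a split point, which opens the
        // next partition. A negative result encodes -(insertionPoint) - 1.
        return pos >= 0 ? pos + 1 : -pos - 1;
    }

    public static void main(String[] args) {
        int[] splitPoints = {10, 20, 30}; // 3 split points define 4 partitions
        System.out.println(findPartition(splitPoints, 5));  // prints 0
        System.out.println(findPartition(splitPoints, 10)); // prints 1
        System.out.println(findPartition(splitPoints, 25)); // prints 2
        System.out.println(findPartition(splitPoints, 30)); // prints 3
    }
}
```

With n split points the lookup yields n + 1 partitions, which matches the relation |K| = #partitions - 1 given in the description.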
first() - Method in class org.apache.crunch.impl.dist.collect.PCollectionImpl
 
first() - Method in class org.apache.crunch.Pair
 
first() - Method in interface org.apache.crunch.PCollection
 
first() - Method in class org.apache.crunch.Tuple3
 
first() - Method in class org.apache.crunch.Tuple4
 
FIRST_N(int) - Static method in class org.apache.crunch.fn.Aggregators
Return the first n values (or fewer if there are fewer values than n).
flatMap(SFunction<S, Stream<T>>, PType<T>) - Method in interface org.apache.crunch.lambda.LCollection
Map each element to zero or more output elements using the provided stream-returning function.
flatMap(SFunction<S, Stream<Pair<K, V>>>, PTableType<K, V>) - Method in interface org.apache.crunch.lambda.LCollection
Map each element to zero or more output elements using the provided stream-returning function to yield an LTable
FlatMapIndexFn<S,T> - Class in org.apache.crunch.impl.spark.fn
 
FlatMapIndexFn(DoFn<S, T>, boolean, SparkRuntimeContext) - Constructor for class org.apache.crunch.impl.spark.fn.FlatMapIndexFn
 
FlatMapPairDoFn<K,V,T> - Class in org.apache.crunch.impl.spark.fn
 
FlatMapPairDoFn(DoFn<Pair<K, V>, T>, SparkRuntimeContext) - Constructor for class org.apache.crunch.impl.spark.fn.FlatMapPairDoFn
 
floats() - Static method in class org.apache.crunch.types.avro.Avros
 
floats() - Method in class org.apache.crunch.types.avro.AvroTypeFamily
 
floats() - Method in interface org.apache.crunch.types.PTypeFamily
 
floats() - Static method in class org.apache.crunch.types.writable.Writables
 
floats() - Method in class org.apache.crunch.types.writable.WritableTypeFamily
 
flush() - Method in interface org.apache.crunch.Emitter
Flushes any values cached by this emitter.
forAvroSchema(Schema) - Static method in class org.apache.crunch.impl.spark.ByteArrayHelper
 
forInput(Class<T>) - Static method in class org.apache.crunch.io.FormatBundle
 
FormatBundle<K> - Class in org.apache.crunch.io
A combination of an InputFormat or OutputFormat and any extra configuration information that format class needs to run.
FormatBundle() - Constructor for class org.apache.crunch.io.FormatBundle
 
formattedFile(String, Class<? extends FileInputFormat<K, V>>, Class<K>, Class<V>) - Static method in class org.apache.crunch.io.From
Creates a TableSource<K, V> for reading data from files that have custom FileInputFormat<K, V> implementations not covered by the provided TableSource and Source factory methods.
formattedFile(Path, Class<? extends FileInputFormat<K, V>>, Class<K>, Class<V>) - Static method in class org.apache.crunch.io.From
Creates a TableSource<K, V> for reading data from files that have custom FileInputFormat<K, V> implementations not covered by the provided TableSource and Source factory methods.
formattedFile(List<Path>, Class<? extends FileInputFormat<K, V>>, Class<K>, Class<V>) - Static method in class org.apache.crunch.io.From
Creates a TableSource<K, V> for reading data from files that have custom FileInputFormat<K, V> implementations not covered by the provided TableSource and Source factory methods.
formattedFile(String, Class<? extends FileInputFormat<?, ?>>, PType<K>, PType<V>) - Static method in class org.apache.crunch.io.From
Creates a TableSource<K, V> for reading data from files that have custom FileInputFormat implementations not covered by the provided TableSource and Source factory methods.
formattedFile(Path, Class<? extends FileInputFormat<?, ?>>, PType<K>, PType<V>) - Static method in class org.apache.crunch.io.From
Creates a TableSource<K, V> for reading data from files that have custom FileInputFormat implementations not covered by the provided TableSource and Source factory methods.
formattedFile(List<Path>, Class<? extends FileInputFormat<?, ?>>, PType<K>, PType<V>) - Static method in class org.apache.crunch.io.From
Creates a TableSource<K, V> for reading data from files that have custom FileInputFormat implementations not covered by the provided TableSource and Source factory methods.
formattedFile(String, Class<? extends FileOutputFormat<K, V>>) - Static method in class org.apache.crunch.io.To
Creates a Target at the given path name that writes data to a custom FileOutputFormat.
formattedFile(Path, Class<? extends FileOutputFormat<K, V>>) - Static method in class org.apache.crunch.io.To
Creates a Target at the given Path that writes data to a custom FileOutputFormat.
forOutput(Class<T>) - Static method in class org.apache.crunch.io.FormatBundle
 
fourth() - Method in class org.apache.crunch.Tuple4
 
From - Class in org.apache.crunch.io
Static factory methods for creating common Source types.
From() - Constructor for class org.apache.crunch.io.From
 
fromBytes(byte[]) - Method in class org.apache.crunch.impl.spark.serde.AvroSerDe
 
fromBytes(byte[]) - Method in interface org.apache.crunch.impl.spark.serde.SerDe
 
fromBytes(byte[]) - Method in class org.apache.crunch.impl.spark.serde.WritableSerDe
 
fromBytesFunction() - Method in class org.apache.crunch.impl.spark.serde.AvroSerDe
 
fromBytesFunction() - Method in interface org.apache.crunch.impl.spark.serde.SerDe
 
fromBytesFunction() - Method in class org.apache.crunch.impl.spark.serde.WritableSerDe
 
fromConfiguration(Configuration) - Static method in class org.apache.crunch.types.avro.AvroMode
Creates an AvroMode based on the AvroMode.AVRO_MODE_PROPERTY property in the conf.
fromSerialized(String, Configuration) - Static method in class org.apache.crunch.io.FormatBundle
 
fromShuffleConfiguration(Configuration) - Static method in class org.apache.crunch.types.avro.AvroMode
Creates an AvroMode based on the AvroMode.AVRO_SHUFFLE_MODE_PROPERTY property in the conf.
fromType(AvroType<?>) - Static method in class org.apache.crunch.types.avro.AvroMode
Creates an AvroMode based upon the specified type.
fullJoin(PTable<K, U>, PTable<K, V>) - Static method in class org.apache.crunch.lib.Join
Performs a full outer join on the specified PTables.
FullOuterJoinFn<K,U,V> - Class in org.apache.crunch.lib.join
Used to perform the last step of a full outer join.
FullOuterJoinFn(PType<K>, PType<U>) - Constructor for class org.apache.crunch.lib.join.FullOuterJoinFn
 

G

generateKeys(S) - Method in class org.apache.crunch.contrib.bloomfilter.BloomFilterFn
 
generateOutput(Pipeline) - Method in class org.apache.crunch.PipelineCallable
Called by the Pipeline when this instance is registered with Pipeline#sequentialDo.
GENERIC - Static variable in class org.apache.crunch.types.avro.AvroMode
Default mode to use for reading and writing Generic types.
generics(Schema) - Static method in class org.apache.crunch.types.avro.Avros
 
generics(Schema) - Method in class org.apache.crunch.types.avro.AvroTypeFamily
 
get() - Method in class org.apache.crunch.impl.spark.SparkRuntime
 
get(long, TimeUnit) - Method in class org.apache.crunch.impl.spark.SparkRuntime
 
get(int) - Method in class org.apache.crunch.Pair
 
get(int) - Method in class org.apache.crunch.test.Employee
 
get(int) - Method in class org.apache.crunch.test.Person
 
get(int) - Method in interface org.apache.crunch.Tuple
Returns the Object at the given index.
get(int) - Method in class org.apache.crunch.Tuple3
 
get(int) - Method in class org.apache.crunch.Tuple4
 
get(int) - Method in class org.apache.crunch.TupleN
 
get(int) - Method in class org.apache.crunch.types.writable.TupleWritable
Get ith Writable from Tuple.
getAge() - Method in class org.apache.crunch.test.Person.Builder
Gets the value of the 'age' field
getAge() - Method in class org.apache.crunch.test.Person
Gets the value of the 'age' field.
getAllPCollections() - Method in class org.apache.crunch.PipelineCallable
Returns the mapping of labels to PCollection dependencies for this instance.
getAllStructFieldRefs() - Method in class org.apache.crunch.types.orc.TupleObjectInspector
 
getAllTargets() - Method in class org.apache.crunch.PipelineCallable
Returns the mapping of labels to Target dependencies for this instance.
getAsOfTime() - Method in class org.apache.crunch.kafka.offset.hdfs.Offsets
Returns the time in milliseconds since epoch that the offset information was retrieved or valid as of.
getBrokerOffsets(Properties, long, String...) - Static method in class org.apache.crunch.kafka.KafkaUtils
Retrieves the offset values for an array of topics at the specified time.
getByFn() - Method in class org.apache.crunch.lib.sort.SortFns.KeyExtraction
 
getCategory() - Method in class org.apache.crunch.types.orc.TupleObjectInspector
 
getClassSchema() - Static method in class org.apache.crunch.test.Employee
 
getClassSchema() - Static method in class org.apache.crunch.test.Person
 
getCombineFn() - Method in class org.apache.crunch.impl.spark.SparkRuntime
 
getCompletionHooks() - Method in class org.apache.crunch.impl.mr.MRPipeline
 
getConf() - Method in class org.apache.crunch.io.FormatBundle
 
getConf() - Method in class org.apache.crunch.kafka.inputformat.KafkaInputFormat
 
getConf() - Method in class org.apache.crunch.lib.sort.TotalOrderPartitioner
 
getConf() - Method in class org.apache.crunch.lib.sort.TupleWritableComparator
 
getConf() - Method in class org.apache.crunch.types.writable.TupleWritable.Comparator
 
getConf() - Method in class org.apache.crunch.util.CrunchTool
 
getConfiguration() - Method in class org.apache.crunch.impl.dist.DistributedPipeline
 
getConfiguration() - Method in class org.apache.crunch.impl.mem.MemPipeline
 
getConfiguration() - Method in class org.apache.crunch.impl.spark.SparkRuntime
 
getConfiguration() - Method in class org.apache.crunch.impl.spark.SparkRuntimeContext
 
getConfiguration() - Method in interface org.apache.crunch.lambda.LDoFnContext
Get the current Hadoop Configuration
getConfiguration() - Method in interface org.apache.crunch.Pipeline
Returns the Configuration instance associated with this pipeline.
getContext() - Method in interface org.apache.crunch.lambda.LDoFnContext
Get the underlying TaskInputOutputContext (for special cases)
getConverter() - Method in class org.apache.crunch.kafka.KafkaSource
 
getConverter() - Method in interface org.apache.crunch.Source
Returns the Converter used for mapping the inputs from this instance into PCollection or PTable values.
getConverter(PType<?>) - Method in interface org.apache.crunch.Target
Returns the Converter to use for mapping from the output PCollection into the output values expected by this instance.
getConverter() - Method in class org.apache.crunch.types.avro.AvroType
 
getConverter() - Method in class org.apache.crunch.types.PGroupedTableType
 
getConverter() - Method in interface org.apache.crunch.types.PType
 
getConverter() - Method in class org.apache.crunch.types.writable.WritableType
 
getCounter(Enum<?>) - Static method in class org.apache.crunch.test.TestCounters
 
getCounter(String, String) - Static method in class org.apache.crunch.test.TestCounters
 
getCounter() - Method in class org.apache.hadoop.mapred.SparkCounter
 
getCounterDisplayName(String, String) - Method in class org.apache.crunch.PipelineResult.StageResult
 
getCounterDisplayName(Enum<?>) - Method in class org.apache.crunch.PipelineResult.StageResult
 
getCounterNames() - Method in class org.apache.crunch.PipelineResult.StageResult
 
getCounters() - Static method in class org.apache.crunch.impl.mem.MemPipeline
 
getCounters() - Method in class org.apache.crunch.PipelineResult.StageResult
Deprecated.
The Counter class changed incompatibly between Hadoop 1 and 2 (from a class to an interface) so user programs should avoid this method and use PipelineResult.StageResult.getCounterNames().
getCounterValue(String, String) - Method in class org.apache.crunch.PipelineResult.StageResult
 
getCounterValue(Enum<?>) - Method in class org.apache.crunch.PipelineResult.StageResult
 
getCurrentKey() - Method in class org.apache.crunch.kafka.inputformat.KafkaRecordReader
 
getCurrentValue() - Method in class org.apache.crunch.kafka.inputformat.KafkaRecordReader
 
getData() - Method in class org.apache.crunch.types.avro.AvroMode
Returns a GenericData instance based on the mode type.
getData() - Method in interface org.apache.crunch.types.avro.ReaderWriterFactory
 
getData() - Method in class org.apache.crunch.types.avro.ReflectDataFactory
 
getDataFileWriter(Path, Configuration) - Static method in class org.apache.crunch.types.avro.AvroOutputFormat
 
getDefaultConfiguration() - Method in class org.apache.crunch.test.TemporaryPath
 
getDefaultFileSource(Path) - Method in class org.apache.crunch.types.avro.AvroType
 
getDefaultFileSource(Path) - Method in class org.apache.crunch.types.PGroupedTableType
 
getDefaultFileSource(Path) - Method in interface org.apache.crunch.types.PType
Returns a SourceTarget that is able to read/write data using the serialization format specified by this PType.
getDefaultFileSource(Path) - Method in class org.apache.crunch.types.writable.WritableType
 
getDefaultInstance() - Static method in class org.apache.crunch.contrib.text.TokenizerFactory
Returns a default TokenizerFactory that uses whitespace as a delimiter and does not skip any input fields.
getDefaultInstance(Class<M>) - Static method in class org.apache.crunch.types.Protos
Utility function for creating a default PB Message from a Class object that works with both protoc 2.3.0 and 2.4.x.
getDefaultValue() - Method in class org.apache.crunch.contrib.text.AbstractSimpleExtractor
 
getDefaultValue() - Method in interface org.apache.crunch.contrib.text.Extractor
Returns the default value for this Extractor in case of an error.
getDepartment() - Method in class org.apache.crunch.test.Employee.Builder
Gets the value of the 'department' field
getDepartment() - Method in class org.apache.crunch.test.Employee
Gets the value of the 'department' field.
getDependentJobs() - Method in interface org.apache.crunch.impl.mr.MRJob
 
getDepth() - Method in class org.apache.crunch.impl.dist.collect.PCollectionImpl
 
getDetachedValue(PTableType<K, V>, Pair<K, V>) - Static method in class org.apache.crunch.lib.PTables
Create a detached value for a table Pair.
getDetachedValue(T) - Method in class org.apache.crunch.types.avro.AvroType
 
getDetachedValue(T) - Method in interface org.apache.crunch.types.PType
Returns a copy of a value (or the value itself) that can safely be retained.
getDetachedValue(T) - Method in class org.apache.crunch.types.writable.WritableType
 
getDisplayName() - Method in class org.apache.hadoop.mapred.SparkCounter
 
getEndingOffset() - Method in class org.apache.crunch.kafka.inputformat.KafkaInputSplit
Returns the ending offset for the split
getEndTimeMsec() - Method in class org.apache.crunch.PipelineResult.StageResult
 
getErrorCount() - Method in class org.apache.crunch.contrib.text.ExtractorStats
The overall number of records that had some kind of parsing error.
getFactory() - Method in class org.apache.crunch.impl.dist.DistributedPipeline
 
getFactory() - Method in class org.apache.crunch.types.avro.AvroMode
Returns the factory that will be used for the mode.
getFamily() - Method in class org.apache.crunch.types.avro.AvroType
 
getFamily() - Method in class org.apache.crunch.types.PGroupedTableType
 
getFamily() - Method in interface org.apache.crunch.types.PType
Returns the PTypeFamily that this PType belongs to.
getFamily() - Method in class org.apache.crunch.types.writable.WritableType
 
getFieldErrors() - Method in class org.apache.crunch.contrib.text.ExtractorStats
Returns the number of errors that occurred when parsing the individual fields of a composite record type, like a Pair or TupleN.
getFile(String) - Method in class org.apache.crunch.test.TemporaryPath
Get a File below the temporary directory.
getFileName(String) - Method in class org.apache.crunch.test.TemporaryPath
Get an absolute file name below the temporary directory.
getFileNamingScheme() - Method in interface org.apache.crunch.io.PathTarget
Get the naming scheme to be used for outputs being written to an output path.
getFirst() - Method in class org.apache.crunch.fn.CompositeMapFn
 
getFormatClass() - Method in class org.apache.crunch.io.FormatBundle
 
getFormatNodeMap(JobContext) - Static method in class org.apache.crunch.io.CrunchInputs
 
getGroupedDetachedValue(PGroupedTableType<K, V>, Pair<K, Iterable<V>>) - Static method in class org.apache.crunch.lib.PTables
Create a detached value for a PGroupedTable value.
getGroupedTableType() - Method in class org.apache.crunch.impl.dist.collect.BaseGroupedTable
 
getGroupedTableType() - Method in interface org.apache.crunch.PGroupedTable
Return the PGroupedTableType containing serialization information for this PGroupedTable.
getGroupedTableType() - Method in interface org.apache.crunch.types.PTableType
Returns the grouped table version of this type.
getGroupingComparator(PTypeFamily) - Static method in class org.apache.crunch.lib.join.JoinUtils
 
getGroupingComparatorClass() - Method in class org.apache.crunch.GroupingOptions
 
getGroupingConverter() - Method in class org.apache.crunch.types.PGroupedTableType
 
getIndex() - Method in class org.apache.crunch.types.writable.UnionWritable
 
getIndex() - Method in class org.apache.crunch.Union
Returns the index of the original data source for this union type.
getInputMapFn() - Method in class org.apache.crunch.types.avro.AvroType
 
getInputMapFn() - Method in interface org.apache.crunch.types.PType
 
getInputMapFn() - Method in class org.apache.crunch.types.writable.WritableType
 
getInstance() - Static method in class org.apache.crunch.fn.IdentityFn
 
getInstance() - Static method in class org.apache.crunch.impl.mem.MemPipeline
 
getInstance() - Static method in class org.apache.crunch.io.SequentialFileNamingScheme
 
getInstance() - Static method in class org.apache.crunch.types.avro.AvroTypeFamily
 
getInstance() - Static method in class org.apache.crunch.types.writable.TupleWritable.Comparator
 
getInstance() - Static method in class org.apache.crunch.types.writable.WritableTypeFamily
 
getJavaRDDLike(SparkRuntime) - Method in class org.apache.crunch.impl.spark.collect.CreatedCollection
 
getJavaRDDLike(SparkRuntime) - Method in class org.apache.crunch.impl.spark.collect.CreatedTable
 
getJavaRDDLike(SparkRuntime) - Method in class org.apache.crunch.impl.spark.collect.DoCollection
 
getJavaRDDLike(SparkRuntime) - Method in class org.apache.crunch.impl.spark.collect.DoTable
 
getJavaRDDLike(SparkRuntime) - Method in class org.apache.crunch.impl.spark.collect.EmptyPCollection
 
getJavaRDDLike(SparkRuntime) - Method in class org.apache.crunch.impl.spark.collect.EmptyPTable
 
getJavaRDDLike(SparkRuntime) - Method in class org.apache.crunch.impl.spark.collect.InputCollection
 
getJavaRDDLike(SparkRuntime) - Method in class org.apache.crunch.impl.spark.collect.InputTable
 
getJavaRDDLike(SparkRuntime) - Method in class org.apache.crunch.impl.spark.collect.PGroupedTableImpl
 
getJavaRDDLike(SparkRuntime) - Method in class org.apache.crunch.impl.spark.collect.UnionCollection
 
getJavaRDDLike(SparkRuntime) - Method in class org.apache.crunch.impl.spark.collect.UnionTable
 
getJavaRDDLike(SparkRuntime) - Method in interface org.apache.crunch.impl.spark.SparkCollection
 
getJob() - Method in interface org.apache.crunch.impl.mr.MRJob
 
getJobEndTimeMsec() - Method in class org.apache.crunch.PipelineResult.StageResult
 
getJobID() - Method in interface org.apache.crunch.impl.mr.MRJob
 
getJobs() - Method in interface org.apache.crunch.impl.mr.MRPipelineExecution
 
getJobStartTimeMsec() - Method in class org.apache.crunch.PipelineResult.StageResult
 
getJobState() - Method in interface org.apache.crunch.impl.mr.MRJob
 
getJoinType() - Method in class org.apache.crunch.lib.join.FullOuterJoinFn
getJoinType() - Method in class org.apache.crunch.lib.join.InnerJoinFn
 
getJoinType() - Method in class org.apache.crunch.lib.join.JoinFn
 
getJoinType() - Method in class org.apache.crunch.lib.join.LeftOuterJoinFn
getJoinType() - Method in class org.apache.crunch.lib.join.RightOuterJoinFn
getKafkaConnectionProperties(Configuration) - Static method in class org.apache.crunch.kafka.KafkaUtils
Converts the provided config into a Properties object to connect with Kafka.
getKeyClass() - Method in interface org.apache.crunch.types.Converter
 
getKeyType() - Method in class org.apache.crunch.impl.dist.collect.PTableBase
 
getKeyType() - Method in class org.apache.crunch.lib.sort.SortFns.KeyExtraction
 
getKeyType() - Method in interface org.apache.crunch.PTable
Returns the PType of the key.
getKeyType() - Method in interface org.apache.crunch.types.PTableType
Returns the key type for the table.
getLastModifiedAt(Configuration) - Method in class org.apache.crunch.contrib.io.jdbc.DataBaseSource
 
getLastModifiedAt() - Method in class org.apache.crunch.impl.dist.collect.BaseDoCollection
 
getLastModifiedAt() - Method in class org.apache.crunch.impl.dist.collect.BaseDoTable
 
getLastModifiedAt() - Method in class org.apache.crunch.impl.dist.collect.BaseGroupedTable
 
getLastModifiedAt() - Method in class org.apache.crunch.impl.dist.collect.BaseInputCollection
 
getLastModifiedAt() - Method in class org.apache.crunch.impl.dist.collect.BaseInputTable
 
getLastModifiedAt() - Method in class org.apache.crunch.impl.dist.collect.BaseUnionCollection
 
getLastModifiedAt() - Method in class org.apache.crunch.impl.dist.collect.BaseUnionTable
 
getLastModifiedAt() - Method in class org.apache.crunch.impl.dist.collect.EmptyPCollection
 
getLastModifiedAt() - Method in class org.apache.crunch.impl.dist.collect.EmptyPTable
 
getLastModifiedAt() - Method in class org.apache.crunch.impl.dist.collect.PCollectionImpl
The time of the most recent modification to one of the input sources to the collection.
getLastModifiedAt() - Method in class org.apache.crunch.impl.spark.collect.CreatedCollection
 
getLastModifiedAt() - Method in class org.apache.crunch.impl.spark.collect.CreatedTable
 
getLastModifiedAt(FileSystem, Path) - Static method in class org.apache.crunch.io.SourceTargetHelper
 
getLastModifiedAt(Configuration) - Method in class org.apache.crunch.kafka.KafkaSource
 
getLastModifiedAt(Configuration) - Method in interface org.apache.crunch.Source
Returns the time (in milliseconds) that this Source was most recently modified (e.g., because an input file was edited or new files were added to a directory.)
getLength() - Method in class org.apache.crunch.kafka.inputformat.KafkaInputSplit
 
getLocations() - Method in class org.apache.crunch.kafka.inputformat.KafkaInputSplit
 
getMapOutputName(Configuration, Path) - Method in interface org.apache.crunch.io.FileNamingScheme
Get the output file name for a map task.
getMapOutputName(Configuration, Path) - Method in class org.apache.crunch.io.SequentialFileNamingScheme
 
getMaterializedAt() - Method in class org.apache.crunch.impl.dist.collect.PCollectionImpl
 
getMaterializeSourceTarget(PCollection<T>) - Method in class org.apache.crunch.impl.dist.DistributedPipeline
Retrieve a ReadableSourceTarget that provides access to the contents of a PCollection.
getMessage() - Method in class org.apache.crunch.PipelineCallable
Returns a message associated with this callable's execution, especially in case of errors.
getModeProperties() - Method in class org.apache.crunch.types.avro.AvroMode
Returns the entries that a Configuration instance needs to enable this AvroMode as a serializable map of key-value pairs.
getName() - Method in class org.apache.crunch.CreateOptions
 
getName() - Method in class org.apache.crunch.impl.dist.collect.PCollectionImpl
 
getName() - Method in class org.apache.crunch.impl.dist.DistributedPipeline
 
getName() - Method in class org.apache.crunch.impl.mem.MemPipeline
 
getName() - Method in class org.apache.crunch.io.FormatBundle
 
getName() - Method in interface org.apache.crunch.PCollection
Returns a shorthand name for this PCollection.
getName() - Method in interface org.apache.crunch.Pipeline
Returns the name of this pipeline.
getName() - Method in class org.apache.crunch.PipelineCallable
Returns the name of this instance.
getName() - Method in class org.apache.crunch.test.Employee.Builder
Gets the value of the 'name' field
getName() - Method in class org.apache.crunch.test.Employee
Gets the value of the 'name' field.
getName() - Method in class org.apache.crunch.test.Person.Builder
Gets the value of the 'name' field
getName() - Method in class org.apache.crunch.test.Person
Gets the value of the 'name' field.
getName() - Method in class org.apache.hadoop.mapred.SparkCounter
 
getNamedDotFiles() - Method in class org.apache.crunch.impl.spark.SparkRuntime
 
getNamedDotFiles() - Method in interface org.apache.crunch.PipelineExecution
Returns all .dot files that allow a client to graph the Crunch execution plan internals.
getNamedOutputs(Configuration) - Static method in class org.apache.crunch.io.CrunchOutputs
 
getNextAnonymousStageId() - Method in class org.apache.crunch.impl.dist.DistributedPipeline
 
getNumReducers() - Method in class org.apache.crunch.GroupingOptions
 
getNumShards(K) - Method in interface org.apache.crunch.lib.join.ShardedJoinStrategy.ShardingStrategy
Retrieve the number of shards over which the given key should be split.
getOffset() - Method in class org.apache.crunch.kafka.offset.hdfs.Offsets.PartitionOffset
Returns the offset
getOffsets(Configuration) - Static method in class org.apache.crunch.kafka.inputformat.KafkaInputFormat
Reads the configuration to determine which topics, partitions, and offsets should be used for reading data.
getOffsets() - Method in class org.apache.crunch.kafka.offset.hdfs.Offsets
The collection of offset information for specific topics and partitions.
getOnlyParent() - Method in class org.apache.crunch.impl.dist.collect.PCollectionImpl
 
getOutputCommitter(TaskAttemptContext) - Static method in class org.apache.crunch.io.CrunchOutputs
 
getOutputMapFn() - Method in class org.apache.crunch.types.avro.AvroType
 
getOutputMapFn() - Method in interface org.apache.crunch.types.PType
 
getOutputMapFn() - Method in class org.apache.crunch.types.writable.WritableType
 
getParallelDoOptions() - Method in class org.apache.crunch.impl.dist.collect.PCollectionImpl
 
getParallelism() - Method in class org.apache.crunch.CreateOptions
 
getParents() - Method in class org.apache.crunch.impl.dist.collect.BaseDoCollection
 
getParents() - Method in class org.apache.crunch.impl.dist.collect.BaseDoTable
 
getParents() - Method in class org.apache.crunch.impl.dist.collect.BaseGroupedTable
 
getParents() - Method in class org.apache.crunch.impl.dist.collect.BaseInputCollection
 
getParents() - Method in class org.apache.crunch.impl.dist.collect.BaseInputTable
 
getParents() - Method in class org.apache.crunch.impl.dist.collect.BaseUnionCollection
 
getParents() - Method in class org.apache.crunch.impl.dist.collect.BaseUnionTable
 
getParents() - Method in class org.apache.crunch.impl.dist.collect.EmptyPCollection
 
getParents() - Method in class org.apache.crunch.impl.dist.collect.EmptyPTable
 
getParents() - Method in class org.apache.crunch.impl.dist.collect.PCollectionImpl
 
getParents() - Method in class org.apache.crunch.impl.spark.collect.CreatedCollection
 
getParents() - Method in class org.apache.crunch.impl.spark.collect.CreatedTable
 
getPartition(Object) - Method in class org.apache.crunch.impl.spark.SparkPartitioner
 
getPartition() - Method in class org.apache.crunch.kafka.offset.hdfs.Offsets.PartitionOffset
Returns the partition
getPartition(Object, Object, int) - Method in class org.apache.crunch.lib.join.JoinUtils.AvroIndexedRecordPartitioner
 
getPartition(TupleWritable, Writable, int) - Method in class org.apache.crunch.lib.join.JoinUtils.TupleWritablePartitioner
 
getPartition(K, V, int) - Method in class org.apache.crunch.lib.sort.TotalOrderPartitioner
 
getPartitionerClass() - Method in class org.apache.crunch.GroupingOptions
 
getPartitionerClass(PTypeFamily) - Static method in class org.apache.crunch.lib.join.JoinUtils
 
getPartitionFile(Configuration) - Static method in class org.apache.crunch.lib.sort.TotalOrderPartitioner
 
getPath() - Method in interface org.apache.crunch.io.PathTarget
 
getPath(String) - Method in class org.apache.crunch.test.TemporaryPath
Get a Path below the temporary directory.
getPathSize(Configuration, Path) - Static method in class org.apache.crunch.io.SourceTargetHelper
 
getPathSize(FileSystem, Path) - Static method in class org.apache.crunch.io.SourceTargetHelper
 
getPathToCacheFile(Path, Configuration) - Static method in class org.apache.crunch.util.DistCache
 
getPersistedTimeStoragePath(Path, long) - Static method in class org.apache.crunch.kafka.offset.hdfs.HDFSOffsetWriter
Creates a Path for storing the offsets for a specified persistedTime.
getPipeline() - Method in class org.apache.crunch.impl.dist.collect.PCollectionImpl
 
getPipeline() - Method in interface org.apache.crunch.PCollection
Returns the Pipeline associated with this PCollection.
getPlanDotFile() - Method in class org.apache.crunch.impl.spark.SparkRuntime
 
getPlanDotFile() - Method in interface org.apache.crunch.PipelineExecution
Returns the .dot file that allows a client to graph the Crunch execution plan for this pipeline.
getPrepareHooks() - Method in class org.apache.crunch.impl.mr.MRPipeline
 
getPrimitiveType(Class<T>) - Static method in class org.apache.crunch.types.avro.Avros
 
getPrimitiveType(Class<T>) - Static method in class org.apache.crunch.types.writable.Writables
 
getProgress() - Method in class org.apache.crunch.kafka.inputformat.KafkaRecordReader
 
getPTableType() - Method in class org.apache.crunch.impl.dist.collect.BaseDoTable
 
getPTableType() - Method in class org.apache.crunch.impl.dist.collect.BaseInputTable
 
getPTableType() - Method in class org.apache.crunch.impl.dist.collect.BaseUnionTable
 
getPTableType() - Method in class org.apache.crunch.impl.dist.collect.EmptyPTable
 
getPTableType() - Method in class org.apache.crunch.impl.spark.collect.CreatedTable
 
getPTableType() - Method in interface org.apache.crunch.PTable
Returns the PTableType of this PTable.
getPType(PTypeFamily) - Method in interface org.apache.crunch.contrib.text.Extractor
Returns the PType associated with this data type for the given PTypeFamily.
getPType() - Method in class org.apache.crunch.impl.dist.collect.BaseDoCollection
 
getPType() - Method in class org.apache.crunch.impl.dist.collect.BaseDoTable
 
getPType() - Method in class org.apache.crunch.impl.dist.collect.BaseGroupedTable
 
getPType() - Method in class org.apache.crunch.impl.dist.collect.BaseInputCollection
 
getPType() - Method in class org.apache.crunch.impl.dist.collect.BaseInputTable
 
getPType() - Method in class org.apache.crunch.impl.dist.collect.BaseUnionCollection
 
getPType() - Method in class org.apache.crunch.impl.dist.collect.BaseUnionTable
 
getPType() - Method in class org.apache.crunch.impl.dist.collect.EmptyPCollection
 
getPType() - Method in class org.apache.crunch.impl.dist.collect.EmptyPTable
 
getPType() - Method in class org.apache.crunch.impl.spark.collect.CreatedCollection
 
getPType() - Method in class org.apache.crunch.impl.spark.collect.CreatedTable
 
getPType() - Method in interface org.apache.crunch.PCollection
Returns the PType of this PCollection.
getReader(Schema) - Method in class org.apache.crunch.types.avro.AvroMode
Creates a DatumReader based on the schema.
getReader(Schema) - Method in interface org.apache.crunch.types.avro.ReaderWriterFactory
 
getReader(Schema) - Method in class org.apache.crunch.types.avro.ReflectDataFactory
 
getRecommendedPartitions(PCollection<T>) - Static method in class org.apache.crunch.util.PartitionUtils
 
getRecommendedPartitions(PCollection<T>, Configuration) - Static method in class org.apache.crunch.util.PartitionUtils
 
getRecordType() - Method in class org.apache.crunch.types.avro.AvroType
 
getRecordWriter(TaskAttemptContext) - Method in class org.apache.crunch.types.avro.AvroOutputFormat
 
getRecordWriter(TaskAttemptContext) - Method in class org.apache.crunch.types.avro.AvroPathPerKeyOutputFormat
 
getRecordWriter(TaskAttemptContext) - Method in class org.apache.crunch.types.avro.AvroTextOutputFormat
 
getReduceOutputName(Configuration, Path, int) - Method in interface org.apache.crunch.io.FileNamingScheme
Get the output file name for a reduce task.
getReduceOutputName(Configuration, Path, int) - Method in class org.apache.crunch.io.SequentialFileNamingScheme
 
getReflectDataFactory(Configuration) - Static method in class org.apache.crunch.types.avro.Avros
Deprecated.
as of 0.9.0; use AvroMode.fromConfiguration(conf)
getResult() - Method in class org.apache.crunch.impl.spark.SparkRuntime
 
getResult() - Method in interface org.apache.crunch.PipelineExecution
Retrieve the result of a pipeline if it has been completed, otherwise null.
getRootFile() - Method in class org.apache.crunch.test.TemporaryPath
Get the root directory which will be deleted automatically.
getRootFileName() - Method in class org.apache.crunch.test.TemporaryPath
Get the root directory as an absolute file name.
getRootPath() - Method in class org.apache.crunch.test.TemporaryPath
Get the root directory as a Path.
getRuntimeContext() - Method in class org.apache.crunch.impl.spark.SparkRuntime
 
getSalary() - Method in class org.apache.crunch.test.Employee.Builder
Gets the value of the 'salary' field
getSalary() - Method in class org.apache.crunch.test.Employee
Gets the value of the 'salary' field.
getSchema() - Method in class org.apache.crunch.test.Employee
 
getSchema() - Method in class org.apache.crunch.test.Person
 
getSchema() - Method in class org.apache.crunch.types.avro.AvroType
 
getSecond() - Method in class org.apache.crunch.fn.CompositeMapFn
 
getSerializationClass() - Method in class org.apache.crunch.types.writable.WritableType
 
getSiblingnames() - Method in class org.apache.crunch.test.Person.Builder
Gets the value of the 'siblingnames' field
getSiblingnames() - Method in class org.apache.crunch.test.Person
Gets the value of the 'siblingnames' field.
getSize(Configuration) - Method in class org.apache.crunch.contrib.io.jdbc.DataBaseSource
 
getSize() - Method in class org.apache.crunch.impl.dist.collect.PCollectionImpl
 
getSize(Configuration) - Method in class org.apache.crunch.kafka.KafkaSource
 
getSize() - Method in interface org.apache.crunch.PCollection
Returns the size of the data represented by this PCollection in bytes.
getSize(Configuration) - Method in interface org.apache.crunch.Source
Returns the number of bytes in this Source.
getSortComparatorClass() - Method in class org.apache.crunch.GroupingOptions
 
getSource() - Method in class org.apache.crunch.impl.dist.collect.BaseInputCollection
 
getSource() - Method in class org.apache.crunch.impl.dist.collect.BaseInputTable
 
getSourceTargets() - Method in class org.apache.crunch.GroupingOptions
 
getSourceTargets() - Method in class org.apache.crunch.ParallelDoOptions
Deprecated.
getSourceTargets() - Method in interface org.apache.crunch.ReadableData
 
getSourceTargets() - Method in class org.apache.crunch.util.DelegatingReadableData
 
getSourceTargets() - Method in class org.apache.crunch.util.UnionReadableData
 
getSparkContext() - Method in class org.apache.crunch.impl.spark.SparkRuntime
 
getSpecificClassLoader() - Static method in class org.apache.crunch.types.avro.AvroMode
Get the configured ClassLoader to be used for loading Avro org.apache.avro.specific.SpecificRecord and reflection implementation classes.
getSplits(JobContext) - Method in class org.apache.crunch.kafka.inputformat.KafkaInputFormat
 
getStageId() - Method in class org.apache.crunch.PipelineResult.StageResult
 
getStageName() - Method in class org.apache.crunch.PipelineResult.StageResult
 
getStageResults() - Method in class org.apache.crunch.PipelineResult
 
getStartingOffset() - Method in class org.apache.crunch.kafka.inputformat.KafkaInputSplit
Returns the starting offset for the split
getStartTimeMsec() - Method in class org.apache.crunch.PipelineResult.StageResult
 
getStats() - Method in class org.apache.crunch.contrib.text.AbstractCompositeExtractor
 
getStats() - Method in class org.apache.crunch.contrib.text.AbstractSimpleExtractor
 
getStats() - Method in interface org.apache.crunch.contrib.text.Extractor
Return statistics about how many errors this Extractor instance encountered while parsing input data.
getStatus() - Method in class org.apache.crunch.impl.spark.SparkRuntime
 
getStatus() - Method in interface org.apache.crunch.PipelineExecution
 
getStorageLevel(PCollection<?>) - Method in class org.apache.crunch.impl.spark.SparkRuntime
 
getStoredOffsetPersistenceTimes() - Method in class org.apache.crunch.kafka.offset.AbstractOffsetReader
 
getStoredOffsetPersistenceTimes() - Method in class org.apache.crunch.kafka.offset.hdfs.HDFSOffsetReader
 
getStoredOffsetPersistenceTimes() - Method in interface org.apache.crunch.kafka.offset.OffsetReader
Returns the list of available persistence times at which offsets have been written to the underlying storage mechanism.
getStructFieldData(Object, StructField) - Method in class org.apache.crunch.types.orc.TupleObjectInspector
 
getStructFieldRef(String) - Method in class org.apache.crunch.types.orc.TupleObjectInspector
 
getStructFieldsDataAsList(Object) - Method in class org.apache.crunch.types.orc.TupleObjectInspector
 
getSubTypes() - Method in class org.apache.crunch.types.avro.AvroType
 
getSubTypes() - Method in class org.apache.crunch.types.PGroupedTableType
 
getSubTypes() - Method in interface org.apache.crunch.types.PType
Returns the sub-types that make up this PType if it is a composite instance, such as a tuple.
getSubTypes() - Method in class org.apache.crunch.types.writable.WritableType
 
getTableType() - Method in class org.apache.crunch.kafka.KafkaSource
 
getTableType() - Method in interface org.apache.crunch.TableSource
 
getTableType() - Method in class org.apache.crunch.types.PGroupedTableType
 
getTargetDependencies() - Method in class org.apache.crunch.impl.dist.collect.BaseGroupedTable
 
getTargetDependencies() - Method in class org.apache.crunch.impl.dist.collect.PCollectionImpl
 
getTargets() - Method in class org.apache.crunch.ParallelDoOptions
 
getTestContext(Configuration) - Static method in class org.apache.crunch.test.CrunchTestSupport
The method creates a TaskInputOutputContext which can be used in unit tests.
getTopic() - Method in class org.apache.crunch.kafka.offset.hdfs.Offsets.PartitionOffset
Returns the topic
getTopicPartition() - Method in class org.apache.crunch.kafka.inputformat.KafkaInputSplit
Returns the topic and partition for the split
getTupleFactory(Class<T>) - Static method in class org.apache.crunch.types.TupleFactory
Get the TupleFactory for a given Tuple implementation.
getType() - Method in class org.apache.crunch.kafka.KafkaSource
 
getType() - Method in interface org.apache.crunch.Source
Returns the PType for this source.
getTypeClass() - Method in class org.apache.crunch.types.avro.AvroType
 
getTypeClass() - Method in interface org.apache.crunch.types.PType
Returns the Java type represented by this PType.
getTypeClass() - Method in class org.apache.crunch.types.writable.WritableType
 
getTypeFamily() - Method in class org.apache.crunch.impl.dist.collect.PCollectionImpl
 
getTypeFamily() - Method in interface org.apache.crunch.PCollection
Returns the PTypeFamily of this PCollection.
getTypeInfo(Class<?>) - Static method in class org.apache.crunch.types.orc.OrcUtils
Generate TypeInfo for a given Java class based on reflection
getTypeName() - Method in class org.apache.crunch.types.orc.TupleObjectInspector
 
getValue() - Method in interface org.apache.crunch.PObject
Gets the value associated with this PObject.
getValue() - Method in class org.apache.crunch.types.writable.UnionWritable
 
getValue() - Method in class org.apache.crunch.Union
Returns the underlying object value of the record.
getValue() - Method in class org.apache.hadoop.mapred.SparkCounter
 
getValueClass() - Method in interface org.apache.crunch.types.Converter
 
getValues() - Method in class org.apache.crunch.TupleN
 
getValueType() - Method in class org.apache.crunch.impl.dist.collect.PTableBase
 
getValueType() - Method in interface org.apache.crunch.PTable
Returns the PType of the value.
getValueType() - Method in interface org.apache.crunch.types.PTableType
Returns the value type for the table.
getWriter(Schema) - Method in class org.apache.crunch.types.avro.AvroMode
Creates a DatumWriter based on the schema.
getWriter(Schema) - Method in interface org.apache.crunch.types.avro.ReaderWriterFactory
 
getWriter(Schema) - Method in class org.apache.crunch.types.avro.ReflectDataFactory
 
globalToplist(PCollection<X>) - Static method in class org.apache.crunch.lib.TopList
Create a list of unique items in the input collection with their count, sorted descending by their frequency.
groupByKey() - Method in class org.apache.crunch.impl.dist.collect.PTableBase
 
groupByKey(int) - Method in class org.apache.crunch.impl.dist.collect.PTableBase
 
groupByKey(GroupingOptions) - Method in class org.apache.crunch.impl.dist.collect.PTableBase
 
groupByKey() - Method in interface org.apache.crunch.lambda.LTable
Group this table by key to yield an LGroupedTable
groupByKey(int) - Method in interface org.apache.crunch.lambda.LTable
Group this table by key to yield an LGroupedTable
groupByKey(GroupingOptions) - Method in interface org.apache.crunch.lambda.LTable
Group this table by key to yield an LGroupedTable
groupByKey() - Method in interface org.apache.crunch.PTable
Performs a grouping operation on the keys of this table.
groupByKey(int) - Method in interface org.apache.crunch.PTable
Performs a grouping operation on the keys of this table, using the given number of partitions.
groupByKey(GroupingOptions) - Method in interface org.apache.crunch.PTable
Performs a grouping operation on the keys of this table, using the additional GroupingOptions to control how the grouping is executed.
groupedWeightedReservoirSample(PTable<Integer, Pair<T, N>>, int[]) - Static method in class org.apache.crunch.lib.Sample
The most general-purpose of the weighted reservoir sampling patterns, which allows us to choose a random sample of elements for each of N input groups.
groupedWeightedReservoirSample(PTable<Integer, Pair<T, N>>, int[], Long) - Static method in class org.apache.crunch.lib.Sample
Same as the other groupedWeightedReservoirSample method, but include a seed for testing purposes.
groupingComparatorClass(Class<? extends RawComparator>) - Method in class org.apache.crunch.GroupingOptions.Builder
 
GroupingOptions - Class in org.apache.crunch
Options that can be passed to a groupByKey operation in order to exercise finer control over how the partitioning, grouping, and sorting of keys is performed.
GroupingOptions.Builder - Class in org.apache.crunch
Builder class for creating GroupingOptions instances.
GuavaUtils - Class in org.apache.crunch.impl.spark
 
GuavaUtils() - Constructor for class org.apache.crunch.impl.spark.GuavaUtils
 
gzip(T) - Static method in class org.apache.crunch.io.Compress
Configure the given output target to be compressed using Gzip.

H

handleExisting(Target.WriteMode, long, Configuration) - Method in interface org.apache.crunch.Target
Apply the given WriteMode to this Target instance.
handleOutputs(Configuration, Path, int) - Method in interface org.apache.crunch.io.PathTarget
Handles moving the output data for this target from a temporary location on the filesystem to its target path at the end of a MapReduce job.
has(int) - Method in class org.apache.crunch.types.writable.TupleWritable
Return true if tuple has an element at the position provided.
hasAge() - Method in class org.apache.crunch.test.Person.Builder
Checks whether the 'age' field has been set
hasDepartment() - Method in class org.apache.crunch.test.Employee.Builder
Checks whether the 'department' field has been set
hashCode() - Method in class org.apache.crunch.impl.dist.collect.BaseInputCollection
 
hashCode() - Method in class org.apache.crunch.impl.dist.collect.BaseInputTable
 
hashCode() - Method in class org.apache.crunch.impl.spark.ByteArray
 
hashCode() - Method in class org.apache.crunch.impl.spark.IntByteArray
 
hashCode() - Method in class org.apache.crunch.io.FormatBundle
 
hashCode() - Method in class org.apache.crunch.kafka.offset.hdfs.Offsets
 
hashCode() - Method in class org.apache.crunch.kafka.offset.hdfs.Offsets.PartitionOffset
 
hashCode() - Method in class org.apache.crunch.lib.Quantiles.Result
 
hashCode() - Method in class org.apache.crunch.Pair
 
hashCode() - Method in class org.apache.crunch.Tuple3
 
hashCode() - Method in class org.apache.crunch.Tuple4
 
hashCode() - Method in class org.apache.crunch.TupleN
 
hashCode() - Method in class org.apache.crunch.types.avro.AvroMode
 
hashCode() - Method in class org.apache.crunch.types.avro.AvroType
 
hashCode() - Method in class org.apache.crunch.types.writable.TupleWritable
 
hashCode() - Method in class org.apache.crunch.types.writable.WritableType
 
hashCode() - Method in class org.apache.crunch.Union
 
HashUtil - Class in org.apache.crunch.util
Utility methods for working with hash codes.
HashUtil() - Constructor for class org.apache.crunch.util.HashUtil
 
hasName() - Method in class org.apache.crunch.test.Employee.Builder
Checks whether the 'name' field has been set
hasName() - Method in class org.apache.crunch.test.Person.Builder
Checks whether the 'name' field has been set
hasNext() - Method in class org.apache.crunch.contrib.text.Tokenizer
Returns true if the underlying Scanner has any tokens remaining.
hasNext() - Method in class org.apache.crunch.util.DoFnIterator
 
hasReflect() - Method in class org.apache.crunch.types.avro.AvroType
Determine if the wrapped type is a reflection-based Avro type or wraps one.
hasSalary() - Method in class org.apache.crunch.test.Employee.Builder
Checks whether the 'salary' field has been set
hasSiblingnames() - Method in class org.apache.crunch.test.Person.Builder
Checks whether the 'siblingnames' field has been set
hasSpecific() - Method in class org.apache.crunch.types.avro.AvroType
Determine if the wrapped type is a specific data Avro type or wraps one.
HDFSOffsetReader - Class in org.apache.crunch.kafka.offset.hdfs
Reader implementation that reads offset information from HDFS.
HDFSOffsetReader(Configuration, Path) - Constructor for class org.apache.crunch.kafka.offset.hdfs.HDFSOffsetReader
Creates a reader instance for interacting with the storage specified by the config and with the base storage path of baseStoragePath.
HDFSOffsetWriter - Class in org.apache.crunch.kafka.offset.hdfs
Offset writer implementation that stores the offsets in HDFS.
HDFSOffsetWriter(Configuration, Path) - Constructor for class org.apache.crunch.kafka.offset.hdfs.HDFSOffsetWriter
Creates a writer instance for interacting with the storage specified by the config and with the base storage path of baseStoragePath.

I

id - Variable in class org.apache.crunch.contrib.io.jdbc.IdentifiableName
 
IdentifiableName - Class in org.apache.crunch.contrib.io.jdbc
 
IdentifiableName() - Constructor for class org.apache.crunch.contrib.io.jdbc.IdentifiableName
 
IdentityFn<T> - Class in org.apache.crunch.fn
 
immutableType(Class<T>, Class<W>, MapFn<W, T>, MapFn<T, W>, PType...) - Static method in class org.apache.crunch.types.writable.WritableType
Factory method for a new WritableType instance whose type class is immutable.
increment(Enum<?>) - Method in interface org.apache.crunch.lambda.LCollection
Increment a counter for every element in the collection
increment(String, String) - Method in interface org.apache.crunch.lambda.LCollection
Increment a counter for every element in the collection
increment(String, String) - Method in interface org.apache.crunch.lambda.LDoFnContext
Increment a counter by 1
increment(String, String, long) - Method in interface org.apache.crunch.lambda.LDoFnContext
Increment a counter by value
increment(Enum<?>) - Method in interface org.apache.crunch.lambda.LDoFnContext
Increment a counter by 1
increment(Enum<?>, long) - Method in interface org.apache.crunch.lambda.LDoFnContext
Increment a counter by value
increment(Enum<?>) - Method in interface org.apache.crunch.lambda.LTable
Increment a counter for every element in the collection
increment(String, String) - Method in interface org.apache.crunch.lambda.LTable
Increment a counter for every element in the collection
increment(long) - Method in class org.apache.hadoop.mapred.SparkCounter
 
incrementIf(Enum<?>, SPredicate<S>) - Method in interface org.apache.crunch.lambda.LCollection
Increment a counter for every element satisfying the conditional predicate supplied.
incrementIf(String, String, SPredicate<S>) - Method in interface org.apache.crunch.lambda.LCollection
Increment a counter for every element satisfying the conditional predicate supplied.
incrementIf(Enum<?>, SPredicate<Pair<K, V>>) - Method in interface org.apache.crunch.lambda.LTable
Increment a counter for every element satisfying the conditional predicate supplied.
incrementIf(String, String, SPredicate<Pair<K, V>>) - Method in interface org.apache.crunch.lambda.LTable
Increment a counter for every element satisfying the conditional predicate supplied.
initialize(Configuration) - Method in interface org.apache.crunch.Aggregator
Perform any setup of this instance that is required prior to processing inputs.
initialize() - Method in class org.apache.crunch.contrib.bloomfilter.BloomFilterFn
 
initialize() - Method in class org.apache.crunch.contrib.text.AbstractCompositeExtractor
 
initialize() - Method in class org.apache.crunch.contrib.text.AbstractSimpleExtractor
 
initialize() - Method in interface org.apache.crunch.contrib.text.Extractor
Perform any initialization required by this Extractor during the start of a map or reduce task.
initialize() - Method in class org.apache.crunch.DoFn
Initialize this DoFn.
initialize(Configuration) - Method in class org.apache.crunch.fn.Aggregators.SimpleAggregator
 
initialize() - Method in class org.apache.crunch.fn.CompositeMapFn
 
initialize() - Method in class org.apache.crunch.fn.ExtractKeyFn
 
initialize() - Method in class org.apache.crunch.fn.PairMapFn
 
initialize(DoFn<?, ?>, Integer) - Method in class org.apache.crunch.impl.spark.SparkRuntimeContext
 
initialize(InputSplit, TaskAttemptContext) - Method in class org.apache.crunch.kafka.inputformat.KafkaRecordReader
 
initialize() - Method in class org.apache.crunch.lib.Aggregate.TopKCombineFn
 
initialize() - Method in class org.apache.crunch.lib.Aggregate.TopKFn
 
initialize() - Method in class org.apache.crunch.lib.join.FullOuterJoinFn
Initialize this DoFn.
initialize() - Method in class org.apache.crunch.lib.join.InnerJoinFn
 
initialize() - Method in class org.apache.crunch.lib.join.JoinFn
 
initialize() - Method in class org.apache.crunch.lib.join.LeftOuterJoinFn
Initialize this DoFn.
initialize() - Method in class org.apache.crunch.lib.join.RightOuterJoinFn
Initialize this DoFn.
initialize() - Method in class org.apache.crunch.lib.sort.SortFns.AvroGenericFn
 
initialize(Configuration) - Method in class org.apache.crunch.types.avro.AvroDerivedValueDeepCopier
 
initialize(Configuration) - Method in class org.apache.crunch.types.avro.AvroType
 
initialize(Configuration) - Method in class org.apache.crunch.types.CollectionDeepCopier
 
initialize(Configuration) - Method in interface org.apache.crunch.types.DeepCopier
Initialize the deep copier with a job-specific configuration
initialize(Configuration) - Method in class org.apache.crunch.types.MapDeepCopier
 
initialize(Configuration) - Method in class org.apache.crunch.types.NoOpDeepCopier
 
initialize() - Method in class org.apache.crunch.types.PGroupedTableType.PairIterableMapFn
 
initialize(Configuration) - Method in interface org.apache.crunch.types.PType
Initialize this PType for use within a DoFn.
initialize(Configuration) - Method in class org.apache.crunch.types.TupleDeepCopier
 
initialize() - Method in class org.apache.crunch.types.TupleFactory
 
initialize(Configuration) - Method in class org.apache.crunch.types.UnionDeepCopier
 
initialize(Configuration) - Method in class org.apache.crunch.types.writable.WritableDeepCopier
 
initialize(Configuration) - Method in class org.apache.crunch.types.writable.WritableType
 
inMemory(PTable<K, V>, double, double...) - Static method in class org.apache.crunch.lib.Quantiles
Calculate a set of quantiles for each key in a numerically-valued table.
innerJoin(PTable<K, U>, PTable<K, V>) - Static method in class org.apache.crunch.lib.Join
Performs an inner join on the specified PTables.
InnerJoinFn<K,U,V> - Class in org.apache.crunch.lib.join
Used to perform the last step of an inner join.
InnerJoinFn(PType<K>, PType<U>) - Constructor for class org.apache.crunch.lib.join.InnerJoinFn
 
InputCollection<S> - Class in org.apache.crunch.impl.spark.collect
 
inputConf(String, String) - Method in class org.apache.crunch.kafka.KafkaSource
 
inputConf(String, String) - Method in interface org.apache.crunch.Source
Adds the given key-value pair to the Configuration instance that is used to read this Source<T>.
InputConverterFunction<K,V,S> - Class in org.apache.crunch.impl.spark.fn
 
InputConverterFunction(Converter<K, V, S, ?>) - Constructor for class org.apache.crunch.impl.spark.fn.InputConverterFunction
 
InputTable<K,V> - Class in org.apache.crunch.impl.spark.collect
 
InputTable(TableSource<K, V>, String, DistributedPipeline, ParallelDoOptions) - Constructor for class org.apache.crunch.impl.spark.collect.InputTable
 
IntByteArray - Class in org.apache.crunch.impl.spark
 
IntByteArray(int, ByteArray) - Constructor for class org.apache.crunch.impl.spark.IntByteArray
 
intersection(PCollection<T>, PCollection<T>) - Static method in class org.apache.crunch.lib.Set
Compute the intersection of two sets of elements.
ints() - Static method in class org.apache.crunch.types.avro.Avros
 
ints() - Method in class org.apache.crunch.types.avro.AvroTypeFamily
 
ints() - Method in interface org.apache.crunch.types.PTypeFamily
 
ints() - Static method in class org.apache.crunch.types.writable.Writables
 
ints() - Method in class org.apache.crunch.types.writable.WritableTypeFamily
 
isBreakpoint() - Method in class org.apache.crunch.impl.dist.collect.PCollectionImpl
 
isCompatibleWith(GroupingOptions) - Method in class org.apache.crunch.GroupingOptions
 
isGeneric() - Method in class org.apache.crunch.types.avro.AvroType
Determine if the wrapped type is a generic data avro type.
isValid(JavaRDDLike<?, ?>) - Method in class org.apache.crunch.impl.spark.SparkRuntime
 
iterator() - Method in class org.apache.crunch.impl.SingleUseIterable
 
iterator() - Method in class org.apache.crunch.impl.spark.fn.CrunchIterable
 
iterator() - Method in class org.apache.crunch.io.CompositePathIterable
 
iterator() - Method in class org.apache.crunch.util.Tuples.PairIterable
 
iterator() - Method in class org.apache.crunch.util.Tuples.QuadIterable
 
iterator() - Method in class org.apache.crunch.util.Tuples.TripIterable
 
iterator() - Method in class org.apache.crunch.util.Tuples.TupleNIterable
 

J

join(PTable<K, U>) - Method in class org.apache.crunch.impl.dist.collect.PTableBase
 
join(LTable<K, U>, JoinType, JoinStrategy<K, V, U>) - Method in interface org.apache.crunch.lambda.LTable
Join this table to another LTable which has the same key type using the provided JoinType and JoinStrategy
join(LTable<K, U>, JoinType) - Method in interface org.apache.crunch.lambda.LTable
Join this table to another LTable which has the same key type using the provided JoinType and the DefaultJoinStrategy (reduce-side join).
join(LTable<K, U>) - Method in interface org.apache.crunch.lambda.LTable
Inner join this table to another LTable which has the same key type using a reduce-side join
Join - Class in org.apache.crunch.lib
Utilities for joining multiple PTable instances based on a common key.
Join() - Constructor for class org.apache.crunch.lib.Join
 
join(PTable<K, U>, PTable<K, V>, JoinType) - Method in class org.apache.crunch.lib.join.BloomFilterJoinStrategy
 
join(PTable<K, U>, PTable<K, V>, JoinType) - Method in class org.apache.crunch.lib.join.DefaultJoinStrategy
 
join(PTable<K, U>, PTable<K, V>, JoinFn<K, U, V>) - Method in class org.apache.crunch.lib.join.DefaultJoinStrategy
Perform a default join on the given PTable instances using a user-specified JoinFn.
join(K, int, Iterable<Pair<U, V>>, Emitter<Pair<K, Pair<U, V>>>) - Method in class org.apache.crunch.lib.join.FullOuterJoinFn
Performs the actual joining.
join(K, int, Iterable<Pair<U, V>>, Emitter<Pair<K, Pair<U, V>>>) - Method in class org.apache.crunch.lib.join.InnerJoinFn
 
join(PTable<K, U>, PTable<K, V>) - Static method in class org.apache.crunch.lib.Join
Performs an inner join on the specified PTables.
join(K, int, Iterable<Pair<U, V>>, Emitter<Pair<K, Pair<U, V>>>) - Method in class org.apache.crunch.lib.join.JoinFn
Performs the actual joining.
join(PTable<K, U>, PTable<K, V>, JoinType) - Method in interface org.apache.crunch.lib.join.JoinStrategy
Join two tables with the given join type.
join(K, int, Iterable<Pair<U, V>>, Emitter<Pair<K, Pair<U, V>>>) - Method in class org.apache.crunch.lib.join.LeftOuterJoinFn
Performs the actual joining.
join(PTable<K, U>, PTable<K, V>, JoinType) - Method in class org.apache.crunch.lib.join.MapsideJoinStrategy
 
join(K, int, Iterable<Pair<U, V>>, Emitter<Pair<K, Pair<U, V>>>) - Method in class org.apache.crunch.lib.join.RightOuterJoinFn
Performs the actual joining.
join(PTable<K, U>, PTable<K, V>, JoinType) - Method in class org.apache.crunch.lib.join.ShardedJoinStrategy
 
join(PTable<K, U>) - Method in interface org.apache.crunch.PTable
Perform an inner join on this table and the one passed in as an argument on their common keys.
JoinFn<K,U,V> - Class in org.apache.crunch.lib.join
Represents a DoFn for performing joins.
JoinFn(PType<K>, PType<U>) - Constructor for class org.apache.crunch.lib.join.JoinFn
Instantiate with the PType of the value of the left side of the join (used for creating deep copies of values).
JoinStrategy<K,U,V> - Interface in org.apache.crunch.lib.join
Defines a strategy for joining two PTables together on a common key.
JoinType - Enum in org.apache.crunch.lib.join
Specifies the specific behavior of how a join should be performed in terms of requiring matching keys on both sides of the join.
JoinUtils - Class in org.apache.crunch.lib.join
Utilities that are useful in joining multiple data sets via MapReduce.
JoinUtils() - Constructor for class org.apache.crunch.lib.join.JoinUtils
 
JoinUtils.AvroIndexedRecordPartitioner - Class in org.apache.crunch.lib.join
 
JoinUtils.AvroPairGroupingComparator<T> - Class in org.apache.crunch.lib.join
 
JoinUtils.TupleWritableComparator - Class in org.apache.crunch.lib.join
 
JoinUtils.TupleWritablePartitioner - Class in org.apache.crunch.lib.join
 
jsons(Class<T>) - Static method in class org.apache.crunch.types.avro.Avros
 
jsons(Class<T>) - Static method in class org.apache.crunch.types.writable.Writables
 
jsonString(Class<T>, PTypeFamily) - Static method in class org.apache.crunch.types.PTypes
Constructs a PType for reading a Java type from a JSON string using Jackson's ObjectMapper.

K

KAFKA_EMPTY_RETRY_ATTEMPTS_KEY - Static variable in class org.apache.crunch.kafka.KafkaUtils
Configuration property for the number of retry attempts that will be made to Kafka in the event of getting empty responses.
KAFKA_RETRY_ATTEMPTS_DEFAULT - Static variable in class org.apache.crunch.kafka.KafkaUtils
Default number of retry attempts.
KAFKA_RETRY_ATTEMPTS_DEFAULT_STRING - Static variable in class org.apache.crunch.kafka.KafkaUtils
 
KAFKA_RETRY_ATTEMPTS_KEY - Static variable in class org.apache.crunch.kafka.KafkaUtils
Configuration property for the number of retry attempts that will be made to Kafka.
KAFKA_RETRY_EMPTY_ATTEMPTS_DEFAULT - Static variable in class org.apache.crunch.kafka.KafkaUtils
Default number of empty retry attempts.
KAFKA_RETRY_EMPTY_ATTEMPTS_DEFAULT_STRING - Static variable in class org.apache.crunch.kafka.KafkaUtils
 
KafkaInputFormat - Class in org.apache.crunch.kafka.inputformat
Basic input format for reading data from Kafka.
KafkaInputFormat() - Constructor for class org.apache.crunch.kafka.inputformat.KafkaInputFormat
 
KafkaInputSplit - Class in org.apache.crunch.kafka.inputformat
InputSplit that represents retrieving data from a single TopicPartition between the specified start and end offsets.
KafkaInputSplit() - Constructor for class org.apache.crunch.kafka.inputformat.KafkaInputSplit
Nullary Constructor for creating the instance inside the Mapper instance.
KafkaInputSplit(String, int, long, long) - Constructor for class org.apache.crunch.kafka.inputformat.KafkaInputSplit
Constructs an input split for the provided topic and partition restricting data to be between the startingOffset and endingOffset
KafkaRecordReader<K,V> - Class in org.apache.crunch.kafka.inputformat
A RecordReader for pulling data from Kafka.
KafkaRecordReader() - Constructor for class org.apache.crunch.kafka.inputformat.KafkaRecordReader
 
KafkaSource - Class in org.apache.crunch.kafka
A Crunch Source that will retrieve events from Kafka given start and end offsets.
KafkaSource(Properties, Map<TopicPartition, Pair<Long, Long>>) - Constructor for class org.apache.crunch.kafka.KafkaSource
Constructs a Kafka source that will read data from the Kafka cluster identified by the kafkaConnectionProperties and from the specific topics and partitions identified in the offsets
KafkaSource.BytesDeserializer - Class in org.apache.crunch.kafka
Basic Deserializer which simply wraps the payload as a BytesWritable.
KafkaUtils - Class in org.apache.crunch.kafka
Simple utilities for retrieving offset and Kafka information to assist in setting up and configuring a KafkaSource instance.
KafkaUtils() - Constructor for class org.apache.crunch.kafka.KafkaUtils
 
keep(Integer...) - Method in class org.apache.crunch.contrib.text.TokenizerFactory.Builder
Keep only the specified fields found by the input scanner, counting from zero.
keyClass - Variable in class org.apache.crunch.io.CrunchOutputs.OutputConfig
 
KeyExtraction(PType<V>, Sort.ColumnOrder[]) - Constructor for class org.apache.crunch.lib.sort.SortFns.KeyExtraction
 
keys() - Method in class org.apache.crunch.impl.dist.collect.PTableBase
 
keys() - Method in interface org.apache.crunch.lambda.LTable
Get an LCollection containing just the keys from this table
keys(PTable<K, V>) - Static method in class org.apache.crunch.lib.PTables
Extract the keys from the given PTable<K, V> as a PCollection<K>.
keys() - Method in interface org.apache.crunch.PTable
Returns a PCollection made up of the keys in this PTable.
keyType() - Method in interface org.apache.crunch.lambda.LGroupedTable
Get a PType which can be used to serialize the key part of this grouped table
keyType() - Method in interface org.apache.crunch.lambda.LTable
Get a PType which can be used to serialize the key part of this table
keyValueTableOf(PType<K>, PType<V>) - Static method in class org.apache.crunch.types.avro.Avros
A table type with an Avro type as key and value.
kill() - Method in class org.apache.crunch.impl.spark.SparkRuntime
 
kill() - Method in interface org.apache.crunch.PipelineExecution
Kills the pipeline if it is running, no-op otherwise.

L

LAggregator<V,A> - Class in org.apache.crunch.lambda
Crunch Aggregator expressed as a composition of functional interface implementations
LAggregator(SSupplier<A>, SBiFunction<A, V, A>, SFunction<A, Iterable<V>>) - Constructor for class org.apache.crunch.lambda.LAggregator
 
Lambda - Class in org.apache.crunch.lambda
Entry point for the crunch-lambda API.
Lambda() - Constructor for class org.apache.crunch.lambda.Lambda
 
LAST_N(int) - Static method in class org.apache.crunch.fn.Aggregators
Return the last n values (or fewer if there are fewer values than n).
LCollection<S> - Interface in org.apache.crunch.lambda
Java 8 friendly version of the PCollection interface, allowing distributed operations to be expressed in terms of lambda expressions and method references, instead of creating a new class implementation for each operation.
LCollectionFactory - Interface in org.apache.crunch.lambda
Factory for creating LCollection, LTable and LGroupedTable objects from their corresponding PCollection, PTable and PGroupedTable types.
LDoFn<S,T> - Interface in org.apache.crunch.lambda
A Java lambda-friendly version of the DoFn class.
LDoFnContext<S,T> - Interface in org.apache.crunch.lambda
Context object for implementing distributed operations in terms of Lambda expressions.
leftJoin(PTable<K, U>, PTable<K, V>) - Static method in class org.apache.crunch.lib.Join
Performs a left outer join on the specified PTables.
LeftOuterJoinFn<K,U,V> - Class in org.apache.crunch.lib.join
Used to perform the last step of a left outer join.
LeftOuterJoinFn(PType<K>, PType<U>) - Constructor for class org.apache.crunch.lib.join.LeftOuterJoinFn
 
length() - Method in class org.apache.crunch.impl.dist.collect.PCollectionImpl
 
length(PCollection<S>) - Static method in class org.apache.crunch.lib.Aggregate
Returns the number of elements in the provided PCollection.
length() - Method in interface org.apache.crunch.PCollection
Returns the number of elements represented by this PCollection.
LGroupedTable<K,V> - Interface in org.apache.crunch.lambda
Java 8 friendly version of the PGroupedTable interface, allowing distributed operations to be expressed in terms of lambda expressions and method references, instead of creating a new class implementation for each operation.
lineParser(String, Class<M>) - Static method in class org.apache.crunch.types.Protos
 
locale(Locale) - Method in class org.apache.crunch.contrib.text.TokenizerFactory.Builder
Sets the Locale to use with the TokenizerFactory returned by this Builder instance.
longs() - Static method in class org.apache.crunch.types.avro.Avros
 
longs() - Method in class org.apache.crunch.types.avro.AvroTypeFamily
 
longs() - Method in interface org.apache.crunch.types.PTypeFamily
 
longs() - Static method in class org.apache.crunch.types.writable.Writables
 
longs() - Method in class org.apache.crunch.types.writable.WritableTypeFamily
 
LTable<K,V> - Interface in org.apache.crunch.lambda
Java 8 friendly version of the PTable interface, allowing distributed operations to be expressed in terms of lambda expressions and method references, instead of creating a new class implementation for each operation.

M

main(String[]) - Static method in class org.apache.crunch.examples.AverageBytesByIP
 
main(String[]) - Static method in class org.apache.crunch.examples.SecondarySortExample
 
main(String[]) - Static method in class org.apache.crunch.examples.SortExample
 
main(String[]) - Static method in class org.apache.crunch.examples.TotalBytesByIP
 
main(String[]) - Static method in class org.apache.crunch.examples.TotalWordCount
 
main(String[]) - Static method in class org.apache.crunch.examples.WordAggregationHBase
 
main(String[]) - Static method in class org.apache.crunch.examples.WordCount
 
makeTuple(Object...) - Method in class org.apache.crunch.types.TupleFactory
 
map(R) - Method in class org.apache.crunch.fn.CompositeMapFn
 
map(V) - Method in class org.apache.crunch.fn.ExtractKeyFn
 
map(T) - Method in class org.apache.crunch.fn.IdentityFn
 
map(Pair<K, V>) - Method in class org.apache.crunch.fn.PairMapFn
 
map(T) - Method in class org.apache.crunch.fn.SDoubleFunction
 
map(T) - Method in class org.apache.crunch.fn.SFunction
 
map(Pair<K, V>) - Method in class org.apache.crunch.fn.SFunction2
 
map(T) - Method in class org.apache.crunch.fn.SPairFunction
 
map(Pair<V1, V2>) - Method in class org.apache.crunch.fn.SwapFn
 
map(SFunction<S, T>, PType<T>) - Method in interface org.apache.crunch.lambda.LCollection
Map the elements of this collection 1-1 through the supplied function.
map(SFunction<S, Pair<K, V>>, PTableType<K, V>) - Method in interface org.apache.crunch.lambda.LCollection
Map the elements of this collection 1-1 through the supplied function to yield an LTable
map(PTable<K1, V1>, Class<? extends Mapper<K1, V1, K2, V2>>, Class<K2>, Class<V2>) - Static method in class org.apache.crunch.lib.Mapred
 
map(PTable<K1, V1>, Class<? extends Mapper<K1, V1, K2, V2>>, Class<K2>, Class<V2>) - Static method in class org.apache.crunch.lib.Mapreduce
 
map(V) - Method in class org.apache.crunch.lib.sort.SortFns.AvroGenericFn
 
map(V) - Method in class org.apache.crunch.lib.sort.SortFns.SingleKeyFn
 
map(V) - Method in class org.apache.crunch.lib.sort.SortFns.TupleKeyFn
 
map(S) - Method in class org.apache.crunch.MapFn
Maps the given input into an instance of the output type.
map(Pair<Object, Iterable<Object>>) - Method in class org.apache.crunch.types.PGroupedTableType.PairIterableMapFn
 
MapDeepCopier<T> - Class in org.apache.crunch.types
 
MapDeepCopier(PType<T>) - Constructor for class org.apache.crunch.types.MapDeepCopier
 
MapFn<S,T> - Class in org.apache.crunch
A DoFn for the common case of emitting exactly one value for each input record.
MapFn() - Constructor for class org.apache.crunch.MapFn
 
MapFunction - Class in org.apache.crunch.impl.spark.fn
 
MapFunction(MapFn, SparkRuntimeContext) - Constructor for class org.apache.crunch.impl.spark.fn.MapFunction
 
mapKeys(MapFn<K, K2>, PType<K2>) - Method in class org.apache.crunch.impl.dist.collect.PTableBase
 
mapKeys(String, MapFn<K, K2>, PType<K2>) - Method in class org.apache.crunch.impl.dist.collect.PTableBase
 
mapKeys(SFunction<K, T>, PType<T>) - Method in interface org.apache.crunch.lambda.LTable
Transform the keys of this table using the given function
mapKeys(PTable<K1, V>, MapFn<K1, K2>, PType<K2>) - Static method in class org.apache.crunch.lib.PTables
Maps a PTable<K1, V> to a PTable<K2, V> using the given MapFn<K1, K2> on the keys of the PTable.
mapKeys(String, PTable<K1, V>, MapFn<K1, K2>, PType<K2>) - Static method in class org.apache.crunch.lib.PTables
Maps a PTable<K1, V> to a PTable<K2, V> using the given MapFn<K1, K2> on the keys of the PTable.
mapKeys(MapFn<K, K2>, PType<K2>) - Method in interface org.apache.crunch.PTable
Returns a PTable that has the same values as this instance, but uses the given function to map the keys.
mapKeys(String, MapFn<K, K2>, PType<K2>) - Method in interface org.apache.crunch.PTable
Returns a PTable that has the same values as this instance, but uses the given function to map the keys.
MapOutputFunction<K,V> - Class in org.apache.crunch.impl.spark.fn
 
MapOutputFunction(SerDe, SerDe) - Constructor for class org.apache.crunch.impl.spark.fn.MapOutputFunction
 
Mapred - Class in org.apache.crunch.lib
Static functions for working with legacy Mappers and Reducers that live under the org.apache.hadoop.mapred.* package as part of Crunch pipelines.
Mapred() - Constructor for class org.apache.crunch.lib.Mapred
 
Mapreduce - Class in org.apache.crunch.lib
Static functions for working with legacy Mappers and Reducers that live under the org.apache.hadoop.mapreduce.* package as part of Crunch pipelines.
Mapreduce() - Constructor for class org.apache.crunch.lib.Mapreduce
 
MapReduceTarget - Interface in org.apache.crunch.io
 
maps(PType<T>) - Static method in class org.apache.crunch.types.avro.Avros
 
maps(PType<T>) - Method in class org.apache.crunch.types.avro.AvroTypeFamily
 
maps(PType<T>) - Method in interface org.apache.crunch.types.PTypeFamily
 
maps(PType<T>) - Static method in class org.apache.crunch.types.writable.Writables
 
maps(PType<T>) - Method in class org.apache.crunch.types.writable.WritableTypeFamily
 
MapsideJoinStrategy<K,U,V> - Class in org.apache.crunch.lib.join
Utility for doing map side joins on a common key between two PTables.
MapsideJoinStrategy() - Constructor for class org.apache.crunch.lib.join.MapsideJoinStrategy
Deprecated.
Use the MapsideJoinStrategy.create() factory method instead
MapsideJoinStrategy(boolean) - Constructor for class org.apache.crunch.lib.join.MapsideJoinStrategy
Deprecated.
Use the MapsideJoinStrategy.create(boolean) factory method instead
mapValues(MapFn<Iterable<V>, U>, PType<U>) - Method in class org.apache.crunch.impl.dist.collect.BaseGroupedTable
 
mapValues(String, MapFn<Iterable<V>, U>, PType<U>) - Method in class org.apache.crunch.impl.dist.collect.BaseGroupedTable
 
mapValues(MapFn<V, U>, PType<U>) - Method in class org.apache.crunch.impl.dist.collect.PTableBase
 
mapValues(String, MapFn<V, U>, PType<U>) - Method in class org.apache.crunch.impl.dist.collect.PTableBase
 
mapValues(SFunction<Stream<V>, T>, PType<T>) - Method in interface org.apache.crunch.lambda.LGroupedTable
Map the values in this LGroupedTable using a custom function.
mapValues(SFunction<V, T>, PType<T>) - Method in interface org.apache.crunch.lambda.LTable
Transform the values of this table using the given function
mapValues(PTable<K, U>, MapFn<U, V>, PType<V>) - Static method in class org.apache.crunch.lib.PTables
Maps a PTable<K, U> to a PTable<K, V> using the given MapFn<U, V> on the values of the PTable.
mapValues(String, PTable<K, U>, MapFn<U, V>, PType<V>) - Static method in class org.apache.crunch.lib.PTables
Maps a PTable<K, U> to a PTable<K, V> using the given MapFn<U, V> on the values of the PTable.
mapValues(PGroupedTable<K, U>, MapFn<Iterable<U>, V>, PType<V>) - Static method in class org.apache.crunch.lib.PTables
An analogue of the mapValues function for PGroupedTable<K, U> collections.
mapValues(String, PGroupedTable<K, U>, MapFn<Iterable<U>, V>, PType<V>) - Static method in class org.apache.crunch.lib.PTables
An analogue of the mapValues function for PGroupedTable<K, U> collections.
mapValues(MapFn<Iterable<V>, U>, PType<U>) - Method in interface org.apache.crunch.PGroupedTable
Maps the Iterable<V> elements of each record to a new type.
mapValues(String, MapFn<Iterable<V>, U>, PType<U>) - Method in interface org.apache.crunch.PGroupedTable
Maps the Iterable<V> elements of each record to a new type.
mapValues(MapFn<V, U>, PType<U>) - Method in interface org.apache.crunch.PTable
Returns a PTable that has the same keys as this instance, but uses the given function to map the values.
mapValues(String, MapFn<V, U>, PType<U>) - Method in interface org.apache.crunch.PTable
Returns a PTable that has the same keys as this instance, but uses the given function to map the values.
markLogged() - Method in exception org.apache.crunch.CrunchRuntimeException
Indicate that this exception has been written to the debug logs.
materialize() - Method in class org.apache.crunch.impl.dist.collect.PCollectionImpl
 
materialize(PCollection<T>) - Method in class org.apache.crunch.impl.mem.MemPipeline
 
materialize(PCollection<T>) - Method in class org.apache.crunch.impl.mr.MRPipeline
 
materialize(PCollection<T>) - Method in class org.apache.crunch.impl.spark.SparkPipeline
 
materialize() - Method in interface org.apache.crunch.lambda.LCollection
Obtain the contents of this LCollection as a Stream that can be processed locally.
materialize() - Method in interface org.apache.crunch.PCollection
Returns a reference to the data set represented by this PCollection that may be used by the client to read the data locally.
materialize(PCollection<T>) - Method in interface org.apache.crunch.Pipeline
Create the given PCollection and read the data it contains into the returned Collection instance for client use.
materialize(PCollection<T>) - Method in class org.apache.crunch.util.CrunchTool
 
materializeAt(SourceTarget<S>) - Method in class org.apache.crunch.impl.dist.collect.PCollectionImpl
 
materializeToMap() - Method in class org.apache.crunch.impl.dist.collect.PTableBase
Returns a Map made up of the keys and values in this PTable.
materializeToMap() - Method in interface org.apache.crunch.PTable
Returns a Map made up of the keys and values in this PTable.
max() - Method in class org.apache.crunch.impl.dist.collect.PCollectionImpl
 
max(PCollection<S>) - Static method in class org.apache.crunch.lib.Aggregate
Returns the largest numerical element from the input collection.
max() - Method in interface org.apache.crunch.PCollection
Returns a PObject of the maximum element of this instance.
MAX_BIGDECIMALS() - Static method in class org.apache.crunch.fn.Aggregators
Return the maximum of all given BigDecimal values.
MAX_BIGDECIMALS(int) - Static method in class org.apache.crunch.fn.Aggregators
Return the n largest BigDecimal values (or fewer if there are fewer values than n).
MAX_BIGINTS() - Static method in class org.apache.crunch.fn.Aggregators
Return the maximum of all given BigInteger values.
MAX_BIGINTS(int) - Static method in class org.apache.crunch.fn.Aggregators
Return the n largest BigInteger values (or fewer if there are fewer values than n).
MAX_COMPARABLES() - Static method in class org.apache.crunch.fn.Aggregators
Return the maximum of all given Comparable values.
MAX_DOUBLES() - Static method in class org.apache.crunch.fn.Aggregators
Return the maximum of all given double values.
MAX_DOUBLES(int) - Static method in class org.apache.crunch.fn.Aggregators
Return the n largest double values (or fewer if there are fewer values than n).
MAX_FLOATS() - Static method in class org.apache.crunch.fn.Aggregators
Return the maximum of all given float values.
MAX_FLOATS(int) - Static method in class org.apache.crunch.fn.Aggregators
Return the n largest float values (or fewer if there are fewer values than n).
MAX_INTS() - Static method in class org.apache.crunch.fn.Aggregators
Return the maximum of all given int values.
MAX_INTS(int) - Static method in class org.apache.crunch.fn.Aggregators
Return the n largest int values (or fewer if there are fewer values than n).
MAX_LONGS() - Static method in class org.apache.crunch.fn.Aggregators
Return the maximum of all given long values.
MAX_LONGS(int) - Static method in class org.apache.crunch.fn.Aggregators
Return the n largest long values (or fewer if there are fewer values than n).
MAX_N(int, Class<V>) - Static method in class org.apache.crunch.fn.Aggregators
Return the n largest values (or fewer if there are fewer values than n).
MAX_REDUCERS - Static variable in class org.apache.crunch.util.PartitionUtils
Set an upper limit on the number of reducers the Crunch planner will set for an MR job when it tries to determine how many reducers to use based on the input size.
MAX_UNIQUE_N(int, Class<V>) - Static method in class org.apache.crunch.fn.Aggregators
Return the n largest unique values (or fewer if there are fewer values than n).
meanValue(PTable<K, V>) - Static method in class org.apache.crunch.lib.Average
Calculate the mean value by key for a table with numeric values.
MemPipeline - Class in org.apache.crunch.impl.mem
 
min() - Method in class org.apache.crunch.impl.dist.collect.PCollectionImpl
 
min(PCollection<S>) - Static method in class org.apache.crunch.lib.Aggregate
Returns the smallest numerical element from the input collection.
min() - Method in interface org.apache.crunch.PCollection
Returns a PObject of the minimum element of this instance.
MIN_BIGDECIMALS() - Static method in class org.apache.crunch.fn.Aggregators
Return the minimum of all given BigDecimal values.
MIN_BIGDECIMALS(int) - Static method in class org.apache.crunch.fn.Aggregators
Return the n smallest BigDecimal values (or fewer if there are fewer values than n).
MIN_BIGINTS() - Static method in class org.apache.crunch.fn.Aggregators
Return the minimum of all given BigInteger values.
MIN_BIGINTS(int) - Static method in class org.apache.crunch.fn.Aggregators
Return the n smallest BigInteger values (or fewer if there are fewer values than n).
MIN_COMPARABLES() - Static method in class org.apache.crunch.fn.Aggregators
Return the minimum of all given Comparable values.
MIN_DOUBLES() - Static method in class org.apache.crunch.fn.Aggregators
Return the minimum of all given double values.
MIN_DOUBLES(int) - Static method in class org.apache.crunch.fn.Aggregators
Return the n smallest double values (or fewer if there are fewer values than n).
MIN_FLOATS() - Static method in class org.apache.crunch.fn.Aggregators
Return the minimum of all given float values.
MIN_FLOATS(int) - Static method in class org.apache.crunch.fn.Aggregators
Return the n smallest float values (or fewer if there are fewer values than n).
MIN_INTS() - Static method in class org.apache.crunch.fn.Aggregators
Return the minimum of all given int values.
MIN_INTS(int) - Static method in class org.apache.crunch.fn.Aggregators
Return the n smallest int values (or fewer if there are fewer values than n).
MIN_LONGS() - Static method in class org.apache.crunch.fn.Aggregators
Return the minimum of all given long values.
MIN_LONGS(int) - Static method in class org.apache.crunch.fn.Aggregators
Return the n smallest long values (or fewer if there are fewer values than n).
MIN_N(int, Class<V>) - Static method in class org.apache.crunch.fn.Aggregators
Return the n smallest values (or fewer if there are fewer values than n).
MIN_UNIQUE_N(int, Class<V>) - Static method in class org.apache.crunch.fn.Aggregators
Returns the n smallest unique values (or fewer if there are fewer unique values than n).
MRCollection - Interface in org.apache.crunch.impl.dist.collect
 
MRJob - Interface in org.apache.crunch.impl.mr
A Hadoop MapReduce job managed by Crunch.
MRJob.State - Enum in org.apache.crunch.impl.mr
A job will be in one of the following states.
MRPipeline - Class in org.apache.crunch.impl.mr
Pipeline implementation that is executed within Hadoop MapReduce.
MRPipeline(Class<?>) - Constructor for class org.apache.crunch.impl.mr.MRPipeline
Instantiate with a default Configuration and name.
MRPipeline(Class<?>, String) - Constructor for class org.apache.crunch.impl.mr.MRPipeline
Instantiate with a custom pipeline name.
MRPipeline(Class<?>, Configuration) - Constructor for class org.apache.crunch.impl.mr.MRPipeline
Instantiate with a custom configuration and default naming.
MRPipeline(Class<?>, String, Configuration) - Constructor for class org.apache.crunch.impl.mr.MRPipeline
Instantiate with a custom name and configuration.
MRPipelineExecution - Interface in org.apache.crunch.impl.mr
 

N

name - Variable in class org.apache.crunch.contrib.io.jdbc.IdentifiableName
 
name(String) - Static method in class org.apache.crunch.CreateOptions
 
name - Variable in class org.apache.crunch.test.Employee
Deprecated.
name - Variable in class org.apache.crunch.test.Person
Deprecated.
nameAndParallelism(String, int) - Static method in class org.apache.crunch.CreateOptions
 
named(String) - Method in class org.apache.crunch.PipelineCallable
Use the given name to identify this instance in the logs.
namedTuples(String, String[], PType[]) - Static method in class org.apache.crunch.types.avro.Avros
 
negateCounts(PTable<K, Long>) - Static method in class org.apache.crunch.lib.TopList
When creating toplists, it is often required to sort by count descending.
newBuilder() - Static method in class org.apache.crunch.kafka.offset.hdfs.Offsets.Builder
Creates a new Builder instance.
newBuilder() - Static method in class org.apache.crunch.kafka.offset.hdfs.Offsets.PartitionOffset.Builder
Creates a new builder instance.
newBuilder() - Static method in class org.apache.crunch.test.Employee
Creates a new Employee RecordBuilder
newBuilder(Employee.Builder) - Static method in class org.apache.crunch.test.Employee
Creates a new Employee RecordBuilder by copying an existing Builder
newBuilder(Employee) - Static method in class org.apache.crunch.test.Employee
Creates a new Employee RecordBuilder by copying an existing Employee instance
newBuilder() - Static method in class org.apache.crunch.test.Person
Creates a new Person RecordBuilder
newBuilder(Person.Builder) - Static method in class org.apache.crunch.test.Person
Creates a new Person RecordBuilder by copying an existing Builder
newBuilder(Person) - Static method in class org.apache.crunch.test.Person
Creates a new Person RecordBuilder by copying an existing Person instance
newReader(Schema) - Static method in class org.apache.crunch.types.avro.Avros
 
newReader(AvroType<T>) - Static method in class org.apache.crunch.types.avro.Avros
 
newWriter(Schema) - Static method in class org.apache.crunch.types.avro.Avros
 
newWriter(AvroType<T>) - Static method in class org.apache.crunch.types.avro.Avros
 
next() - Method in class org.apache.crunch.contrib.text.Tokenizer
Advance this Tokenizer and return the next String from the Scanner.
next() - Method in class org.apache.crunch.util.DoFnIterator
 
nextBoolean() - Method in class org.apache.crunch.contrib.text.Tokenizer
Advance this Tokenizer and return the next Boolean from the Scanner.
nextDouble() - Method in class org.apache.crunch.contrib.text.Tokenizer
Advance this Tokenizer and return the next Double from the Scanner.
nextFloat() - Method in class org.apache.crunch.contrib.text.Tokenizer
Advance this Tokenizer and return the next Float from the Scanner.
nextInt() - Method in class org.apache.crunch.contrib.text.Tokenizer
Advance this Tokenizer and return the next Integer from the Scanner.
nextKeyValue() - Method in class org.apache.crunch.kafka.inputformat.KafkaRecordReader
 
nextLong() - Method in class org.apache.crunch.contrib.text.Tokenizer
Advance this Tokenizer and return the next Long from the Scanner.
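The next*() methods above are described as returning typed values "from the Scanner". As a rough illustration of that pattern, the sketch below reads typed fields from one delimited line using java.util.Scanner directly. It does not use the Crunch Tokenizer class; the ScannerTokenSketch and Row names, the comma delimiter, and the sample line are assumptions made for this example only.

```java
import java.util.Locale;
import java.util.Scanner;

final class ScannerTokenSketch {

    // Hypothetical record type used only for this illustration.
    static final class Row {
        final String name;
        final int age;
        final double score;
        final boolean active;

        Row(String name, int age, double score, boolean active) {
            this.name = name;
            this.age = age;
            this.score = score;
            this.active = active;
        }
    }

    // Reads four typed fields, in order, from a single delimited line.
    static Row parse(String line, String delimiter) {
        try (Scanner scanner = new Scanner(line)) {
            scanner.useDelimiter(delimiter); // the delimiter is treated as a regex
            scanner.useLocale(Locale.US);    // make "3.5" parse the same on every machine
            String name = scanner.next();
            int age = scanner.nextInt();
            double score = scanner.nextDouble();
            boolean active = scanner.nextBoolean();
            return new Row(name, age, score, active);
        }
    }

    public static void main(String[] args) {
        Row row = parse("alice,42,3.5,true", ",");
        System.out.println(row.name + " " + row.age + " " + row.score + " " + row.active);
    }
}
```

Each call advances the scanner by one token, so the order of the calls has to match the order of the fields in the line.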
none() - Static method in class org.apache.crunch.CreateOptions
 
NoOpDeepCopier<T> - Class in org.apache.crunch.types
A DeepCopier that does nothing, and just returns the input value without copying anything.
not(FilterFn<S>) - Static method in class org.apache.crunch.fn.FilterFns
Accept an entry if the given filter does not accept it.
nulls() - Static method in class org.apache.crunch.types.avro.Avros
 
nulls() - Method in class org.apache.crunch.types.avro.AvroTypeFamily
 
nulls() - Method in interface org.apache.crunch.types.PTypeFamily
 
nulls() - Static method in class org.apache.crunch.types.writable.Writables
 
nulls() - Method in class org.apache.crunch.types.writable.WritableTypeFamily
 
numPartitions() - Method in class org.apache.crunch.impl.spark.SparkPartitioner
 
numReducers(int) - Method in class org.apache.crunch.GroupingOptions.Builder
 

O

of(T, U) - Static method in class org.apache.crunch.Pair
 
of(A, B, C) - Static method in class org.apache.crunch.Tuple3
 
of(A, B, C, D) - Static method in class org.apache.crunch.Tuple4
 
of(Object...) - Static method in class org.apache.crunch.TupleN
 
OffsetReader - Interface in org.apache.crunch.kafka.offset
Reader API that supports reading offset information from an underlying storage mechanism.
Offsets - Class in org.apache.crunch.kafka.offset.hdfs
Simple object to represent a collection of Kafka Topic and Partition offset information to make storing this information easier.
Offsets.Builder - Class in org.apache.crunch.kafka.offset.hdfs
Builder for the Offsets.
Offsets.PartitionOffset - Class in org.apache.crunch.kafka.offset.hdfs
Simple object that represents a specific topic, partition, and its offset value.
Offsets.PartitionOffset.Builder - Class in org.apache.crunch.kafka.offset.hdfs
OffsetWriter - Interface in org.apache.crunch.kafka.offset
Writer for persisting offset information.
OneToManyJoin - Class in org.apache.crunch.lib.join
Optimized join for situations where exactly one value is being joined with any other number of values based on a common key.
OneToManyJoin() - Constructor for class org.apache.crunch.lib.join.OneToManyJoin
 
oneToManyJoin(PTable<K, U>, PTable<K, V>, DoFn<Pair<U, Iterable<V>>, T>, PType<T>) - Static method in class org.apache.crunch.lib.join.OneToManyJoin
Performs a join on two tables, where the left table only contains a single value per key.
oneToManyJoin(PTable<K, U>, PTable<K, V>, DoFn<Pair<U, Iterable<V>>, T>, PType<T>, int) - Static method in class org.apache.crunch.lib.join.OneToManyJoin
Supports a user-specified number of reducers for the one-to-many join.
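The shape of a one-to-many join, where the left side contributes a single value per key and the right side any number of values, can be sketched with in-memory maps. This is a plain-Java illustration and not the Crunch implementation: OneToManyJoinSketch, the Map-based inputs, and the BiFunction standing in for the DoFn are all assumptions. The sketch also calls the function for keys that have no right-side values, which is a choice made here and not something this index states about Crunch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BiFunction;

final class OneToManyJoinSketch {

    // Hypothetical in-memory stand-in: 'left' holds exactly one value per key,
    // 'right' holds any number of values per key.
    static <K, U, V, T> List<T> oneToManyJoin(
            Map<K, U> left,
            Map<K, List<V>> right,
            BiFunction<U, Iterable<V>, T> joinFn) {
        List<T> out = new ArrayList<>();
        for (Map.Entry<K, U> entry : left.entrySet()) {
            // Assumption of this sketch: a key with no right-side values
            // is joined against an empty list.
            List<V> matches = right.getOrDefault(entry.getKey(), Collections.emptyList());
            out.add(joinFn.apply(entry.getValue(), matches));
        }
        return out;
    }

    public static void main(String[] args) {
        Map<Integer, String> departments = new LinkedHashMap<>();
        departments.put(1, "eng");
        departments.put(2, "ops");

        Map<Integer, List<String>> employees = new LinkedHashMap<>();
        employees.put(1, Arrays.asList("ann", "bob"));

        List<String> joined = oneToManyJoin(departments, employees,
                (dept, names) -> dept + " -> " + names);
        joined.forEach(System.out::println);
    }
}
```

Because the left value is known to be unique per key, the function sees it once together with all matching right values, and never needs to buffer multiple left values.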
or(FilterFn<S>, FilterFn<S>) - Static method in class org.apache.crunch.fn.FilterFns
Accept an entry if at least one of the given filters accepts it, using short-circuit evaluation.
or(FilterFn<S>...) - Static method in class org.apache.crunch.fn.FilterFns
Accept an entry if at least one of the given filters accepts it, using short-circuit evaluation.
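Short-circuit evaluation for an "or" combinator means that filters after the first accepting one are never called. The sketch below shows the idea with java.util.function.Predicate in place of FilterFn; the FilterCombinatorSketch class and its or and not helpers are hypothetical and are not the Crunch FilterFns methods.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

final class FilterCombinatorSketch {

    // Hypothetical helper (not part of Crunch): accepts a value as soon as
    // any one of the given predicates accepts it.
    @SafeVarargs
    static <S> Predicate<S> or(Predicate<S>... filters) {
        List<Predicate<S>> copy = Arrays.asList(filters.clone());
        return value -> {
            for (Predicate<S> filter : copy) {
                if (filter.test(value)) {
                    return true; // short-circuit: later filters are not evaluated
                }
            }
            return false;
        };
    }

    // Hypothetical helper: accepts a value only if the given predicate rejects it.
    static <S> Predicate<S> not(Predicate<S> filter) {
        return value -> !filter.test(value);
    }

    public static void main(String[] args) {
        Predicate<String> isEmpty = String::isEmpty;
        Predicate<String> isComment = s -> s.startsWith("#");
        Predicate<String> skip = or(isEmpty, isComment);

        System.out.println(skip.test("# comment"));
        System.out.println(not(skip).test("data"));
    }
}
```

Ordering the cheapest or most frequently accepting filter first lets the combinator return early more often.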
Orcs - Class in org.apache.crunch.types.orc
Utilities to create PTypes for ORC serialization / deserialization
Orcs() - Constructor for class org.apache.crunch.types.orc.Orcs
 
orcs(TypeInfo) - Static method in class org.apache.crunch.types.orc.Orcs
Create a PType to directly use OrcStruct as the deserialized format.
OrcUtils - Class in org.apache.crunch.types.orc
 
OrcUtils() - Constructor for class org.apache.crunch.types.orc.OrcUtils
 
order() - Method in class org.apache.crunch.lib.Sort.ColumnOrder
 
org.apache.crunch - package org.apache.crunch
Client-facing API and core abstractions.
org.apache.crunch.contrib - package org.apache.crunch.contrib
User contributions that may be interesting for special applications.
org.apache.crunch.contrib.bloomfilter - package org.apache.crunch.contrib.bloomfilter
Support for creating Bloom Filters.
org.apache.crunch.contrib.io.jdbc - package org.apache.crunch.contrib.io.jdbc
Support for reading data from RDBMS using JDBC
org.apache.crunch.contrib.text - package org.apache.crunch.contrib.text
 
org.apache.crunch.examples - package org.apache.crunch.examples
Example applications demonstrating various aspects of Crunch.
org.apache.crunch.fn - package org.apache.crunch.fn
Commonly used functions for manipulating collections.
org.apache.crunch.impl - package org.apache.crunch.impl
 
org.apache.crunch.impl.dist - package org.apache.crunch.impl.dist
 
org.apache.crunch.impl.dist.collect - package org.apache.crunch.impl.dist.collect
 
org.apache.crunch.impl.mem - package org.apache.crunch.impl.mem
In-memory Pipeline implementation for rapid prototyping and testing.
org.apache.crunch.impl.mr - package org.apache.crunch.impl.mr
A Pipeline implementation that runs on Hadoop MapReduce.
org.apache.crunch.impl.spark - package org.apache.crunch.impl.spark
 
org.apache.crunch.impl.spark.collect - package org.apache.crunch.impl.spark.collect
 
org.apache.crunch.impl.spark.fn - package org.apache.crunch.impl.spark.fn
 
org.apache.crunch.impl.spark.serde - package org.apache.crunch.impl.spark.serde
 
org.apache.crunch.io - package org.apache.crunch.io
Data input and output for Pipelines.
org.apache.crunch.kafka - package org.apache.crunch.kafka
 
org.apache.crunch.kafka.inputformat - package org.apache.crunch.kafka.inputformat
 
org.apache.crunch.kafka.offset - package org.apache.crunch.kafka.offset
 
org.apache.crunch.kafka.offset.hdfs - package org.apache.crunch.kafka.offset.hdfs
 
org.apache.crunch.lambda - package org.apache.crunch.lambda
Alternative Crunch API using Java 8 features to allow construction of pipelines using lambda functions and method references.
org.apache.crunch.lambda.fn - package org.apache.crunch.lambda.fn
Serializable versions of the functional interfaces that ship with Java 8
org.apache.crunch.lib - package org.apache.crunch.lib
Joining, sorting, aggregating, and other commonly used functionality.
org.apache.crunch.lib.join - package org.apache.crunch.lib.join
Inner and outer joins on collections.
org.apache.crunch.lib.sort - package org.apache.crunch.lib.sort
 
org.apache.crunch.test - package org.apache.crunch.test
Utilities for testing Crunch-based applications.
org.apache.crunch.types - package org.apache.crunch.types
Common functionality for business object serialization.
org.apache.crunch.types.avro - package org.apache.crunch.types.avro
Business object serialization using Apache Avro.
org.apache.crunch.types.orc - package org.apache.crunch.types.orc
 
org.apache.crunch.types.writable - package org.apache.crunch.types.writable
Business object serialization using Hadoop's Writables framework.
org.apache.crunch.util - package org.apache.crunch.util
An assorted set of utilities.
org.apache.hadoop.mapred - package org.apache.hadoop.mapred
 
outputConf(String, String) - Method in interface org.apache.crunch.Target
Adds the given key-value pair to the Configuration instance that is used to write this Target.
OutputConfig(FormatBundle<OutputFormat<K, V>>, Class<K>, Class<V>) - Constructor for class org.apache.crunch.io.CrunchOutputs.OutputConfig
 
OutputConverterFunction<K,V,S> - Class in org.apache.crunch.impl.spark.fn
 
OutputConverterFunction(Converter<K, V, S, ?>) - Constructor for class org.apache.crunch.impl.spark.fn.OutputConverterFunction
 
OutputHandler - Interface in org.apache.crunch.io
 
outputKey(S) - Method in interface org.apache.crunch.types.Converter
 
outputValue(S) - Method in interface org.apache.crunch.types.Converter
 
override(ReaderWriterFactory) - Method in class org.apache.crunch.types.avro.AvroMode
overridePathProperties(Configuration) - Method in class org.apache.crunch.test.TemporaryPath
Set all keys specified in the constructor to temporary directories.

P

Pair<K,V> - Class in org.apache.crunch
A convenience class for two-element Tuples.
Pair(K, V) - Constructor for class org.apache.crunch.Pair
 
PAIR - Static variable in class org.apache.crunch.types.TupleFactory
 
pair2tupleFunc() - Static method in class org.apache.crunch.impl.spark.GuavaUtils
 
pairAggregator(Aggregator<V1>, Aggregator<V2>) - Static method in class org.apache.crunch.fn.Aggregators
Apply separate aggregators to each component of a Pair.
PairFlatMapDoFn<T,K,V> - Class in org.apache.crunch.impl.spark.fn
 
PairFlatMapDoFn(DoFn<T, Pair<K, V>>, SparkRuntimeContext) - Constructor for class org.apache.crunch.impl.spark.fn.PairFlatMapDoFn
 
PairIterable(Iterable<S>, Iterable<T>) - Constructor for class org.apache.crunch.util.Tuples.PairIterable
 
PairIterableMapFn(MapFn<Object, K>, MapFn<Object, V>) - Constructor for class org.apache.crunch.types.PGroupedTableType.PairIterableMapFn
 
PairMapFn<K,V,S,T> - Class in org.apache.crunch.fn
 
PairMapFn(MapFn<K, S>, MapFn<V, T>) - Constructor for class org.apache.crunch.fn.PairMapFn
 
PairMapFunction<K,V,S> - Class in org.apache.crunch.impl.spark.fn
 
PairMapFunction(MapFn<Pair<K, V>, S>, SparkRuntimeContext) - Constructor for class org.apache.crunch.impl.spark.fn.PairMapFunction
 
PairMapIterableFunction<K,V,S,T> - Class in org.apache.crunch.impl.spark.fn
 
PairMapIterableFunction(MapFn<Pair<K, List<V>>, Pair<S, Iterable<T>>>, SparkRuntimeContext) - Constructor for class org.apache.crunch.impl.spark.fn.PairMapIterableFunction
 
pairs(PType<V1>, PType<V2>) - Static method in class org.apache.crunch.types.avro.Avros
 
pairs(PType<V1>, PType<V2>) - Method in class org.apache.crunch.types.avro.AvroTypeFamily
 
pairs(PType<V1>, PType<V2>) - Method in interface org.apache.crunch.types.PTypeFamily
 
pairs(PType<V1>, PType<V2>) - Static method in class org.apache.crunch.types.writable.Writables
 
pairs(PType<V1>, PType<V2>) - Method in class org.apache.crunch.types.writable.WritableTypeFamily
 
PairValueComparator(boolean) - Constructor for class org.apache.crunch.lib.Aggregate.PairValueComparator
 
parallelDo(DoFn<S, T>, PType<T>) - Method in class org.apache.crunch.impl.dist.collect.PCollectionImpl
 
parallelDo(String, DoFn<S, T>, PType<T>) - Method in class org.apache.crunch.impl.dist.collect.PCollectionImpl
 
parallelDo(String, DoFn<S, T>, PType<T>, ParallelDoOptions) - Method in class org.apache.crunch.impl.dist.collect.PCollectionImpl
 
parallelDo(DoFn<S, Pair<K, V>>, PTableType<K, V>) - Method in class org.apache.crunch.impl.dist.collect.PCollectionImpl
 
parallelDo(String, DoFn<S, Pair<K, V>>, PTableType<K, V>) - Method in class org.apache.crunch.impl.dist.collect.PCollectionImpl
 
parallelDo(String, DoFn<S, Pair<K, V>>, PTableType<K, V>, ParallelDoOptions) - Method in class org.apache.crunch.impl.dist.collect.PCollectionImpl
 
parallelDo(DoFn<S, T>, PType<T>) - Method in interface org.apache.crunch.lambda.LCollection
Transform this LCollection using a standard Crunch DoFn
parallelDo(DoFn<S, Pair<K, V>>, PTableType<K, V>) - Method in interface org.apache.crunch.lambda.LCollection
Transform this LCollection to an LTable using a standard Crunch DoFn
parallelDo(LDoFn<S, T>, PType<T>) - Method in interface org.apache.crunch.lambda.LCollection
Transform this LCollection using a Lambda-friendly LDoFn.
parallelDo(LDoFn<S, Pair<K, V>>, PTableType<K, V>) - Method in interface org.apache.crunch.lambda.LCollection
Transform this LCollection using a Lambda-friendly LDoFn.
parallelDo(DoFn<S, T>, PType<T>) - Method in interface org.apache.crunch.PCollection
Applies the given doFn to the elements of this PCollection and returns a new PCollection that is the output of this processing.
parallelDo(String, DoFn<S, T>, PType<T>) - Method in interface org.apache.crunch.PCollection
Applies the given doFn to the elements of this PCollection and returns a new PCollection that is the output of this processing.
parallelDo(String, DoFn<S, T>, PType<T>, ParallelDoOptions) - Method in interface org.apache.crunch.PCollection
Applies the given doFn to the elements of this PCollection and returns a new PCollection that is the output of this processing.
parallelDo(DoFn<S, Pair<K, V>>, PTableType<K, V>) - Method in interface org.apache.crunch.PCollection
Similar to the other parallelDo instance, but returns a PTable instance instead of a PCollection.
parallelDo(String, DoFn<S, Pair<K, V>>, PTableType<K, V>) - Method in interface org.apache.crunch.PCollection
Similar to the other parallelDo instance, but returns a PTable instance instead of a PCollection.
parallelDo(String, DoFn<S, Pair<K, V>>, PTableType<K, V>, ParallelDoOptions) - Method in interface org.apache.crunch.PCollection
Similar to the other parallelDo instance, but returns a PTable instance instead of a PCollection.
ParallelDoOptions - Class in org.apache.crunch
Container class that includes optional information about a parallelDo operation applied to a PCollection.
ParallelDoOptions.Builder - Class in org.apache.crunch
 
parallelism(int) - Static method in class org.apache.crunch.CreateOptions
 
Parse - Class in org.apache.crunch.contrib.text
Methods for parsing instances of PCollection<String> into PCollections of strongly typed tuples.
parse(String, PCollection<String>, Extractor<T>) - Static method in class org.apache.crunch.contrib.text.Parse
Parses the lines of the input PCollection<String> and returns a PCollection<T> using the given Extractor<T>.
parse(String, PCollection<String>, PTypeFamily, Extractor<T>) - Static method in class org.apache.crunch.contrib.text.Parse
Parses the lines of the input PCollection<String> and returns a PCollection<T> using the given Extractor<T> that uses the given PTypeFamily.
parseTable(String, PCollection<String>, Extractor<Pair<K, V>>) - Static method in class org.apache.crunch.contrib.text.Parse
Parses the lines of the input PCollection<String> and returns a PTable<K, V> using the given Extractor<Pair<K, V>>.
parseTable(String, PCollection<String>, PTypeFamily, Extractor<Pair<K, V>>) - Static method in class org.apache.crunch.contrib.text.Parse
Parses the lines of the input PCollection<String> and returns a PTable<K, V> using the given Extractor<Pair<K, V>> that uses the given PTypeFamily.
partition - Variable in class org.apache.crunch.impl.spark.IntByteArray
 
PartitionedMapOutputFunction<K,V> - Class in org.apache.crunch.impl.spark.fn
 
PartitionedMapOutputFunction(SerDe<K>, SerDe<V>, PGroupedTableType<K, V>, int, GroupingOptions, SparkRuntimeContext) - Constructor for class org.apache.crunch.impl.spark.fn.PartitionedMapOutputFunction
 
PARTITIONER_PATH - Static variable in class org.apache.crunch.lib.sort.TotalOrderPartitioner
 
partitionerClass(Class<? extends Partitioner>) - Method in class org.apache.crunch.GroupingOptions.Builder
 
PartitionUtils - Class in org.apache.crunch.util
Helper functions and settings for determining the number of reducers to use in a pipeline job created by the Crunch planner.
PartitionUtils() - Constructor for class org.apache.crunch.util.PartitionUtils
 
PathTarget - Interface in org.apache.crunch.io
A target whose output goes to a given path on a file system.
PCollection<S> - Interface in org.apache.crunch
A representation of an immutable, distributed collection of elements that is the fundamental target of computations in Crunch.
PCollectionFactory - Interface in org.apache.crunch.impl.dist.collect
 
PCollectionImpl<S> - Class in org.apache.crunch.impl.dist.collect
 
PCollectionImpl(String, DistributedPipeline) - Constructor for class org.apache.crunch.impl.dist.collect.PCollectionImpl
 
PCollectionImpl(String, DistributedPipeline, ParallelDoOptions) - Constructor for class org.apache.crunch.impl.dist.collect.PCollectionImpl
 
PCollectionImpl.Visitor - Interface in org.apache.crunch.impl.dist.collect
 
PERSIST_TIME_FORMAT - Static variable in class org.apache.crunch.kafka.offset.hdfs.HDFSOffsetWriter
Custom formatter for translating the times into valid file names.
persistenceTimeToFileName(long) - Static method in class org.apache.crunch.kafka.offset.hdfs.HDFSOffsetWriter
Converts a persistedTime into a file name for persisting the offsets.
Person - Class in org.apache.crunch.test
 
Person() - Constructor for class org.apache.crunch.test.Person
Default constructor.
Person(CharSequence, Integer, List<CharSequence>) - Constructor for class org.apache.crunch.test.Person
All-args constructor.
Person.Builder - Class in org.apache.crunch.test
RecordBuilder for Person instances.
PGroupedTable<K,V> - Interface in org.apache.crunch
The Crunch representation of a grouped PTable, which corresponds to the output of the shuffle phase of a MapReduce job.
PGroupedTableImpl<K,V> - Class in org.apache.crunch.impl.spark.collect
 
PGroupedTableType<K,V> - Class in org.apache.crunch.types
The PType instance for PGroupedTable instances.
PGroupedTableType(PTableType<K, V>) - Constructor for class org.apache.crunch.types.PGroupedTableType
 
PGroupedTableType.PairIterableMapFn<K,V> - Class in org.apache.crunch.types
 
Pipeline - Interface in org.apache.crunch
Manages the state of a pipeline execution.
PipelineCallable<Output> - Class in org.apache.crunch
A specialization of Callable that executes some sequential logic on the client machine as part of an overall Crunch pipeline in order to generate zero or more outputs, some of which may be PCollection instances that are processed by other jobs in the pipeline.
PipelineCallable() - Constructor for class org.apache.crunch.PipelineCallable
 
PipelineCallable.Status - Enum in org.apache.crunch
 
PipelineExecution - Interface in org.apache.crunch
A handle to allow clients to control a Crunch pipeline as it runs.
PipelineExecution.Status - Enum in org.apache.crunch
 
PipelineResult - Class in org.apache.crunch
Container for the results of a call to run or done on the Pipeline interface that includes details and statistics about the component stages of the data pipeline.
PipelineResult(List<PipelineResult.StageResult>, PipelineExecution.Status) - Constructor for class org.apache.crunch.PipelineResult
 
PipelineResult.StageResult - Class in org.apache.crunch
 
plan() - Method in class org.apache.crunch.impl.mr.MRPipeline
 
PObject<T> - Interface in org.apache.crunch
A PObject represents a singleton object value that results from a distributed computation.
process(S, Emitter<Pair<String, BloomFilter>>) - Method in class org.apache.crunch.contrib.bloomfilter.BloomFilterFn
 
process(S, Emitter<T>) - Method in class org.apache.crunch.DoFn
Processes the records from a PCollection.
process(T, Emitter<T>) - Method in class org.apache.crunch.FilterFn
 
process(T, Emitter<Double>) - Method in class org.apache.crunch.fn.SDoubleFlatMapFunction
 
process(T, Emitter<R>) - Method in class org.apache.crunch.fn.SFlatMapFunction
 
process(Pair<K, V>, Emitter<R>) - Method in class org.apache.crunch.fn.SFlatMapFunction2
 
process(T, Emitter<Pair<K, V>>) - Method in class org.apache.crunch.fn.SPairFlatMapFunction
 
process(LDoFnContext<S, T>) - Method in interface org.apache.crunch.lambda.LDoFn
 
process(Pair<Integer, Iterable<Pair<K, V>>>, Emitter<Pair<Integer, Pair<K, V>>>) - Method in class org.apache.crunch.lib.Aggregate.TopKCombineFn
 
process(Pair<K, V>, Emitter<Pair<Integer, Pair<K, V>>>) - Method in class org.apache.crunch.lib.Aggregate.TopKFn
 
process(Pair<Pair<K, Integer>, Iterable<Pair<U, V>>>, Emitter<Pair<K, Pair<U, V>>>) - Method in class org.apache.crunch.lib.join.JoinFn
Split up the input record to make coding a bit more manageable.
process(S, Emitter<T>) - Method in class org.apache.crunch.MapFn
 
Protos - Class in org.apache.crunch.types
Utility functions for working with protocol buffers in Crunch.
Protos() - Constructor for class org.apache.crunch.types.Protos
 
protos(Class<T>, PTypeFamily) - Static method in class org.apache.crunch.types.PTypes
Constructs a PType for the given protocol buffer.
protos(Class<T>, PTypeFamily, SerializableSupplier<ExtensionRegistry>) - Static method in class org.apache.crunch.types.PTypes
Constructs a PType for a protocol buffer, using the given SerializableSupplier to provide an ExtensionRegistry to use in reading the given protobuf.
PTable<K,V> - Interface in org.apache.crunch
A sub-interface of PCollection that represents an immutable, distributed multi-map of keys and values.
PTableBase<K,V> - Class in org.apache.crunch.impl.dist.collect
 
PTableBase(String, DistributedPipeline) - Constructor for class org.apache.crunch.impl.dist.collect.PTableBase
 
PTableBase(String, DistributedPipeline, ParallelDoOptions) - Constructor for class org.apache.crunch.impl.dist.collect.PTableBase
 
PTables - Class in org.apache.crunch.lib
Methods for performing common operations on PTables.
PTables() - Constructor for class org.apache.crunch.lib.PTables
 
PTableType<K,V> - Interface in org.apache.crunch.types
An extension of PType specifically for PTable objects.
ptf() - Method in interface org.apache.crunch.lambda.LCollection
Get the PTypeFamily representing how elements of this collection may be serialized.
ptype(PType<Pair<V1, V2>>) - Static method in class org.apache.crunch.fn.SwapFn
 
pType() - Method in interface org.apache.crunch.lambda.LCollection
Get the PType representing how elements of this collection may be serialized.
pType() - Method in interface org.apache.crunch.lambda.LTable
Get the underlying PTableType used to serialize key/value pairs in this table
pType(PType<V>) - Static method in class org.apache.crunch.lib.Quantiles.Result
Create a PType for the result type, to be stored as a derived type from Crunch primitives
PType<T> - Interface in org.apache.crunch.types
A PType defines a mapping between a data type that is used in a Crunch pipeline and a serialization and storage format that is used to read/write data from/to HDFS.
PTypeFamily - Interface in org.apache.crunch.types
An abstract factory for creating PType instances that have the same serialization/storage backing format.
PTypes - Class in org.apache.crunch.types
Utility functions for creating common types of derived PTypes, e.g., for JSON data, protocol buffers, and Thrift records.
PTypes() - Constructor for class org.apache.crunch.types.PTypes
 
PTypeUtils - Class in org.apache.crunch.types
Utilities for converting between PTypes from different PTypeFamily implementations.
put(int, Object) - Method in class org.apache.crunch.test.Employee
 
put(int, Object) - Method in class org.apache.crunch.test.Person
 

Q

quadAggregator(Aggregator<V1>, Aggregator<V2>, Aggregator<V3>, Aggregator<V4>) - Static method in class org.apache.crunch.fn.Aggregators
Apply separate aggregators to each component of a Tuple4.
QuadIterable(Iterable<A>, Iterable<B>, Iterable<C>, Iterable<D>) - Constructor for class org.apache.crunch.util.Tuples.QuadIterable
 
quads(PType<V1>, PType<V2>, PType<V3>, PType<V4>) - Static method in class org.apache.crunch.types.avro.Avros
 
quads(PType<V1>, PType<V2>, PType<V3>, PType<V4>) - Method in class org.apache.crunch.types.avro.AvroTypeFamily
 
quads(PType<V1>, PType<V2>, PType<V3>, PType<V4>) - Method in interface org.apache.crunch.types.PTypeFamily
 
quads(PType<V1>, PType<V2>, PType<V3>, PType<V4>) - Static method in class org.apache.crunch.types.writable.Writables
 
quads(PType<V1>, PType<V2>, PType<V3>, PType<V4>) - Method in class org.apache.crunch.types.writable.WritableTypeFamily
 
Quantiles - Class in org.apache.crunch.lib
 
Quantiles() - Constructor for class org.apache.crunch.lib.Quantiles
 
quantiles - Variable in class org.apache.crunch.lib.Quantiles.Result
 
Quantiles.Result<V> - Class in org.apache.crunch.lib
Output type for storing the results of a Quantiles computation

R

read(Source<S>) - Method in class org.apache.crunch.impl.dist.DistributedPipeline
 
read(Source<S>, String) - Method in class org.apache.crunch.impl.dist.DistributedPipeline
 
read(TableSource<K, V>) - Method in class org.apache.crunch.impl.dist.DistributedPipeline
 
read(TableSource<K, V>, String) - Method in class org.apache.crunch.impl.dist.DistributedPipeline
 
read(Source<T>) - Method in class org.apache.crunch.impl.mem.MemPipeline
 
read(Source<T>, String) - Method in class org.apache.crunch.impl.mem.MemPipeline
 
read(TableSource<K, V>) - Method in class org.apache.crunch.impl.mem.MemPipeline
 
read(TableSource<K, V>, String) - Method in class org.apache.crunch.impl.mem.MemPipeline
 
read(FileSystem, Path) - Method in interface org.apache.crunch.io.FileReaderFactory
 
read(Configuration) - Method in interface org.apache.crunch.io.ReadableSource
Returns an Iterable that contains the contents of this source.
read(Configuration) - Method in class org.apache.crunch.kafka.KafkaSource
 
read(Source<T>) - Method in interface org.apache.crunch.Pipeline
Converts the given Source into a PCollection that is available to jobs run using this Pipeline instance.
read(Source<T>, String) - Method in interface org.apache.crunch.Pipeline
Converts the given Source into a PCollection that is available to jobs run using this Pipeline instance.
read(TableSource<K, V>) - Method in interface org.apache.crunch.Pipeline
A version of the read method for TableSource instances that map to PTables.
read(TableSource<K, V>, String) - Method in interface org.apache.crunch.Pipeline
A version of the read method for TableSource instances that map to PTables.
read(TaskInputOutputContext<?, ?, ?, ?>) - Method in interface org.apache.crunch.ReadableData
Read the data referenced by this instance within the given context.
read(Source<T>) - Method in class org.apache.crunch.util.CrunchTool
 
read(TableSource<K, V>) - Method in class org.apache.crunch.util.CrunchTool
 
read(TaskInputOutputContext<?, ?, ?, ?>) - Method in class org.apache.crunch.util.DelegatingReadableData
 
read(Configuration, Path) - Static method in class org.apache.crunch.util.DistCache
 
read(TaskInputOutputContext<?, ?, ?, ?>) - Method in class org.apache.crunch.util.UnionReadableData
 
ReadableData<T> - Interface in org.apache.crunch
Represents the contents of a data source that can be read on the cluster from within one of the tasks running as part of a Crunch pipeline.
ReadableSource<T> - Interface in org.apache.crunch.io
An extension of the Source interface that indicates that a Source instance may be read as a series of records by the client code.
ReadableSourceTarget<T> - Interface in org.apache.crunch.io
An interface that indicates that a SourceTarget instance can be read into the local client.
ReaderWriterFactory - Interface in org.apache.crunch.types.avro
Interface for accessing DatumReader, DatumWriter, and Data classes.
readFields(DataInput) - Method in class org.apache.crunch.contrib.io.jdbc.IdentifiableName
 
readFields(ResultSet) - Method in class org.apache.crunch.contrib.io.jdbc.IdentifiableName
 
readFields(DataInput) - Method in class org.apache.crunch.io.FormatBundle
 
readFields(DataInput) - Method in class org.apache.crunch.kafka.inputformat.KafkaInputSplit
 
readFields(DataInput) - Method in class org.apache.crunch.types.writable.TupleWritable
readFields(DataInput) - Method in class org.apache.crunch.types.writable.UnionWritable
 
readLatestOffsets() - Method in class org.apache.crunch.kafka.offset.hdfs.HDFSOffsetReader
 
readLatestOffsets() - Method in interface org.apache.crunch.kafka.offset.OffsetReader
Reads the last stored offsets.
readOffsets(long) - Method in class org.apache.crunch.kafka.offset.AbstractOffsetReader
 
readOffsets(long) - Method in class org.apache.crunch.kafka.offset.hdfs.HDFSOffsetReader
 
readOffsets(long) - Method in interface org.apache.crunch.kafka.offset.OffsetReader
Reads the offsets for a given persistedOffsetTime.
readTextFile(String) - Method in class org.apache.crunch.impl.dist.DistributedPipeline
 
readTextFile(String) - Method in class org.apache.crunch.impl.mem.MemPipeline
 
readTextFile(String) - Method in interface org.apache.crunch.Pipeline
A convenience method for reading a text file.
readTextFile(String) - Method in class org.apache.crunch.util.CrunchTool
 
records(Class<T>) - Static method in class org.apache.crunch.types.avro.Avros
 
records(Class<T>) - Method in class org.apache.crunch.types.avro.AvroTypeFamily
 
records(Class<T>) - Method in interface org.apache.crunch.types.PTypeFamily
 
records(Class<T>) - Static method in class org.apache.crunch.types.writable.Writables
 
records(Class<T>) - Method in class org.apache.crunch.types.writable.WritableTypeFamily
 
reduce(PGroupedTable<K1, V1>, Class<? extends Reducer<K1, V1, K2, V2>>, Class<K2>, Class<V2>) - Static method in class org.apache.crunch.lib.Mapred
 
reduce(PGroupedTable<K1, V1>, Class<? extends Reducer<K1, V1, K2, V2>>, Class<K2>, Class<V2>) - Static method in class org.apache.crunch.lib.Mapreduce
 
ReduceGroupingFunction - Class in org.apache.crunch.impl.spark.fn
 
ReduceGroupingFunction(GroupingOptions, PGroupedTableType, SparkRuntimeContext) - Constructor for class org.apache.crunch.impl.spark.fn.ReduceGroupingFunction
 
ReduceInputFunction<K,V> - Class in org.apache.crunch.impl.spark.fn
 
ReduceInputFunction(SerDe<K>, SerDe<V>) - Constructor for class org.apache.crunch.impl.spark.fn.ReduceInputFunction
 
reduceValues(SBinaryOperator<V>) - Method in interface org.apache.crunch.lambda.LGroupedTable
Reduce the values for each key using an associative binary operator.
REFLECT - Static variable in class org.apache.crunch.types.avro.AvroMode
Default mode to use for reading and writing Reflect types.
REFLECT_DATA_FACTORY - Static variable in class org.apache.crunch.types.avro.Avros
Deprecated as of 0.9.0; use AvroMode.REFLECT.override(ReaderWriterFactory)
REFLECT_DATA_FACTORY_CLASS - Static variable in class org.apache.crunch.types.avro.Avros
The name of the configuration parameter that tracks which reflection factory to use.
ReflectDataFactory - Class in org.apache.crunch.types.avro
A Factory class for constructing Avro reflection-related objects.
ReflectDataFactory() - Constructor for class org.apache.crunch.types.avro.ReflectDataFactory
 
reflects(Class<T>) - Static method in class org.apache.crunch.types.avro.Avros
 
reflects(Class<T>, Schema) - Static method in class org.apache.crunch.types.avro.Avros
 
reflects(Class<T>) - Static method in class org.apache.crunch.types.orc.Orcs
Create a PType which uses reflection to serialize/deserialize java POJOs to/from ORC.
register(Class<T>, AvroType<T>) - Static method in class org.apache.crunch.types.avro.Avros
 
register(Class<T>, WritableType<T, ? extends Writable>) - Static method in class org.apache.crunch.types.writable.Writables
 
registerComparable(Class<? extends WritableComparable>) - Static method in class org.apache.crunch.types.writable.Writables
Registers a WritableComparable class so that it can be used for comparing the fields inside of tuple types (e.g., pairs, trips, tupleN, etc.) for use in sorts and secondary sorts.
registerComparable(Class<? extends WritableComparable>, int) - Static method in class org.apache.crunch.types.writable.Writables
Registers a WritableComparable class with a given integer code to use for serializing and deserializing instances of this class that are defined inside of tuple types (e.g., pairs, trips, tupleN, etc.). Unregistered Writables are always serialized to bytes and cannot be used in comparisons (e.g., sorts and secondary sorts) according to their underlying types.
REJECT_ALL() - Static method in class org.apache.crunch.fn.FilterFns
Reject everything.
remove() - Method in class org.apache.crunch.util.DoFnIterator
 
replicas(int) - Method in class org.apache.crunch.CachingOptions.Builder
 
replicas() - Method in class org.apache.crunch.CachingOptions
Returns the number of replicas of the data that should be maintained in the cache.
requireSortedKeys() - Method in class org.apache.crunch.GroupingOptions.Builder
 
requireSortedKeys() - Method in class org.apache.crunch.GroupingOptions
 
reservoirSample(PCollection<T>, int) - Static method in class org.apache.crunch.lib.Sample
Select a fixed number of elements from the given PCollection with each element equally likely to be included in the sample.
reservoirSample(PCollection<T>, int, Long) - Static method in class org.apache.crunch.lib.Sample
A version of the reservoir sampling algorithm that uses a given seed, primarily for testing purposes.
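The reservoir sampling idea behind these two entries can be sketched on a single machine. The block below is a minimal illustration of the classic "Algorithm R" in plain Java, not Crunch's distributed implementation; the class name ReservoirSketch and its method signature are hypothetical and only mirror the (collection, sample size, seed) shape of the entries above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Random;

final class ReservoirSketch {
    /**
     * Hypothetical single-machine sketch: keeps at most k items so that every
     * input item is equally likely to end up in the returned sample.
     */
    static <T> List<T> reservoirSample(Iterable<T> input, int k, long seed) {
        if (k < 0) {
            throw new IllegalArgumentException("k must be non-negative");
        }
        Random rng = new Random(seed); // fixed seed makes the sample repeatable
        List<T> reservoir = new ArrayList<>(k);
        long seen = 0;
        for (T item : input) {
            seen++;
            if (reservoir.size() < k) {
                // Fill the reservoir with the first k items.
                reservoir.add(item);
            } else {
                // Pick a slot uniformly from [0, seen); replace only if it falls in [0, k).
                long slot = (long) (rng.nextDouble() * seen);
                if (slot < k) {
                    reservoir.set((int) slot, item);
                }
            }
        }
        return reservoir;
    }

    public static void main(String[] args) {
        List<Integer> data = Arrays.asList(1, 2, 3, 4, 5, 6, 7, 8, 9, 10);
        System.out.println(reservoirSample(data, 3, 42L));
    }
}
```

Passing the same seed twice yields the same sample, which is the property the seeded overload describes as useful for testing.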
reset() - Method in interface org.apache.crunch.Aggregator
Clears the internal state of this Aggregator and prepares it for the values associated with the next key.
reset() - Method in class org.apache.crunch.lambda.LAggregator
 
Result(long, Iterable<Pair<Double, V>>) - Constructor for class org.apache.crunch.lib.Quantiles.Result
 
results() - Method in interface org.apache.crunch.Aggregator
Returns the current aggregated state of this instance.
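The reset/results lifecycle described for Aggregator can be illustrated with a small stand-alone sketch. The MiniAggregator interface below is a hypothetical stand-in written for this example, not the Crunch interface itself; it assumes an update method that feeds values in between reset() and results().

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

final class AggregatorSketch {
    /** Hypothetical stand-in for an aggregator with a reset / update / results lifecycle. */
    interface MiniAggregator<T> {
        void reset();           // clear state before the values of the next key
        void update(T value);   // fold one value into the current state
        Iterable<T> results();  // read out the current aggregated state
    }

    /** Sums long values; one instance is reused across keys. */
    static final class SumLongs implements MiniAggregator<Long> {
        private long sum;

        @Override
        public void reset() {
            sum = 0L;
        }

        @Override
        public void update(Long value) {
            sum += value;
        }

        @Override
        public Iterable<Long> results() {
            return Collections.singletonList(sum);
        }
    }

    /** Runs the full lifecycle for the values of a single key. */
    static long aggregate(MiniAggregator<Long> agg, List<Long> values) {
        agg.reset();
        for (Long v : values) {
            agg.update(v);
        }
        return agg.results().iterator().next();
    }

    public static void main(String[] args) {
        SumLongs agg = new SumLongs();
        System.out.println(aggregate(agg, Arrays.asList(1L, 2L, 3L)));
        System.out.println(aggregate(agg, Arrays.asList(10L)));
    }
}
```

Because reset() runs before each key's values, the same instance can be reused without state leaking from one key to the next.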
results() - Method in class org.apache.crunch.lambda.LAggregator
 
ReverseAvroComparator<T> - Class in org.apache.crunch.lib.sort
 
ReverseAvroComparator() - Constructor for class org.apache.crunch.lib.sort.ReverseAvroComparator
 
ReverseWritableComparator<T> - Class in org.apache.crunch.lib.sort
 
ReverseWritableComparator() - Constructor for class org.apache.crunch.lib.sort.ReverseWritableComparator
 
rightJoin(PTable<K, U>, PTable<K, V>) - Static method in class org.apache.crunch.lib.Join
Performs a right outer join on the specified PTables.
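As a reminder of right outer join semantics, here is a minimal in-memory sketch in plain Java. It does not use Crunch; the RightJoinSketch class, its Row type, and the map-of-lists inputs are assumptions made for illustration. Every value on the right side is kept, and the left value is null when the key has no match on the left.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class RightJoinSketch {
    /** One joined row; left is null when the key is missing from the left input. */
    static final class Row<K, U, V> {
        final K key;
        final U left;
        final V right;

        Row(K key, U left, V right) {
            this.key = key;
            this.left = left;
            this.right = right;
        }

        @Override
        public String toString() {
            return key + " -> (" + left + ", " + right + ")";
        }
    }

    /** Hypothetical in-memory right outer join over key-to-values maps. */
    static <K, U, V> List<Row<K, U, V>> rightJoin(Map<K, List<U>> left, Map<K, List<V>> right) {
        List<Row<K, U, V>> out = new ArrayList<>();
        for (Map.Entry<K, List<V>> entry : right.entrySet()) {
            List<U> matches = left.get(entry.getKey());
            for (V rightValue : entry.getValue()) {
                if (matches == null || matches.isEmpty()) {
                    // No left match: keep the right value and pad the left side with null.
                    out.add(new Row<K, U, V>(entry.getKey(), null, rightValue));
                } else {
                    for (U leftValue : matches) {
                        out.add(new Row<K, U, V>(entry.getKey(), leftValue, rightValue));
                    }
                }
            }
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, List<Integer>> left = new LinkedHashMap<>();
        left.put("a", Arrays.asList(1));
        Map<String, List<String>> right = new LinkedHashMap<>();
        right.put("a", Arrays.asList("x"));
        right.put("b", Arrays.asList("y"));
        System.out.println(rightJoin(left, right));
    }
}
```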
RightOuterJoinFn<K,U,V> - Class in org.apache.crunch.lib.join
Used to perform the last step of a right outer join.
RightOuterJoinFn(PType<K>, PType<U>) - Constructor for class org.apache.crunch.lib.join.RightOuterJoinFn
 
run(String[]) - Method in class org.apache.crunch.examples.AverageBytesByIP
 
run(String[]) - Method in class org.apache.crunch.examples.SecondarySortExample
 
run(String[]) - Method in class org.apache.crunch.examples.SortExample
 
run(String[]) - Method in class org.apache.crunch.examples.TotalBytesByIP
 
run(String[]) - Method in class org.apache.crunch.examples.TotalWordCount
 
run(String[]) - Method in class org.apache.crunch.examples.WordAggregationHBase
 
run(String[]) - Method in class org.apache.crunch.examples.WordCount
 
run() - Method in class org.apache.crunch.impl.mem.MemPipeline
 
run() - Method in class org.apache.crunch.impl.mr.MRPipeline
 
run() - Method in class org.apache.crunch.impl.spark.SparkPipeline
 
run() - Method in interface org.apache.crunch.Pipeline
Constructs and executes a series of MapReduce jobs in order to write data to the output targets.
run() - Method in class org.apache.crunch.util.CrunchTool
 
runAsync() - Method in class org.apache.crunch.impl.mem.MemPipeline
 
runAsync() - Method in class org.apache.crunch.impl.mr.MRPipeline
 
runAsync() - Method in class org.apache.crunch.impl.spark.SparkPipeline
 
runAsync() - Method in interface org.apache.crunch.Pipeline
Constructs and starts a series of MapReduce jobs in order to write data to the output targets, but returns a ListenableFuture to allow clients to control job execution.
runAsync() - Method in class org.apache.crunch.util.CrunchTool
 
runSingleThreaded() - Method in class org.apache.crunch.PipelineCallable
Override this method to indicate to the planner that this instance should not be run at the same time as any other PipelineCallable instances.

S

salary - Variable in class org.apache.crunch.test.Employee
Deprecated.
Sample - Class in org.apache.crunch.lib
Methods for performing random sampling in a distributed fashion, either by accepting each record in a PCollection with an independent probability in order to sample some fraction of the overall data set, or by using reservoir sampling in order to pull a uniform or weighted sample of fixed size from a PCollection of an unknown size.
Sample() - Constructor for class org.apache.crunch.lib.Sample
 
sample(PCollection<S>, double) - Static method in class org.apache.crunch.lib.Sample
Output records from the given PCollection with the given probability.
sample(PCollection<S>, Long, double) - Static method in class org.apache.crunch.lib.Sample
Output records from the given PCollection using a given seed.
sample(PTable<K, V>, double) - Static method in class org.apache.crunch.lib.Sample
A PTable<K, V> analogue of the sample function.
sample(PTable<K, V>, Long, double) - Static method in class org.apache.crunch.lib.Sample
A PTable<K, V> analogue of the sample function, with the seed argument exposed for testing purposes.
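The probability-based sample entries above describe accepting each record independently with a fixed probability. A minimal single-machine sketch of that idea follows; it is plain Java rather than Crunch code, and the BernoulliSampleSketch class and its parameter order (input, seed, probability) are assumptions chosen to echo the seeded overloads.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Random;

final class BernoulliSampleSketch {
    /**
     * Hypothetical sketch: keeps each record independently with the given
     * probability, so the output size varies around probability * input size.
     */
    static <T> List<T> sample(Iterable<T> input, long seed, double probability) {
        if (probability < 0.0 || probability > 1.0) {
            throw new IllegalArgumentException("probability must be in [0, 1]");
        }
        Random rng = new Random(seed); // exposing the seed makes runs repeatable
        List<T> kept = new ArrayList<>();
        for (T item : input) {
            // nextDouble() is in [0, 1), so 1.0 keeps everything and 0.0 keeps nothing.
            if (rng.nextDouble() < probability) {
                kept.add(item);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<Integer> data = Arrays.asList(1, 2, 3, 4, 5, 6, 7, 8, 9, 10);
        System.out.println(sample(data, 7L, 0.3));
    }
}
```

Unlike reservoir sampling, this approach does not fix the output size; it fixes only the expected fraction of records kept.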
SAMPLE_UNIQUE_ELEMENTS(int) - Static method in class org.apache.crunch.fn.Aggregators
Collect a sample of unique elements from the input, where 'unique' is defined by the equals method for the input objects.
SBiConsumer<K,V> - Interface in org.apache.crunch.lambda.fn
Serializable version of the Java BiConsumer functional interface.
SBiFunction<K,V,T> - Interface in org.apache.crunch.lambda.fn
Serializable version of the Java BiFunction functional interface.
SBinaryOperator<T> - Interface in org.apache.crunch.lambda.fn
Serializable version of the Java BinaryOperator functional interface.
scaleFactor() - Method in class org.apache.crunch.DoFn
Returns an estimate of how applying this function to a PCollection will cause it to change in size.
scaleFactor() - Method in class org.apache.crunch.FilterFn
 
scaleFactor() - Method in class org.apache.crunch.fn.CompositeMapFn
 
scaleFactor() - Method in class org.apache.crunch.fn.ExtractKeyFn
 
scaleFactor() - Method in class org.apache.crunch.fn.PairMapFn
 
scaleFactor() - Method in class org.apache.crunch.MapFn
 
SCHEMA$ - Static variable in class org.apache.crunch.test.Employee
 
SCHEMA$ - Static variable in class org.apache.crunch.test.Person
 
SConsumer<T> - Interface in org.apache.crunch.lambda.fn
Serializable version of the Java Consumer functional interface.
SDoubleFlatMapFunction<T> - Class in org.apache.crunch.fn
A Crunch-compatible abstract base class for Spark's DoubleFlatMapFunction.
SDoubleFlatMapFunction() - Constructor for class org.apache.crunch.fn.SDoubleFlatMapFunction
 
SDoubleFunction<T> - Class in org.apache.crunch.fn
A Crunch-compatible abstract base class for Spark's DoubleFunction.
SDoubleFunction() - Constructor for class org.apache.crunch.fn.SDoubleFunction
 
second() - Method in class org.apache.crunch.Pair
 
second() - Method in class org.apache.crunch.Tuple3
 
second() - Method in class org.apache.crunch.Tuple4
 
SecondarySort - Class in org.apache.crunch.lib
Utilities for performing a secondary sort on a PTable<K, Pair<V1, V2>> collection.
SecondarySort() - Constructor for class org.apache.crunch.lib.SecondarySort
 
SecondarySortExample - Class in org.apache.crunch.examples
 
SecondarySortExample() - Constructor for class org.apache.crunch.examples.SecondarySortExample
 
sequenceFile(String, Class<T>) - Static method in class org.apache.crunch.io.At
Creates a SourceTarget<T> instance from the SequenceFile(s) at the given path name from the value field of each key-value pair in the SequenceFile(s).
sequenceFile(Path, Class<T>) - Static method in class org.apache.crunch.io.At
Creates a SourceTarget<T> instance from the SequenceFile(s) at the given Path from the value field of each key-value pair in the SequenceFile(s).
sequenceFile(String, PType<T>) - Static method in class org.apache.crunch.io.At
Creates a SourceTarget<T> instance from the SequenceFile(s) at the given path name from the value field of each key-value pair in the SequenceFile(s).
sequenceFile(Path, PType<T>) - Static method in class org.apache.crunch.io.At
Creates a SourceTarget<T> instance from the SequenceFile(s) at the given Path from the value field of each key-value pair in the SequenceFile(s).
sequenceFile(String, Class<K>, Class<V>) - Static method in class org.apache.crunch.io.At
Creates a TableSourceTarget<K, V> instance from the SequenceFile(s) at the given path name from the key-value pairs in the SequenceFile(s).
sequenceFile(Path, Class<K>, Class<V>) - Static method in class org.apache.crunch.io.At
Creates a TableSourceTarget<K, V> instance from the SequenceFile(s) at the given Path from the key-value pairs in the SequenceFile(s).
sequenceFile(String, PType<K>, PType<V>) - Static method in class org.apache.crunch.io.At
Creates a TableSourceTarget<K, V> instance from the SequenceFile(s) at the given path name from the key-value pairs in the SequenceFile(s).
sequenceFile(Path, PType<K>, PType<V>) - Static method in class org.apache.crunch.io.At
Creates a TableSourceTarget<K, V> instance from the SequenceFile(s) at the given Path from the key-value pairs in the SequenceFile(s).
sequenceFile(String, Class<T>) - Static method in class org.apache.crunch.io.From
Creates a Source<T> instance from the SequenceFile(s) at the given path name from the value field of each key-value pair in the SequenceFile(s).
sequenceFile(Path, Class<T>) - Static method in class org.apache.crunch.io.From
Creates a Source<T> instance from the SequenceFile(s) at the given Path from the value field of each key-value pair in the SequenceFile(s).
sequenceFile(List<Path>, Class<T>) - Static method in class org.apache.crunch.io.From
Creates a Source<T> instance from the SequenceFile(s) at the given Paths from the value field of each key-value pair in the SequenceFile(s).
sequenceFile(String, PType<T>) - Static method in class org.apache.crunch.io.From
Creates a Source<T> instance from the SequenceFile(s) at the given path name from the value field of each key-value pair in the SequenceFile(s).
sequenceFile(Path, PType<T>) - Static method in class org.apache.crunch.io.From
Creates a Source<T> instance from the SequenceFile(s) at the given Path from the value field of each key-value pair in the SequenceFile(s).
sequenceFile(List<Path>, PType<T>) - Static method in class org.apache.crunch.io.From
Creates a Source<T> instance from the SequenceFile(s) at the given Paths from the value field of each key-value pair in the SequenceFile(s).
sequenceFile(String, Class<K>, Class<V>) - Static method in class org.apache.crunch.io.From
Creates a TableSource<K, V> instance for the SequenceFile(s) at the given path name.
sequenceFile(Path, Class<K>, Class<V>) - Static method in class org.apache.crunch.io.From
Creates a TableSource<K, V> instance for the SequenceFile(s) at the given Path.
sequenceFile(List<Path>, Class<K>, Class<V>) - Static method in class org.apache.crunch.io.From
Creates a TableSource<K, V> instance for the SequenceFile(s) at the given Paths.
sequenceFile(String, PType<K>, PType<V>) - Static method in class org.apache.crunch.io.From
Creates a TableSource<K, V> instance for the SequenceFile(s) at the given path name.
sequenceFile(Path, PType<K>, PType<V>) - Static method in class org.apache.crunch.io.From
Creates a TableSource<K, V> instance for the SequenceFile(s) at the given Path.
sequenceFile(List<Path>, PType<K>, PType<V>) - Static method in class org.apache.crunch.io.From
Creates a TableSource<K, V> instance for the SequenceFile(s) at the given Paths.
sequenceFile(String) - Static method in class org.apache.crunch.io.To
Creates a Target at the given path name that writes data to SequenceFiles.
sequenceFile(Path) - Static method in class org.apache.crunch.io.To
Creates a Target at the given Path that writes data to SequenceFiles.
sequentialDo(String, PipelineCallable<Output>) - Method in class org.apache.crunch.impl.dist.collect.PCollectionImpl
 
sequentialDo(PipelineCallable<Output>) - Method in class org.apache.crunch.impl.dist.DistributedPipeline
 
sequentialDo(PipelineCallable<Output>) - Method in class org.apache.crunch.impl.mem.MemPipeline
 
sequentialDo(String, PipelineCallable<Output>) - Method in interface org.apache.crunch.PCollection
Adds the materialized data in this PCollection as a dependency to the given PipelineCallable and registers it with the Pipeline associated with this instance.
sequentialDo(PipelineCallable<Output>) - Method in interface org.apache.crunch.Pipeline
Executes the given PipelineCallable on the client after the Targets that the PipelineCallable depends on (if any) have been created by other pipeline processing steps.
SequentialFileNamingScheme - Class in org.apache.crunch.io
Default FileNamingScheme that uses an incrementing sequence number in order to generate unique file names.
SerDe<T> - Interface in org.apache.crunch.impl.spark.serde
 
SerDeFactory - Class in org.apache.crunch.impl.spark.serde
 
SerDeFactory() - Constructor for class org.apache.crunch.impl.spark.serde.SerDeFactory
 
SerializableSupplier<T> - Interface in org.apache.crunch.util
An extension of Guava's Supplier interface that indicates that an instance will also implement Serializable, which makes this object suitable for use with Crunch's DoFns when we need to construct an instance of a non-serializable type for use in processing.
serialize() - Method in class org.apache.crunch.io.FormatBundle
 
set(String, String) - Method in class org.apache.crunch.io.FormatBundle
 
Set - Class in org.apache.crunch.lib
Utilities for performing set operations (difference, intersection, etc.) on PCollection instances.
Set() - Constructor for class org.apache.crunch.lib.Set
 
set(int, Writable) - Method in class org.apache.crunch.types.writable.TupleWritable
 
setAge(int) - Method in class org.apache.crunch.test.Person.Builder
Sets the value of the 'age' field.
setAge(Integer) - Method in class org.apache.crunch.test.Person
Sets the value of the 'age' field.
setAsOfTime(long) - Method in class org.apache.crunch.kafka.offset.hdfs.Offsets.Builder
Sets the as of time for the collection of offsets.
setBreakpoint() - Method in class org.apache.crunch.impl.dist.collect.BaseUnionCollection
 
setBreakpoint() - Method in class org.apache.crunch.impl.dist.collect.BaseUnionTable
 
setBreakpoint() - Method in class org.apache.crunch.impl.dist.collect.PCollectionImpl
 
setCombineFn(CombineFn) - Method in class org.apache.crunch.impl.spark.SparkRuntime
 
setConf(Broadcast<byte[]>) - Method in class org.apache.crunch.impl.spark.SparkRuntimeContext
 
setConf(Configuration) - Method in class org.apache.crunch.io.FormatBundle
 
setConf(Configuration) - Method in class org.apache.crunch.kafka.inputformat.KafkaInputFormat
 
setConf(Configuration) - Method in class org.apache.crunch.lib.join.JoinUtils.AvroPairGroupingComparator
 
setConf(Configuration) - Method in class org.apache.crunch.lib.sort.ReverseAvroComparator
 
setConf(Configuration) - Method in class org.apache.crunch.lib.sort.ReverseWritableComparator
 
setConf(Configuration) - Method in class org.apache.crunch.lib.sort.TotalOrderPartitioner
 
setConf(Configuration) - Method in class org.apache.crunch.lib.sort.TupleWritableComparator
 
setConf(Configuration) - Method in class org.apache.crunch.types.writable.TupleWritable.Comparator
 
setConf(Configuration) - Method in class org.apache.crunch.types.writable.TupleWritable
 
setConf(Configuration) - Method in class org.apache.crunch.util.CrunchTool
 
setConfiguration(Configuration) - Method in class org.apache.crunch.DoFn
Called during the setup of an initialized PType that relies on this instance.
setConfiguration(Configuration) - Method in class org.apache.crunch.fn.CompositeMapFn
 
setConfiguration(Configuration) - Method in class org.apache.crunch.fn.ExtractKeyFn
 
setConfiguration(Configuration) - Method in class org.apache.crunch.fn.PairMapFn
 
setConfiguration(Configuration) - Method in class org.apache.crunch.impl.dist.DistributedPipeline
 
setConfiguration(Configuration) - Method in class org.apache.crunch.impl.mem.MemPipeline
 
setConfiguration(Configuration) - Method in interface org.apache.crunch.Pipeline
Set the Configuration to use with this pipeline.
setContext(TaskInputOutputContext<?, ?, ?, ?>) - Method in class org.apache.crunch.DoFn
Called during setup to pass the TaskInputOutputContext to this DoFn instance.
setContext(TaskInputOutputContext<?, ?, ?, ?>) - Method in class org.apache.crunch.fn.CompositeMapFn
 
setContext(TaskInputOutputContext<?, ?, ?, ?>) - Method in class org.apache.crunch.fn.ExtractKeyFn
 
setContext(TaskInputOutputContext<?, ?, ?, ?>) - Method in class org.apache.crunch.fn.PairMapFn
 
setContext(TaskInputOutputContext<?, ?, ?, ?>) - Method in class org.apache.crunch.types.PGroupedTableType.PairIterableMapFn
 
setDepartment(CharSequence) - Method in class org.apache.crunch.test.Employee.Builder
Sets the value of the 'department' field.
setDepartment(CharSequence) - Method in class org.apache.crunch.test.Employee
Sets the value of the 'department' field.
setMessage(String) - Method in class org.apache.crunch.PipelineCallable
Sets a message associated with this callable's execution, especially in case of errors.
setName(CharSequence) - Method in class org.apache.crunch.test.Employee.Builder
Sets the value of the 'name' field.
setName(CharSequence) - Method in class org.apache.crunch.test.Employee
Sets the value of the 'name' field.
setName(CharSequence) - Method in class org.apache.crunch.test.Person.Builder
Sets the value of the 'name' field.
setName(CharSequence) - Method in class org.apache.crunch.test.Person
Sets the value of the 'name' field.
setOffset(long) - Method in class org.apache.crunch.kafka.offset.hdfs.Offsets.PartitionOffset.Builder
Set the offset for the partition offset being built.
setOffsets(List<Offsets.PartitionOffset>) - Method in class org.apache.crunch.kafka.offset.hdfs.Offsets.Builder
Sets the collection of offsets.
setPartition(int) - Method in class org.apache.crunch.kafka.offset.hdfs.Offsets.PartitionOffset.Builder
Set the partition for the partition offset being built.
setPartitionFile(Configuration, Path) - Static method in class org.apache.crunch.lib.sort.TotalOrderPartitioner
 
setSalary(int) - Method in class org.apache.crunch.test.Employee.Builder
Sets the value of the 'salary' field.
setSalary(Integer) - Method in class org.apache.crunch.test.Employee
Sets the value of the 'salary' field.
setSiblingnames(List<CharSequence>) - Method in class org.apache.crunch.test.Person.Builder
Sets the value of the 'siblingnames' field.
setSiblingnames(List<CharSequence>) - Method in class org.apache.crunch.test.Person
Sets the value of the 'siblingnames' field.
setSpecificClassLoader(ClassLoader) - Static method in class org.apache.crunch.types.avro.AvroMode
Set the ClassLoader that will be used for loading Avro org.apache.avro.specific.SpecificRecord and reflection implementation classes.
setTopic(String) - Method in class org.apache.crunch.kafka.offset.hdfs.Offsets.PartitionOffset.Builder
Set the topic for the partition offset being built.
setValue(long) - Method in class org.apache.hadoop.mapred.SparkCounter
 
SFlatMapFunction<T,R> - Class in org.apache.crunch.fn
A Crunch-compatible abstract base class for Spark's FlatMapFunction.
SFlatMapFunction() - Constructor for class org.apache.crunch.fn.SFlatMapFunction
 
SFlatMapFunction2<K,V,R> - Class in org.apache.crunch.fn
A Crunch-compatible abstract base class for Spark's FlatMapFunction2.
SFlatMapFunction2() - Constructor for class org.apache.crunch.fn.SFlatMapFunction2
 
SFunction<T,R> - Class in org.apache.crunch.fn
A Crunch-compatible abstract base class for Spark's Function.
SFunction() - Constructor for class org.apache.crunch.fn.SFunction
 
SFunction<S,T> - Interface in org.apache.crunch.lambda.fn
Serializable version of the Java Function functional interface.
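The "serializable version of a functional interface" pattern used by the S-prefixed interfaces in org.apache.crunch.lambda.fn can be sketched in plain Java. The SerFunction interface and helper methods below are hypothetical names written for this example, not the Crunch types; the sketch only shows why extending Serializable lets a lambda be shipped as bytes and restored.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.function.Function;

final class SerializableFnSketch {
    /** Hypothetical interface: a Function whose lambdas are also Serializable. */
    interface SerFunction<S, T> extends Function<S, T>, Serializable {}

    /** A lambda targeting SerFunction is compiled as a serializable lambda. */
    static SerFunction<String, Integer> lengthFn() {
        return s -> s.length();
    }

    /** Writes the function to bytes and reads it back, as a remote worker would. */
    static <S, T> SerFunction<S, T> roundTrip(SerFunction<S, T> fn) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(fn);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                @SuppressWarnings("unchecked")
                SerFunction<S, T> copy = (SerFunction<S, T>) in.readObject();
                return copy;
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("serialization round trip failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip(lengthFn()).apply("crunch"));
    }
}
```

A lambda assigned to a plain java.util.function.Function would fail at writeObject with a NotSerializableException, which is the gap this kind of interface closes.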
SFunction2<K,V,R> - Class in org.apache.crunch.fn
A Crunch-compatible abstract base class for Spark's Function2.
SFunction2() - Constructor for class org.apache.crunch.fn.SFunction2
 
SFunctions - Class in org.apache.crunch.fn
Utility methods for wrapping existing Spark Java API Functions for Crunch compatibility.
Shard - Class in org.apache.crunch.lib
Utilities for controlling how the data in a PCollection is balanced across reducers and output files.
Shard() - Constructor for class org.apache.crunch.lib.Shard
 
shard(PCollection<T>, int) - Static method in class org.apache.crunch.lib.Shard
Creates a PCollection<T> that has the same contents as its input argument but will be written to a fixed number of output files.
ShardedJoinStrategy<K,U,V> - Class in org.apache.crunch.lib.join
JoinStrategy that splits the key space up into shards.
ShardedJoinStrategy(int) - Constructor for class org.apache.crunch.lib.join.ShardedJoinStrategy
Instantiate with a constant number of shards to use for all keys.
ShardedJoinStrategy(int, int) - Constructor for class org.apache.crunch.lib.join.ShardedJoinStrategy
Instantiate with a constant number of shards to use for all keys.
ShardedJoinStrategy(ShardedJoinStrategy.ShardingStrategy<K>) - Constructor for class org.apache.crunch.lib.join.ShardedJoinStrategy
Instantiate with a custom sharding strategy.
ShardedJoinStrategy(ShardedJoinStrategy.ShardingStrategy<K>, int) - Constructor for class org.apache.crunch.lib.join.ShardedJoinStrategy
Instantiate with a custom sharding strategy and a specified number of reducers.
ShardedJoinStrategy.ShardingStrategy<K> - Interface in org.apache.crunch.lib.join
Determines over how many shards a key will be split in a sharded join.
siblingnames - Variable in class org.apache.crunch.test.Person
Deprecated.
SimpleAggregator() - Constructor for class org.apache.crunch.fn.Aggregators.SimpleAggregator
 
SingleKeyFn(int) - Constructor for class org.apache.crunch.lib.sort.SortFns.SingleKeyFn
 
SingleUseIterable<T> - Class in org.apache.crunch.impl
Wrapper around a Reducer's input Iterable.
SingleUseIterable(Iterable<T>) - Constructor for class org.apache.crunch.impl.SingleUseIterable
Instantiate around an Iterable that may only be used once.
size() - Method in class org.apache.crunch.Pair
 
size() - Method in interface org.apache.crunch.Tuple
Returns the number of elements in this Tuple.
size() - Method in class org.apache.crunch.Tuple3
 
size() - Method in class org.apache.crunch.Tuple4
 
size() - Method in class org.apache.crunch.TupleN
 
size() - Method in class org.apache.crunch.types.writable.TupleWritable
The number of children in this Tuple.
skip(String) - Method in class org.apache.crunch.contrib.text.TokenizerFactory.Builder
Sets the regular expression that determines which input characters should be ignored by the Scanner that is returned by the constructed TokenizerFactory.
smearHash(int) - Static method in class org.apache.crunch.util.HashUtil
Applies a supplemental hashing function to an integer, increasing variability in lower-order bits.
snappy(T) - Static method in class org.apache.crunch.io.Compress
Configure the given output target to be compressed using Snappy.
Sort - Class in org.apache.crunch.lib
Utilities for sorting PCollection instances.
Sort() - Constructor for class org.apache.crunch.lib.Sort
 
sort(PCollection<T>) - Static method in class org.apache.crunch.lib.Sort
Sorts the PCollection using the natural ordering of its elements in ascending order.
sort(PCollection<T>, Sort.Order) - Static method in class org.apache.crunch.lib.Sort
Sorts the PCollection using the natural order of its elements with the given Order.
sort(PCollection<T>, int, Sort.Order) - Static method in class org.apache.crunch.lib.Sort
Sorts the PCollection using the natural ordering of its elements in the order specified using the given number of reducers.
sort(PTable<K, V>) - Static method in class org.apache.crunch.lib.Sort
Sorts the PTable using the natural ordering of its keys in ascending order.
sort(PTable<K, V>, Sort.Order) - Static method in class org.apache.crunch.lib.Sort
Sorts the PTable using the natural ordering of its keys with the given Order.
sort(PTable<K, V>, int, Sort.Order) - Static method in class org.apache.crunch.lib.Sort
Sorts the PTable using the natural ordering of its keys in the order specified with a client-specified number of reducers.
Sort.ColumnOrder - Class in org.apache.crunch.lib
To sort by column 2 ascending then column 1 descending, you would use: sortPairs(coll, by(2, ASCENDING), by(1, DESCENDING)). Column numbering is 1-based.
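The column-ordering idea (1-based columns, each with its own direction, applied in sequence) can be shown with ordinary Java comparators. This is a sketch over in-memory String[] rows, not the Crunch Sort API; the by and sortRows helpers here are hypothetical and take a boolean instead of the Sort.Order enum.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

final class ColumnOrderSketch {
    /** Hypothetical helper: compares rows on a 1-based column, ascending or descending. */
    static Comparator<String[]> by(int column, boolean ascending) {
        // Column numbering is 1-based, so column 1 is array index 0.
        Comparator<String[]> cmp = Comparator.comparing(row -> row[column - 1]);
        return ascending ? cmp : cmp.reversed();
    }

    /** Sorts a copy of the rows, applying the column orders from first to last. */
    @SafeVarargs
    static List<String[]> sortRows(List<String[]> rows, Comparator<String[]>... orders) {
        Comparator<String[]> combined = (a, b) -> 0; // neutral starting point
        for (Comparator<String[]> order : orders) {
            combined = combined.thenComparing(order); // later orders break earlier ties
        }
        List<String[]> sorted = new ArrayList<>(rows);
        sorted.sort(combined);
        return sorted;
    }

    public static void main(String[] args) {
        List<String[]> rows = Arrays.asList(
            new String[] {"b", "1"},
            new String[] {"a", "1"},
            new String[] {"c", "0"});
        // Column 2 ascending, then column 1 descending (string comparison).
        for (String[] row : sortRows(rows, by(2, true), by(1, false))) {
            System.out.println(Arrays.toString(row));
        }
    }
}
```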
Sort.Order - Enum in org.apache.crunch.lib
For signaling the order in which a sort should be done.
sortAndApply(PTable<K, Pair<V1, V2>>, DoFn<Pair<K, Iterable<Pair<V1, V2>>>, T>, PType<T>) - Static method in class org.apache.crunch.lib.SecondarySort
Perform a secondary sort on the given PTable instance and then apply a DoFn to the resulting sorted data to yield an output PCollection<T>.
sortAndApply(PTable<K, Pair<V1, V2>>, DoFn<Pair<K, Iterable<Pair<V1, V2>>>, T>, PType<T>, int) - Static method in class org.apache.crunch.lib.SecondarySort
Perform a secondary sort on the given PTable instance and then apply a DoFn to the resulting sorted data to yield an output PCollection<T>, using the given number of reducers.
sortAndApply(PTable<K, Pair<V1, V2>>, DoFn<Pair<K, Iterable<Pair<V1, V2>>>, Pair<U, V>>, PTableType<U, V>) - Static method in class org.apache.crunch.lib.SecondarySort
Perform a secondary sort on the given PTable instance and then apply a DoFn to the resulting sorted data to yield an output PTable<U, V>.
sortAndApply(PTable<K, Pair<V1, V2>>, DoFn<Pair<K, Iterable<Pair<V1, V2>>>, Pair<U, V>>, PTableType<U, V>, int) - Static method in class org.apache.crunch.lib.SecondarySort
Perform a secondary sort on the given PTable instance and then apply a DoFn to the resulting sorted data to yield an output PTable<U, V>, using the given number of reducers.
sortComparatorClass(Class<? extends RawComparator>) - Method in class org.apache.crunch.GroupingOptions.Builder
 
SortExample - Class in org.apache.crunch.examples
Simple Crunch tool for running sorting examples from the command line.
SortExample() - Constructor for class org.apache.crunch.examples.SortExample
 
SortFns - Class in org.apache.crunch.lib.sort
A set of DoFns that are used by Crunch's Sort library.
SortFns() - Constructor for class org.apache.crunch.lib.sort.SortFns
 
SortFns.AvroGenericFn<V extends Tuple> - Class in org.apache.crunch.lib.sort
Pulls a composite set of keys from an Avro GenericRecord instance.
SortFns.KeyExtraction<V extends Tuple> - Class in org.apache.crunch.lib.sort
Utility class for encapsulating key extraction logic and serialization information about key extraction.
SortFns.SingleKeyFn<V extends Tuple,K> - Class in org.apache.crunch.lib.sort
Extracts a single indexed key from a Tuple instance.
SortFns.TupleKeyFn<V extends Tuple,K extends Tuple> - Class in org.apache.crunch.lib.sort
Extracts a composite key from a Tuple instance.
sortPairs(PCollection<Pair<U, V>>, Sort.ColumnOrder...) - Static method in class org.apache.crunch.lib.Sort
Sorts the PCollection of Pairs using the specified column ordering.
sortQuads(PCollection<Tuple4<V1, V2, V3, V4>>, Sort.ColumnOrder...) - Static method in class org.apache.crunch.lib.Sort
Sorts the PCollection of Tuple4s using the specified column ordering.
sortTriples(PCollection<Tuple3<V1, V2, V3>>, Sort.ColumnOrder...) - Static method in class org.apache.crunch.lib.Sort
Sorts the PCollection of Tuple3s using the specified column ordering.
sortTuples(PCollection<T>, Sort.ColumnOrder...) - Static method in class org.apache.crunch.lib.Sort
Sorts the PCollection of tuples using the specified column ordering.
sortTuples(PCollection<T>, int, Sort.ColumnOrder...) - Static method in class org.apache.crunch.lib.Sort
Sorts the PCollection of TupleNs using the specified column ordering and a client-specified number of reducers.
Source<T> - Interface in org.apache.crunch
A Source represents an input data set that is an input to one or more MapReduce jobs.
sources(Source<?>...) - Method in class org.apache.crunch.ParallelDoOptions.Builder
 
sources(Collection<Source<?>>) - Method in class org.apache.crunch.ParallelDoOptions.Builder
 
sourceTarget(SourceTarget<?>) - Method in class org.apache.crunch.GroupingOptions.Builder
Deprecated.
SourceTarget<T> - Interface in org.apache.crunch
An interface for classes that implement both the Source and the Target interfaces.
SourceTargetHelper - Class in org.apache.crunch.io
Functions for configuring the inputs/outputs of MapReduce jobs.
SourceTargetHelper() - Constructor for class org.apache.crunch.io.SourceTargetHelper
 
sourceTargets(SourceTarget<?>...) - Method in class org.apache.crunch.GroupingOptions.Builder
 
sourceTargets(Collection<SourceTarget<?>>) - Method in class org.apache.crunch.GroupingOptions.Builder
 
sourceTargets(SourceTarget<?>...) - Method in class org.apache.crunch.ParallelDoOptions.Builder
 
sourceTargets(Collection<SourceTarget<?>>) - Method in class org.apache.crunch.ParallelDoOptions.Builder
 
SPairFlatMapFunction<T,K,V> - Class in org.apache.crunch.fn
A Crunch-compatible abstract base class for Spark's PairFlatMapFunction.
SPairFlatMapFunction() - Constructor for class org.apache.crunch.fn.SPairFlatMapFunction
 
SPairFunction<T,K,V> - Class in org.apache.crunch.fn
A Crunch-compatible abstract base class for Spark's PairFunction.
SPairFunction() - Constructor for class org.apache.crunch.fn.SPairFunction
 
SparkCollectFactory - Class in org.apache.crunch.impl.spark.collect
 
SparkCollectFactory() - Constructor for class org.apache.crunch.impl.spark.collect.SparkCollectFactory
 
SparkCollection - Interface in org.apache.crunch.impl.spark
 
SparkComparator - Class in org.apache.crunch.impl.spark
 
SparkComparator(GroupingOptions, PGroupedTableType, SparkRuntimeContext) - Constructor for class org.apache.crunch.impl.spark.SparkComparator
 
SparkCounter - Class in org.apache.hadoop.mapred
 
SparkCounter(String, String, Accumulator<Map<String, Map<String, Long>>>) - Constructor for class org.apache.hadoop.mapred.SparkCounter
 
SparkCounter(String, String, long) - Constructor for class org.apache.hadoop.mapred.SparkCounter
 
SparkPartitioner - Class in org.apache.crunch.impl.spark
 
SparkPartitioner(int) - Constructor for class org.apache.crunch.impl.spark.SparkPartitioner
 
SparkPipeline - Class in org.apache.crunch.impl.spark
 
SparkPipeline(String, String) - Constructor for class org.apache.crunch.impl.spark.SparkPipeline
 
SparkPipeline(String, String, Class<?>) - Constructor for class org.apache.crunch.impl.spark.SparkPipeline
 
SparkPipeline(String, String, Class<?>, Configuration) - Constructor for class org.apache.crunch.impl.spark.SparkPipeline
 
SparkPipeline(JavaSparkContext, String) - Constructor for class org.apache.crunch.impl.spark.SparkPipeline
 
SparkPipeline(JavaSparkContext, String, Class<?>, Configuration) - Constructor for class org.apache.crunch.impl.spark.SparkPipeline
 
SparkRuntime - Class in org.apache.crunch.impl.spark
 
SparkRuntime(SparkPipeline, JavaSparkContext, Configuration, Map<PCollectionImpl<?>, Set<Target>>, Map<PCollectionImpl<?>, MaterializableIterable>, Map<PCollection<?>, StorageLevel>, Map<PipelineCallable<?>, Set<Target>>) - Constructor for class org.apache.crunch.impl.spark.SparkRuntime
 
SparkRuntimeContext - Class in org.apache.crunch.impl.spark
 
SparkRuntimeContext(String, Accumulator<Map<String, Map<String, Long>>>, Broadcast<byte[]>) - Constructor for class org.apache.crunch.impl.spark.SparkRuntimeContext
 
SPECIFIC - Static variable in class org.apache.crunch.types.avro.AvroMode
Default mode to use for reading and writing Specific types.
specifics(Class<T>) - Static method in class org.apache.crunch.types.avro.Avros
 
split(PCollection<Pair<T, U>>) - Static method in class org.apache.crunch.lib.Channels
Splits a PCollection of any Pair of objects into a Pair of PCollections, to allow for the output of a DoFn to be handled using separate channels.
split(PCollection<Pair<T, U>>, PType<T>, PType<U>) - Static method in class org.apache.crunch.lib.Channels
Splits a PCollection of any Pair of objects into a Pair of PCollections, to allow for the output of a DoFn to be handled using separate channels.
SPredicate<T> - Interface in org.apache.crunch.lambda.fn
Serializable version of the Java Predicate functional interface.
SSupplier<T> - Interface in org.apache.crunch.lambda.fn
Serializable version of the Java Supplier functional interface.
StageResult(String, Counters) - Constructor for class org.apache.crunch.PipelineResult.StageResult
 
StageResult(String, Counters, long, long) - Constructor for class org.apache.crunch.PipelineResult.StageResult
 
StageResult(String, String, Counters, long, long, long, long) - Constructor for class org.apache.crunch.PipelineResult.StageResult
 
status - Variable in class org.apache.crunch.PipelineResult
 
STRING_CONCAT(String, boolean) - Static method in class org.apache.crunch.fn.Aggregators
Concatenate strings, with a separator between strings.
STRING_CONCAT(String, boolean, long, long) - Static method in class org.apache.crunch.fn.Aggregators
Concatenate strings, with a separator between strings.
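The separator behaviour described above can be sketched in plain Java without Crunch. The class name ConcatSketch, the method concat, and the reading of the boolean parameter as a skip-null flag are assumptions made for illustration; rendering a retained null as the text "null" is a choice of this sketch, not documented Crunch behaviour.

```java
import java.util.Arrays;
import java.util.List;
import java.util.StringJoiner;

final class ConcatSketch {
    // Joins values with the separator placed between consecutive entries.
    // When skipNull is true, null entries are dropped; otherwise a null entry
    // is rendered as the text "null" (an assumption of this sketch).
    static String concat(String separator, boolean skipNull, List<String> values) {
        StringJoiner joiner = new StringJoiner(separator);
        for (String value : values) {
            if (value == null && skipNull) {
                continue;
            }
            joiner.add(String.valueOf(value));
        }
        return joiner.toString();
    }

    public static void main(String[] args) {
        System.out.println(concat(",", true, Arrays.asList("a", null, "b")));
    }
}
```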
STRING_TO_UTF8 - Static variable in class org.apache.crunch.types.avro.Avros
 
strings() - Static method in class org.apache.crunch.types.avro.Avros
 
strings() - Method in class org.apache.crunch.types.avro.AvroTypeFamily
 
strings() - Method in interface org.apache.crunch.types.PTypeFamily
 
strings() - Static method in class org.apache.crunch.types.writable.Writables
 
strings() - Method in class org.apache.crunch.types.writable.WritableTypeFamily
 
succeeded() - Method in class org.apache.crunch.PipelineResult
 
SUM_BIGDECIMALS() - Static method in class org.apache.crunch.fn.Aggregators
Sum up all BigDecimal values.
SUM_BIGINTS() - Static method in class org.apache.crunch.fn.Aggregators
Sum up all BigInteger values.
SUM_DOUBLES() - Static method in class org.apache.crunch.fn.Aggregators
Sum up all double values.
SUM_FLOATS() - Static method in class org.apache.crunch.fn.Aggregators
Sum up all float values.
SUM_INTS() - Static method in class org.apache.crunch.fn.Aggregators
Sum up all int values.
SUM_LONGS() - Static method in class org.apache.crunch.fn.Aggregators
Sum up all long values.
SwapFn<V1,V2> - Class in org.apache.crunch.fn
Swap the elements of a Pair type.
SwapFn() - Constructor for class org.apache.crunch.fn.SwapFn
 
swapKeyValue(PTable<K, V>) - Static method in class org.apache.crunch.lib.PTables
Swap the key and value part of a table.

T

tableOf(S, T, Object...) - Static method in class org.apache.crunch.impl.mem.MemPipeline
 
tableOf(Iterable<Pair<S, T>>) - Static method in class org.apache.crunch.impl.mem.MemPipeline
 
tableOf(PType<K>, PType<V>) - Static method in class org.apache.crunch.types.avro.Avros
A table type with an Avro type as key and as value.
tableOf(PType<K>, PType<V>) - Method in class org.apache.crunch.types.avro.AvroTypeFamily
 
tableOf(PType<K>, PType<V>) - Method in interface org.apache.crunch.types.PTypeFamily
 
tableOf(PType<K>, PType<V>) - Static method in class org.apache.crunch.types.writable.Writables
 
tableOf(PType<K>, PType<V>) - Method in class org.apache.crunch.types.writable.WritableTypeFamily
 
TableSource<K,V> - Interface in org.apache.crunch
The interface for Source implementations that return a PTable.
TableSourceTarget<K,V> - Interface in org.apache.crunch
An interface for classes that implement both the TableSource and the Target interfaces.
tableType(PTableType<K, V>) - Static method in class org.apache.crunch.fn.SwapFn
 
tagExistingKafkaConnectionProperties(Properties) - Static method in class org.apache.crunch.kafka.inputformat.KafkaInputFormat
Generates a Properties object containing the properties in connectionProperties, but with every property prefixed with "org.apache.crunch.kafka.connection.properties".
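The prefixing described above might look roughly like the following plain-Java sketch. It does not call KafkaInputFormat; the class name PrefixPropertiesSketch, the method tag, and the "." placed between the prefix and the original key are assumptions made for illustration.

```java
import java.util.Properties;

final class PrefixPropertiesSketch {
    // Prefix quoted in the description above.
    static final String PREFIX = "org.apache.crunch.kafka.connection.properties";

    // Returns a new Properties object in which every key carries the prefix.
    // The input object is left unmodified. The "." separator is an assumption.
    static Properties tag(Properties connectionProperties) {
        Properties tagged = new Properties();
        for (String name : connectionProperties.stringPropertyNames()) {
            tagged.setProperty(PREFIX + "." + name, connectionProperties.getProperty(name));
        }
        return tagged;
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        props.setProperty("bootstrap.servers", "localhost:9092");
        System.out.println(tag(props));
    }
}
```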
Target - Interface in org.apache.crunch
A Target represents the output destination of a Crunch PCollection in the context of a Crunch job.
Target.WriteMode - Enum in org.apache.crunch
An enum to represent different options the client may specify for handling the case where the output path, table, etc.
targets(Target...) - Method in class org.apache.crunch.ParallelDoOptions.Builder
 
targets(Collection<Target>) - Method in class org.apache.crunch.ParallelDoOptions.Builder
 
tempDir - Variable in class org.apache.crunch.test.CrunchTestSupport
 
TemporaryPath - Class in org.apache.crunch.test
Creates a temporary directory for a test case and destroys it afterwards.
TemporaryPath(String...) - Constructor for class org.apache.crunch.test.TemporaryPath
Construct TemporaryPath.
TestCounters - Class in org.apache.crunch.test
A utility class used during unit testing to update and read counters.
TestCounters() - Constructor for class org.apache.crunch.test.TestCounters
 
textFile(String) - Static method in class org.apache.crunch.io.At
Creates a SourceTarget<String> instance for the text file(s) at the given path name.
textFile(Path) - Static method in class org.apache.crunch.io.At
Creates a SourceTarget<String> instance for the text file(s) at the given Path.
textFile(String, PType<T>) - Static method in class org.apache.crunch.io.At
Creates a SourceTarget<T> instance for the text file(s) at the given path name using the provided PType<T> to convert the input text.
textFile(Path, PType<T>) - Static method in class org.apache.crunch.io.At
Creates a SourceTarget<T> instance for the text file(s) at the given Path using the provided PType<T> to convert the input text.
textFile(String) - Static method in class org.apache.crunch.io.From
Creates a Source<String> instance for the text file(s) at the given path name.
textFile(Path) - Static method in class org.apache.crunch.io.From
Creates a Source<String> instance for the text file(s) at the given Path.
textFile(List<Path>) - Static method in class org.apache.crunch.io.From
Creates a Source<String> instance for the text file(s) at the given Paths.
textFile(String, PType<T>) - Static method in class org.apache.crunch.io.From
Creates a Source<T> instance for the text file(s) at the given path name using the provided PType<T> to convert the input text.
textFile(Path, PType<T>) - Static method in class org.apache.crunch.io.From
Creates a Source<T> instance for the text file(s) at the given Path using the provided PType<T> to convert the input text.
textFile(List<Path>, PType<T>) - Static method in class org.apache.crunch.io.From
Creates a Source<T> instance for the text file(s) at the given Paths using the provided PType<T> to convert the input text.
textFile(String) - Static method in class org.apache.crunch.io.To
Creates a Target at the given path name that writes data to text files.
textFile(Path) - Static method in class org.apache.crunch.io.To
Creates a Target at the given Path that writes data to text files.
third() - Method in class org.apache.crunch.Tuple3
 
third() - Method in class org.apache.crunch.Tuple4
 
thrifts(Class<T>, PTypeFamily) - Static method in class org.apache.crunch.types.PTypes
Constructs a PType for a Thrift record.
To - Class in org.apache.crunch.io
Static factory methods for creating common Target types.
To() - Constructor for class org.apache.crunch.io.To
 
ToByteArrayFunction - Class in org.apache.crunch.impl.spark.collect
 
ToByteArrayFunction() - Constructor for class org.apache.crunch.impl.spark.collect.ToByteArrayFunction
 
toBytes(T) - Method in class org.apache.crunch.impl.spark.serde.AvroSerDe
 
toBytes(T) - Method in interface org.apache.crunch.impl.spark.serde.SerDe
 
toBytes(Writable) - Method in class org.apache.crunch.impl.spark.serde.WritableSerDe
 
toCombineFn(Aggregator<V>) - Static method in class org.apache.crunch.fn.Aggregators
Deprecated.
toCombineFn(Aggregator<V>, PType<V>) - Static method in class org.apache.crunch.fn.Aggregators
Wrap a CombineFn adapter around the given aggregator.
Tokenizer - Class in org.apache.crunch.contrib.text
Manages a Scanner instance and provides support for returning only a subset of the fields returned by the underlying Scanner.
Tokenizer(Scanner, Set<Integer>, boolean) - Constructor for class org.apache.crunch.contrib.text.Tokenizer
Create a new Tokenizer instance.
TokenizerFactory - Class in org.apache.crunch.contrib.text
Factory class that constructs Tokenizer instances for input strings that use a fixed set of delimiters, skip patterns, locales, and sets of indices to keep or drop.
TokenizerFactory.Builder - Class in org.apache.crunch.contrib.text
A class for constructing new TokenizerFactory instances using the Builder pattern.
top(int) - Method in class org.apache.crunch.impl.dist.collect.PTableBase
 
top(PTable<K, V>, int, boolean) - Static method in class org.apache.crunch.lib.Aggregate
Selects the top N pairs from the given table, with sorting being performed on the values (i.e.
top(int) - Method in interface org.apache.crunch.PTable
Returns a PTable made up of the pairs in this PTable with the largest value field.
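Selecting the pairs with the largest values can be illustrated with a bounded min-heap, as in the plain-Java sketch below. It operates on an in-memory list rather than a PTable; the class name TopNSketch and the use of Map.Entry with Long values are assumptions for illustration, and this is not presented as how Crunch implements top.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.PriorityQueue;

final class TopNSketch {
    // Keeps the n entries with the largest values, returned in descending value order.
    static <K> List<Map.Entry<K, Long>> top(List<Map.Entry<K, Long>> pairs, int n) {
        Comparator<Map.Entry<K, Long>> byValue =
            Comparator.comparingLong((Map.Entry<K, Long> e) -> e.getValue());
        PriorityQueue<Map.Entry<K, Long>> heap = new PriorityQueue<>(byValue);
        for (Map.Entry<K, Long> pair : pairs) {
            heap.add(pair);
            if (heap.size() > n) {
                heap.poll(); // evict the entry with the smallest value
            }
        }
        List<Map.Entry<K, Long>> result = new ArrayList<>(heap);
        result.sort(byValue.reversed());
        return result;
    }

    public static void main(String[] args) {
        List<Map.Entry<String, Long>> pairs = new ArrayList<>();
        pairs.add(new SimpleEntry<>("a", 3L));
        pairs.add(new SimpleEntry<>("b", 7L));
        pairs.add(new SimpleEntry<>("c", 5L));
        System.out.println(top(pairs, 2));
    }
}
```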
TopKCombineFn(int, boolean, PType<Pair<K, V>>) - Constructor for class org.apache.crunch.lib.Aggregate.TopKCombineFn
 
TopKFn(int, boolean, PType<Pair<K, V>>) - Constructor for class org.apache.crunch.lib.Aggregate.TopKFn
 
TopList - Class in org.apache.crunch.lib
Tools for creating top lists of items in PTables and PCollections
TopList() - Constructor for class org.apache.crunch.lib.TopList
 
topNYbyX(PTable<X, Y>, int) - Static method in class org.apache.crunch.lib.TopList
Create a top-list of elements in the provided PTable, categorised by the key of the input table and using the count of the value part of the input table.
toString() - Method in class org.apache.crunch.impl.dist.collect.PCollectionImpl
 
toString() - Method in class org.apache.crunch.kafka.inputformat.KafkaInputSplit
 
toString() - Method in class org.apache.crunch.kafka.KafkaSource
 
toString() - Method in class org.apache.crunch.lib.Sort.ColumnOrder
 
toString() - Method in class org.apache.crunch.Pair
 
toString() - Method in class org.apache.crunch.Tuple3
 
toString() - Method in class org.apache.crunch.Tuple4
 
toString() - Method in class org.apache.crunch.TupleN
 
toString() - Method in class org.apache.crunch.types.writable.TupleWritable
Convert Tuple to String as in the following.
TotalBytesByIP - Class in org.apache.crunch.examples
 
TotalBytesByIP() - Constructor for class org.apache.crunch.examples.TotalBytesByIP
 
TotalOrderPartitioner<K,V> - Class in org.apache.crunch.lib.sort
A partition-aware Partitioner instance that can work with either Avro or Writable-formatted keys.
TotalOrderPartitioner() - Constructor for class org.apache.crunch.lib.sort.TotalOrderPartitioner
 
TotalOrderPartitioner.BinarySearchNode<K> - Class in org.apache.crunch.lib.sort
 
TotalOrderPartitioner.Node<T> - Interface in org.apache.crunch.lib.sort
Interface to the partitioner to locate a key in the partition keyset.
TotalWordCount - Class in org.apache.crunch.examples
 
TotalWordCount() - Constructor for class org.apache.crunch.examples.TotalWordCount
 
tripAggregator(Aggregator<V1>, Aggregator<V2>, Aggregator<V3>) - Static method in class org.apache.crunch.fn.Aggregators
Apply separate aggregators to each component of a Tuple3.
TripIterable(Iterable<A>, Iterable<B>, Iterable<C>) - Constructor for class org.apache.crunch.util.Tuples.TripIterable
 
triples(PType<V1>, PType<V2>, PType<V3>) - Static method in class org.apache.crunch.types.avro.Avros
 
triples(PType<V1>, PType<V2>, PType<V3>) - Method in class org.apache.crunch.types.avro.AvroTypeFamily
 
triples(PType<V1>, PType<V2>, PType<V3>) - Method in interface org.apache.crunch.types.PTypeFamily
 
triples(PType<V1>, PType<V2>, PType<V3>) - Static method in class org.apache.crunch.types.writable.Writables
 
triples(PType<V1>, PType<V2>, PType<V3>) - Method in class org.apache.crunch.types.writable.WritableTypeFamily
 
Tuple - Interface in org.apache.crunch
A fixed-size collection of Objects, used in Crunch for representing joins between PCollections.
Tuple2MapFunction<K,V> - Class in org.apache.crunch.impl.spark.fn
 
Tuple2MapFunction(MapFn<Pair<K, V>, Pair<K, V>>, SparkRuntimeContext) - Constructor for class org.apache.crunch.impl.spark.fn.Tuple2MapFunction
 
tuple2PairFunc() - Static method in class org.apache.crunch.impl.spark.GuavaUtils
 
Tuple3<V1,V2,V3> - Class in org.apache.crunch
A convenience class for three-element Tuples.
Tuple3(V1, V2, V3) - Constructor for class org.apache.crunch.Tuple3
 
TUPLE3 - Static variable in class org.apache.crunch.types.TupleFactory
 
Tuple3.Collect<V1,V2,V3> - Class in org.apache.crunch
 
Tuple4<V1,V2,V3,V4> - Class in org.apache.crunch
A convenience class for four-element Tuples.
Tuple4(V1, V2, V3, V4) - Constructor for class org.apache.crunch.Tuple4
 
TUPLE4 - Static variable in class org.apache.crunch.types.TupleFactory
 
Tuple4.Collect<V1,V2,V3,V4> - Class in org.apache.crunch
 
tupleAggregator(Aggregator<?>...) - Static method in class org.apache.crunch.fn.Aggregators
Apply separate aggregators to each component of a Tuple.
TupleDeepCopier<T extends Tuple> - Class in org.apache.crunch.types
Performs deep copies (based on underlying PType deep copying) of Tuple-based objects.
TupleDeepCopier(Class<T>, PType...) - Constructor for class org.apache.crunch.types.TupleDeepCopier
 
TupleFactory<T extends Tuple> - Class in org.apache.crunch.types
 
TupleFactory() - Constructor for class org.apache.crunch.types.TupleFactory
 
TupleKeyFn(int[], TupleFactory) - Constructor for class org.apache.crunch.lib.sort.SortFns.TupleKeyFn
 
TupleN - Class in org.apache.crunch
A Tuple instance for an arbitrary number of values.
TupleN(Object...) - Constructor for class org.apache.crunch.TupleN
 
TUPLEN - Static variable in class org.apache.crunch.types.TupleFactory
 
TupleNIterable(Iterable<?>...) - Constructor for class org.apache.crunch.util.Tuples.TupleNIterable
 
TupleObjectInspector<T extends Tuple> - Class in org.apache.crunch.types.orc
An object inspector to define the structure of Crunch Tuples
TupleObjectInspector(TupleFactory<T>, PType...) - Constructor for class org.apache.crunch.types.orc.TupleObjectInspector
 
tuples(PType...) - Static method in class org.apache.crunch.types.avro.Avros
 
tuples(Class<T>, PType...) - Static method in class org.apache.crunch.types.avro.Avros
 
tuples(PType<?>...) - Method in class org.apache.crunch.types.avro.AvroTypeFamily
 
tuples(Class<T>, PType<?>...) - Method in class org.apache.crunch.types.avro.AvroTypeFamily
 
tuples(PType...) - Static method in class org.apache.crunch.types.orc.Orcs
Create a tuple-based PType.
tuples(PType<?>...) - Method in interface org.apache.crunch.types.PTypeFamily
 
tuples(Class<T>, PType<?>...) - Method in interface org.apache.crunch.types.PTypeFamily
 
tuples(PType...) - Static method in class org.apache.crunch.types.writable.Writables
 
tuples(Class<T>, PType...) - Static method in class org.apache.crunch.types.writable.Writables
 
tuples(PType<?>...) - Method in class org.apache.crunch.types.writable.WritableTypeFamily
 
tuples(Class<T>, PType<?>...) - Method in class org.apache.crunch.types.writable.WritableTypeFamily
 
Tuples - Class in org.apache.crunch.util
Utilities for working with subclasses of the Tuple interface.
Tuples() - Constructor for class org.apache.crunch.util.Tuples
 
Tuples.PairIterable<S,T> - Class in org.apache.crunch.util
 
Tuples.QuadIterable<A,B,C,D> - Class in org.apache.crunch.util
 
Tuples.TripIterable<A,B,C> - Class in org.apache.crunch.util
 
Tuples.TupleNIterable - Class in org.apache.crunch.util
 
TupleWritable - Class in org.apache.crunch.types.writable
A serialization format for Tuple.
TupleWritable() - Constructor for class org.apache.crunch.types.writable.TupleWritable
Create an empty tuple with no allocated storage for writables.
TupleWritable(Writable[]) - Constructor for class org.apache.crunch.types.writable.TupleWritable
 
TupleWritable(Writable[], int[]) - Constructor for class org.apache.crunch.types.writable.TupleWritable
Initialize tuple with storage; unknown whether any of them contain "written" values.
TupleWritable.Comparator - Class in org.apache.crunch.types.writable
 
TupleWritableComparator() - Constructor for class org.apache.crunch.lib.join.JoinUtils.TupleWritableComparator
 
TupleWritableComparator - Class in org.apache.crunch.lib.sort
 
TupleWritableComparator() - Constructor for class org.apache.crunch.lib.sort.TupleWritableComparator
 
TupleWritablePartitioner() - Constructor for class org.apache.crunch.lib.join.JoinUtils.TupleWritablePartitioner
 
typedCollectionOf(PType<T>, T...) - Static method in class org.apache.crunch.impl.mem.MemPipeline
 
typedCollectionOf(PType<T>, Iterable<T>) - Static method in class org.apache.crunch.impl.mem.MemPipeline
 
typedTableOf(PTableType<S, T>, S, T, Object...) - Static method in class org.apache.crunch.impl.mem.MemPipeline
 
typedTableOf(PTableType<S, T>, Iterable<Pair<S, T>>) - Static method in class org.apache.crunch.impl.mem.MemPipeline
 

U

underlying() - Method in interface org.apache.crunch.lambda.LCollection
Get the underlying PCollection for this LCollection
underlying() - Method in interface org.apache.crunch.lambda.LGroupedTable
Get the underlying PGroupedTable for this LGroupedTable
underlying() - Method in interface org.apache.crunch.lambda.LTable
Get the underlying PTable for this LTable
ungroup() - Method in class org.apache.crunch.impl.dist.collect.BaseGroupedTable
 
ungroup() - Method in interface org.apache.crunch.lambda.LGroupedTable
Ungroup this LGroupedTable back into an LTable.
ungroup() - Method in interface org.apache.crunch.PGroupedTable
Convert this grouping back into a multimap.
union(PCollection<S>) - Method in class org.apache.crunch.impl.dist.collect.PCollectionImpl
 
union(PCollection<S>...) - Method in class org.apache.crunch.impl.dist.collect.PCollectionImpl
 
union(PTable<K, V>) - Method in class org.apache.crunch.impl.dist.collect.PTableBase
 
union(PTable<K, V>...) - Method in class org.apache.crunch.impl.dist.collect.PTableBase
 
union(List<PCollection<S>>) - Method in class org.apache.crunch.impl.dist.DistributedPipeline
 
union(List<PCollection<S>>) - Method in class org.apache.crunch.impl.mem.MemPipeline
 
union(LCollection<S>) - Method in interface org.apache.crunch.lambda.LCollection
Union this LCollection with another LCollection of the same type
union(PCollection<S>) - Method in interface org.apache.crunch.lambda.LCollection
Union this LCollection with a PCollection of the same type
union(LTable<K, V>) - Method in interface org.apache.crunch.lambda.LTable
Union this LTable with another LTable of the same type
union(PTable<K, V>) - Method in interface org.apache.crunch.lambda.LTable
Union this LTable with a PTable of the same type
union(PCollection<S>) - Method in interface org.apache.crunch.PCollection
Returns a PCollection instance that acts as the union of this PCollection and the given PCollection.
union(PCollection<S>...) - Method in interface org.apache.crunch.PCollection
Returns a PCollection instance that acts as the union of this PCollection and the input PCollections.
union(List<PCollection<S>>) - Method in interface org.apache.crunch.Pipeline
 
union(PTable<K, V>) - Method in interface org.apache.crunch.PTable
Returns a PTable instance that acts as the union of this PTable and the given PTable.
union(PTable<K, V>...) - Method in interface org.apache.crunch.PTable
Returns a PTable instance that acts as the union of this PTable and the input PTables.
Union - Class in org.apache.crunch
Allows us to represent the combination of multiple data sources that may contain different types of data as a single type with an index to indicate which of the original sources the current record was from.
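The idea of tagging each record with the index of its originating source can be sketched in plain Java as follows. The names TaggedUnionSketch, Tagged, and combine are hypothetical and are not part of the Crunch API; the sketch only mirrors the (int, Object) shape of the constructor listed in the next entry.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class TaggedUnionSketch {
    // A record paired with the index of the source it came from.
    static final class Tagged {
        final int index;
        final Object value;

        Tagged(int index, Object value) {
            this.index = index;
            this.value = value;
        }
    }

    // Merges several sources of possibly different element types into one list,
    // tagging each record with the position of its source.
    static List<Tagged> combine(List<? extends List<?>> sources) {
        List<Tagged> out = new ArrayList<>();
        for (int i = 0; i < sources.size(); i++) {
            for (Object record : sources.get(i)) {
                out.add(new Tagged(i, record));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<List<?>> sources = new ArrayList<>();
        sources.add(Arrays.asList("x", "y"));
        sources.add(Arrays.asList(1, 2));
        for (Tagged t : combine(sources)) {
            System.out.println(t.index + " -> " + t.value);
        }
    }
}
```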
Union(int, Object) - Constructor for class org.apache.crunch.Union
 
UnionCollection<S> - Class in org.apache.crunch.impl.spark.collect
 
UnionDeepCopier - Class in org.apache.crunch.types
 
UnionDeepCopier(PType...) - Constructor for class org.apache.crunch.types.UnionDeepCopier
 
unionOf(PType<?>...) - Static method in class org.apache.crunch.types.avro.Avros
 
unionOf(PType<?>...) - Method in class org.apache.crunch.types.avro.AvroTypeFamily
 
unionOf(PType<?>...) - Method in interface org.apache.crunch.types.PTypeFamily
 
unionOf(PType<?>...) - Static method in class org.apache.crunch.types.writable.Writables
 
unionOf(PType<?>...) - Method in class org.apache.crunch.types.writable.WritableTypeFamily
 
UnionReadableData<T> - Class in org.apache.crunch.util
 
UnionReadableData(List<ReadableData<T>>) - Constructor for class org.apache.crunch.util.UnionReadableData
 
UnionTable<K,V> - Class in org.apache.crunch.impl.spark.collect
 
unionTables(List<PTable<K, V>>) - Method in class org.apache.crunch.impl.dist.DistributedPipeline
 
unionTables(List<PTable<K, V>>) - Method in class org.apache.crunch.impl.mem.MemPipeline
 
unionTables(List<PTable<K, V>>) - Method in interface org.apache.crunch.Pipeline
 
UnionWritable - Class in org.apache.crunch.types.writable
 
UnionWritable() - Constructor for class org.apache.crunch.types.writable.UnionWritable
 
UnionWritable(int, BytesWritable) - Constructor for class org.apache.crunch.types.writable.UnionWritable
 
UNIQUE_ELEMENTS() - Static method in class org.apache.crunch.fn.Aggregators
Collect the unique elements of the input, as defined by the equals method for the input objects.
update(T) - Method in interface org.apache.crunch.Aggregator
Incorporate the given value into the aggregate state maintained by this instance.
update(V) - Method in class org.apache.crunch.lambda.LAggregator
 
useDisk(boolean) - Method in class org.apache.crunch.CachingOptions.Builder
 
useDisk() - Method in class org.apache.crunch.CachingOptions
Whether the framework may cache data on disk.
useMemory(boolean) - Method in class org.apache.crunch.CachingOptions.Builder
 
useMemory() - Method in class org.apache.crunch.CachingOptions
Whether the framework may cache data in memory without writing it to disk.
UTF8_TO_STRING - Static variable in class org.apache.crunch.types.avro.Avros
 
uuid(PTypeFamily) - Static method in class org.apache.crunch.types.PTypes
A PType for Java's UUID type.

V

value - Variable in class org.apache.crunch.impl.spark.ByteArray
 
valueClass - Variable in class org.apache.crunch.io.CrunchOutputs.OutputConfig
 
valueOf(String) - Static method in enum org.apache.crunch.impl.mr.MRJob.State
Returns the enum constant of this type with the specified name.
valueOf(String) - Static method in enum org.apache.crunch.lib.join.JoinType
Returns the enum constant of this type with the specified name.
valueOf(String) - Static method in enum org.apache.crunch.lib.Sort.Order
Returns the enum constant of this type with the specified name.
valueOf(String) - Static method in enum org.apache.crunch.PipelineCallable.Status
Returns the enum constant of this type with the specified name.
valueOf(String) - Static method in enum org.apache.crunch.PipelineExecution.Status
Returns the enum constant of this type with the specified name.
valueOf(String) - Static method in enum org.apache.crunch.Target.WriteMode
Returns the enum constant of this type with the specified name.
valueOf(String) - Static method in enum org.apache.crunch.types.avro.AvroMode.ModeType
Returns the enum constant of this type with the specified name.
valueOf(String) - Static method in enum org.apache.crunch.types.avro.AvroType.AvroRecordType
Returns the enum constant of this type with the specified name.
values() - Method in class org.apache.crunch.impl.dist.collect.PTableBase
 
values() - Static method in enum org.apache.crunch.impl.mr.MRJob.State
Returns an array containing the constants of this enum type, in the order they are declared.
values() - Method in interface org.apache.crunch.lambda.LTable
Get an LCollection containing just the values from this table
values() - Static method in enum org.apache.crunch.lib.join.JoinType
Returns an array containing the constants of this enum type, in the order they are declared.
values(PTable<K, V>) - Static method in class org.apache.crunch.lib.PTables
Extract the values from the given PTable<K, V> as a PCollection<V>.
values() - Static method in enum org.apache.crunch.lib.Sort.Order
Returns an array containing the constants of this enum type, in the order they are declared.
values() - Static method in enum org.apache.crunch.PipelineCallable.Status
Returns an array containing the constants of this enum type, in the order they are declared.
values() - Static method in enum org.apache.crunch.PipelineExecution.Status
Returns an array containing the constants of this enum type, in the order they are declared.
values() - Method in interface org.apache.crunch.PTable
Returns a PCollection made up of the values in this PTable.
values() - Static method in enum org.apache.crunch.Target.WriteMode
Returns an array containing the constants of this enum type, in the order they are declared.
values() - Static method in enum org.apache.crunch.types.avro.AvroMode.ModeType
Returns an array containing the constants of this enum type, in the order they are declared.
values() - Static method in enum org.apache.crunch.types.avro.AvroType.AvroRecordType
Returns an array containing the constants of this enum type, in the order they are declared.
valueType() - Method in interface org.apache.crunch.lambda.LGroupedTable
Get a PType which can be used to serialize the value part of this grouped table
valueType() - Method in interface org.apache.crunch.lambda.LTable
Get a PType which can be used to serialize the value part of this table
visitDoCollection(BaseDoCollection<?>) - Method in interface org.apache.crunch.impl.dist.collect.PCollectionImpl.Visitor
 
visitDoTable(BaseDoTable<?, ?>) - Method in interface org.apache.crunch.impl.dist.collect.PCollectionImpl.Visitor
 
visitGroupedTable(BaseGroupedTable<?, ?>) - Method in interface org.apache.crunch.impl.dist.collect.PCollectionImpl.Visitor
 
visitInputCollection(BaseInputCollection<?>) - Method in interface org.apache.crunch.impl.dist.collect.PCollectionImpl.Visitor
 
visitUnionCollection(BaseUnionCollection<?>) - Method in interface org.apache.crunch.impl.dist.collect.PCollectionImpl.Visitor
 

W

waitFor(long, TimeUnit) - Method in class org.apache.crunch.impl.spark.SparkRuntime
 
waitFor(long, TimeUnit) - Method in interface org.apache.crunch.PipelineExecution
Blocks until the pipeline completes or the specified waiting time elapses.
waitUntilDone() - Method in class org.apache.crunch.impl.spark.SparkRuntime
 
waitUntilDone() - Method in interface org.apache.crunch.PipelineExecution
Blocks until pipeline completes, i.e.
wasLogged() - Method in exception org.apache.crunch.CrunchRuntimeException
Returns true if this exception was written to the debug logs.
weightedReservoirSample(PCollection<Pair<T, N>>, int) - Static method in class org.apache.crunch.lib.Sample
Selects a weighted sample of the elements of the given PCollection, where the second term in the input Pair is a numerical weight.
weightedReservoirSample(PCollection<Pair<T, N>>, int, Long) - Static method in class org.apache.crunch.lib.Sample
The weighted reservoir sampling function with the seed term exposed for testing purposes.
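One common way to draw a weighted sample with a fixed seed is to give each element the score log(u) / weight, where u is uniform on [0, 1), and keep the k highest scores. The plain-Java sketch below shows that technique on in-memory lists. It is not presented as the implementation used by Sample; the class name WeightedReservoirSketch, the parallel weights list, and the rule of skipping non-positive weights are assumptions for illustration.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;
import java.util.Random;

final class WeightedReservoirSketch {
    // An item together with its random score.
    static final class Scored<T> {
        final T item;
        final double score;

        Scored(T item, double score) {
            this.item = item;
            this.score = score;
        }
    }

    // Returns up to k items; higher weights make selection more likely.
    // The seed makes the result reproducible, which helps in tests.
    static <T> List<T> sample(List<T> items, List<Double> weights, int k, long seed) {
        Random random = new Random(seed);
        PriorityQueue<Scored<T>> heap =
            new PriorityQueue<>(Comparator.comparingDouble((Scored<T> s) -> s.score));
        for (int i = 0; i < items.size(); i++) {
            double weight = weights.get(i);
            if (weight <= 0) {
                continue; // non-positive weights are never sampled in this sketch
            }
            double score = Math.log(random.nextDouble()) / weight;
            heap.add(new Scored<>(items.get(i), score));
            if (heap.size() > k) {
                heap.poll(); // evict the lowest score
            }
        }
        List<T> out = new ArrayList<>();
        for (Scored<T> s : heap) {
            out.add(s.item);
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> items = List.of("a", "b", "c", "d");
        List<Double> weights = List.of(1.0, 0.0, 2.0, 3.0);
        System.out.println(sample(items, weights, 2, 42L));
    }
}
```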
withFactory(ReaderWriterFactory) - Method in class org.apache.crunch.types.avro.AvroMode
Creates a new AvroMode instance which will utilize the factory instance for creating Avro readers and writers.
withFactoryFromConfiguration(Configuration) - Method in class org.apache.crunch.types.avro.AvroMode
 
WordAggregationHBase - Class in org.apache.crunch.examples
You need to have an HBase instance running.
WordAggregationHBase() - Constructor for class org.apache.crunch.examples.WordAggregationHBase
 
WordCount - Class in org.apache.crunch.examples
 
WordCount() - Constructor for class org.apache.crunch.examples.WordCount
 
wrap(Function<T, R>) - Static method in class org.apache.crunch.fn.SFunctions
 
wrap(Function2<K, V, R>) - Static method in class org.apache.crunch.fn.SFunctions
 
wrap(PairFunction<T, K, V>) - Static method in class org.apache.crunch.fn.SFunctions
 
wrap(FlatMapFunction<T, R>) - Static method in class org.apache.crunch.fn.SFunctions
 
wrap(FlatMapFunction2<K, V, R>) - Static method in class org.apache.crunch.fn.SFunctions
 
wrap(DoubleFunction<T>) - Static method in class org.apache.crunch.fn.SFunctions
 
wrap(DoubleFlatMapFunction<T>) - Static method in class org.apache.crunch.fn.SFunctions
 
wrap(PCollection<S>) - Static method in class org.apache.crunch.lambda.Lambda
 
wrap(PTable<K, V>) - Static method in class org.apache.crunch.lambda.Lambda
 
wrap(PGroupedTable<K, V>) - Static method in class org.apache.crunch.lambda.Lambda
 
wrap(PCollection<S>) - Method in interface org.apache.crunch.lambda.LCollectionFactory
Wrap a PCollection into an LCollection
wrap(PTable<K, V>) - Method in interface org.apache.crunch.lambda.LCollectionFactory
Wrap a PTable into an LTable
wrap(PGroupedTable<K, V>) - Method in interface org.apache.crunch.lambda.LCollectionFactory
Wrap a PGroupedTable into an LGroupedTable
WritableDeepCopier<T extends org.apache.hadoop.io.Writable> - Class in org.apache.crunch.types.writable
Performs deep copies of Writable values.
WritableDeepCopier(Class<T>) - Constructor for class org.apache.crunch.types.writable.WritableDeepCopier
 
WRITABLES - Static variable in class org.apache.crunch.impl.spark.ByteArrayHelper
 
writables(Class<T>) - Static method in class org.apache.crunch.types.avro.Avros
 
Writables - Class in org.apache.crunch.types.writable
Defines static methods that are analogous to the methods defined in WritableTypeFamily for convenient static importing.
writables(Class<W>) - Static method in class org.apache.crunch.types.writable.Writables
 
writables(Class<W>) - Method in class org.apache.crunch.types.writable.WritableTypeFamily
 
WritableSerDe - Class in org.apache.crunch.impl.spark.serde
 
WritableSerDe(Class<? extends Writable>) - Constructor for class org.apache.crunch.impl.spark.serde.WritableSerDe
 
WritableType<T,W extends org.apache.hadoop.io.Writable> - Class in org.apache.crunch.types.writable
 
WritableType(Class<T>, Class<W>, MapFn<W, T>, MapFn<T, W>, PType...) - Constructor for class org.apache.crunch.types.writable.WritableType
 
WritableTypeFamily - Class in org.apache.crunch.types.writable
The Writable-based implementation of the PTypeFamily interface.
write(DataOutput) - Method in class org.apache.crunch.contrib.io.jdbc.IdentifiableName
 
write(PreparedStatement) - Method in class org.apache.crunch.contrib.io.jdbc.IdentifiableName
 
write(Target) - Method in class org.apache.crunch.impl.dist.collect.PCollectionImpl
 
write(Target, Target.WriteMode) - Method in class org.apache.crunch.impl.dist.collect.PCollectionImpl
 
write(Target) - Method in class org.apache.crunch.impl.dist.collect.PTableBase
 
write(Target, Target.WriteMode) - Method in class org.apache.crunch.impl.dist.collect.PTableBase
 
write(PCollection<?>, Target) - Method in class org.apache.crunch.impl.dist.DistributedPipeline
 
write(PCollection<?>, Target, Target.WriteMode) - Method in class org.apache.crunch.impl.dist.DistributedPipeline
 
write(PCollection<?>, Target) - Method in class org.apache.crunch.impl.mem.MemPipeline
 
write(PCollection<?>, Target, Target.WriteMode) - Method in class org.apache.crunch.impl.mem.MemPipeline
 
write(String, K, V) - Method in class org.apache.crunch.io.CrunchOutputs
 
write(DataOutput) - Method in class org.apache.crunch.io.FormatBundle
 
write(DataOutput) - Method in class org.apache.crunch.kafka.inputformat.KafkaInputSplit
 
write(Map<TopicPartition, Long>) - Method in class org.apache.crunch.kafka.offset.AbstractOffsetWriter
 
write(long, Map<TopicPartition, Long>) - Method in class org.apache.crunch.kafka.offset.hdfs.HDFSOffsetWriter
 
write(Map<TopicPartition, Long>) - Method in interface org.apache.crunch.kafka.offset.OffsetWriter
Persists the offsets to a configured location with the current time specified as the as-of time.
write(long, Map<TopicPartition, Long>) - Method in interface org.apache.crunch.kafka.offset.OffsetWriter
Persists the offsets to a configured location with metadata of asOfTime indicating the time in milliseconds when the offsets were meaningful.
write(Target) - Method in interface org.apache.crunch.lambda.LCollection
Write this collection to the specified Target
write(Target, Target.WriteMode) - Method in interface org.apache.crunch.lambda.LCollection
Write this collection to the specified Target with the given Target.WriteMode
write(Target) - Method in interface org.apache.crunch.lambda.LTable
Write this table to the Target supplied.
write(Target, Target.WriteMode) - Method in interface org.apache.crunch.lambda.LTable
Write this table to the Target supplied.
write(Target) - Method in interface org.apache.crunch.PCollection
Write the contents of this PCollection to the given Target, using the storage format specified by the target.
write(Target, Target.WriteMode) - Method in interface org.apache.crunch.PCollection
Write the contents of this PCollection to the given Target, using the given Target.WriteMode to handle existing targets.
write(PCollection<?>, Target) - Method in interface org.apache.crunch.Pipeline
Write the given collection to the given target on the next pipeline run.
write(PCollection<?>, Target, Target.WriteMode) - Method in interface org.apache.crunch.Pipeline
Write the contents of the PCollection to the given Target, using the storage format specified by the target and the given WriteMode for cases where the referenced Target already exists.
write(Target) - Method in interface org.apache.crunch.PTable
Writes this PTable to the given Target.
write(Target, Target.WriteMode) - Method in interface org.apache.crunch.PTable
Writes this PTable to the given Target, using the given Target.WriteMode to handle existing targets.
write(DataOutput) - Method in class org.apache.crunch.types.writable.TupleWritable
Writes each Writable to out.
write(DataOutput) - Method in class org.apache.crunch.types.writable.UnionWritable
 
write(PCollection<?>, Target) - Method in class org.apache.crunch.util.CrunchTool
 
write(Configuration, Path, Object) - Static method in class org.apache.crunch.util.DistCache
 
writeConnectionPropertiesToBundle(Properties, FormatBundle) - Static method in class org.apache.crunch.kafka.inputformat.KafkaInputFormat
Writes the Kafka connection properties to the bundle.
writeOffsetsToBundle(Map<TopicPartition, Pair<Long, Long>>, FormatBundle) - Static method in class org.apache.crunch.kafka.inputformat.KafkaInputFormat
Writes the start and end offsets for the provided topic partitions to the bundle.
writeOffsetsToConfiguration(Map<TopicPartition, Pair<Long, Long>>, Configuration) - Static method in class org.apache.crunch.kafka.inputformat.KafkaInputFormat
Writes the start and end offsets for the provided topic partitions to the config.
writeTextFile(PCollection<T>, String) - Method in class org.apache.crunch.impl.dist.DistributedPipeline
 
writeTextFile(PCollection<T>, String) - Method in class org.apache.crunch.impl.mem.MemPipeline
 
writeTextFile(PCollection<T>, String) - Method in interface org.apache.crunch.Pipeline
A convenience method for writing a text file.
writeTextFile(PCollection<?>, String) - Method in class org.apache.crunch.util.CrunchTool
 

X

xboolean() - Static method in class org.apache.crunch.contrib.text.Extractors
Returns an Extractor for booleans.
xboolean(Boolean) - Static method in class org.apache.crunch.contrib.text.Extractors
 
xcollect(TokenizerFactory, Extractor<T>) - Static method in class org.apache.crunch.contrib.text.Extractors
 
xcustom(Class<T>, TokenizerFactory, Extractor...) - Static method in class org.apache.crunch.contrib.text.Extractors
Returns an Extractor for a subclass of Tuple whose constructor takes the given extractor types, using the given TokenizerFactory to parse the sub-fields.
xdouble() - Static method in class org.apache.crunch.contrib.text.Extractors
Returns an Extractor for doubles.
xdouble(Double) - Static method in class org.apache.crunch.contrib.text.Extractors
 
xfloat() - Static method in class org.apache.crunch.contrib.text.Extractors
Returns an Extractor for floats.
xfloat(Float) - Static method in class org.apache.crunch.contrib.text.Extractors
 
xint() - Static method in class org.apache.crunch.contrib.text.Extractors
Returns an Extractor for integers.
xint(Integer) - Static method in class org.apache.crunch.contrib.text.Extractors
Returns an Extractor for integers.
xlong() - Static method in class org.apache.crunch.contrib.text.Extractors
Returns an Extractor for longs.
xlong(Long) - Static method in class org.apache.crunch.contrib.text.Extractors
Returns an Extractor for longs.
xpair(TokenizerFactory, Extractor<K>, Extractor<V>) - Static method in class org.apache.crunch.contrib.text.Extractors
Returns an Extractor for pairs of the given types that uses the given TokenizerFactory for parsing the sub-fields.
xquad(TokenizerFactory, Extractor<A>, Extractor<B>, Extractor<C>, Extractor<D>) - Static method in class org.apache.crunch.contrib.text.Extractors
Returns an Extractor for quads of the given types that uses the given TokenizerFactory for parsing the sub-fields.
xstring() - Static method in class org.apache.crunch.contrib.text.Extractors
Returns an Extractor for strings.
xstring(String) - Static method in class org.apache.crunch.contrib.text.Extractors
 
xtriple(TokenizerFactory, Extractor<A>, Extractor<B>, Extractor<C>) - Static method in class org.apache.crunch.contrib.text.Extractors
Returns an Extractor for triples of the given types that uses the given TokenizerFactory for parsing the sub-fields.
xtupleN(TokenizerFactory, Extractor...) - Static method in class org.apache.crunch.contrib.text.Extractors
Returns an Extractor for an arbitrary number of types that uses the given TokenizerFactory for parsing the sub-fields.

Z

zero(Map<String, Map<String, Long>>) - Method in class org.apache.crunch.impl.spark.CounterAccumulatorParam
 

Copyright © 2017 The Apache Software Foundation. All rights reserved.