Index (Apache Crunch 0.15.0 API)

A

AbstractCompositeExtractor<T> - Class in org.apache.crunch.contrib.text: Base class for Extractor instances that delegate the parsing of fields to other Extractor instances, primarily used for constructing composite records that implement the Tuple interface.
AbstractCompositeExtractor(TokenizerFactory, List<Extractor<?>>) - Constructor for class org.apache.crunch.contrib.text.AbstractCompositeExtractor
AbstractOffsetReader - Class in org.apache.crunch.kafka.offset: Base implementation of OffsetReader.
AbstractOffsetReader() - Constructor for class org.apache.crunch.kafka.offset.AbstractOffsetReader
AbstractOffsetWriter - Class in org.apache.crunch.kafka.offset: Base implementation of OffsetWriter.
AbstractOffsetWriter() - Constructor for class org.apache.crunch.kafka.offset.AbstractOffsetWriter
AbstractSimpleExtractor<T> - Class in org.apache.crunch.contrib.text: Base class for the common case of Extractor instances that construct a single object from a block of text stored in a String, with support for error handling and reporting.
accept(T) - Method in class org.apache.crunch.FilterFn: If true, emit the given record.
accept(PCollectionImpl.Visitor) - Method in class org.apache.crunch.impl.dist.collect.PCollectionImpl
accept(OutputHandler, PType<?>) - Method in interface org.apache.crunch.Target: Checks to see if this Target instance is compatible with the given PType.
ACCEPT_ALL() - Static method in class org.apache.crunch.fn.FilterFns: Accept everything.
addAccumulator(Map<String, Map<String, Long>>, Map<String, Map<String, Long>>) - Method in class org.apache.crunch.impl.spark.CounterAccumulatorParam
addCacheFile(Path, Configuration) - Static method in class org.apache.crunch.util.DistCache
addCompletionHook(CrunchControlledJob.Hook) - Method in class org.apache.crunch.impl.mr.MRPipeline
addInPlace(Map<String, Map<String, Long>>, Map<String, Map<String, Long>>) - Method in class org.apache.crunch.impl.spark.CounterAccumulatorParam
addInputPath(Job, Path, FormatBundle, int) - Static method in class org.apache.crunch.io.CrunchInputs
addInputPaths(Job, Collection<Path>, FormatBundle, int) - Static method in class org.apache.crunch.io.CrunchInputs
addJarDirToDistributedCache(Configuration, File) - Static method in class org.apache.crunch.util.DistCache: Adds all jars under the specified directory to the distributed cache of jobs using the provided configuration.
addJarDirToDistributedCache(Configuration, String) - Static method in class org.apache.crunch.util.DistCache: Adds all jars under the directory at the specified path to the distributed cache of jobs using the provided configuration.
addJarToDistributedCache(Configuration, File) - Static method in class org.apache.crunch.util.DistCache: Adds the specified jar to the distributed cache of jobs using the provided configuration.
addJarToDistributedCache(Configuration, String) - Static method in class org.apache.crunch.util.DistCache: Adds the jar at the specified path to the distributed cache of jobs using the provided configuration.
addKafkaConnectionProperties(Properties, Configuration) - Static method in class org.apache.crunch.kafka.KafkaUtils: Adds the properties to the provided config instance.
addNamedOutput(Job, String, Class<? extends OutputFormat>, Class, Class) - Static method in class org.apache.crunch.io.CrunchOutputs
addNamedOutput(Job, String, FormatBundle<? extends OutputFormat>, Class, Class) - Static method in class org.apache.crunch.io.CrunchOutputs
addPrepareHook(CrunchControlledJob.Hook) - Method in class org.apache.crunch.impl.mr.MRPipeline
age - Variable in class org.apache.crunch.test.Person: Deprecated.
aggregate(Aggregator<S>) - Method in class org.apache.crunch.impl.dist.collect.PCollectionImpl
Aggregate - Class in org.apache.crunch.lib: Methods for performing various types of aggregations over PCollection instances.
Aggregate() - Constructor for class org.apache.crunch.lib.Aggregate
aggregate(PCollection<S>, Aggregator<S>) - Static method in class org.apache.crunch.lib.Aggregate
aggregate(Aggregator<S>) - Method in interface org.apache.crunch.PCollection: Returns a PCollection that contains the result of aggregating all values in this instance.
Aggregate.PairValueComparator<K,V> - Class in org.apache.crunch.lib
Aggregate.TopKCombineFn<K,V> - Class in org.apache.crunch.lib
Aggregate.TopKFn<K,V> - Class in org.apache.crunch.lib
Aggregator<T> - Interface in org.apache.crunch: Aggregate a sequence of values into a possibly smaller sequence of the same type.
Aggregators - Class in org.apache.crunch.fn: A collection of pre-defined Aggregators.
Aggregators.SimpleAggregator<T> - Class in org.apache.crunch.fn: Base class for aggregators that do not require any initialization.
and(FilterFn<S>, FilterFn<S>) - Static method in class org.apache.crunch.fn.FilterFns: Accept an entry if all of the given filters accept it, using short-circuit evaluation.
and(FilterFn<S>...) - Static method in class org.apache.crunch.fn.FilterFns: Accept an entry if all of the given filters accept it, using short-circuit evaluation.
apply(Statement, Description) - Method in class org.apache.crunch.test.TemporaryPath
applyPTypeTransforms() - Method in interface org.apache.crunch.types.Converter: If true, convert the inputs or outputs from this Converter instance before (for outputs) or after (for inputs) using the associated PType#getInputMapFn and PType#getOutputMapFn calls.
as(PType<T>) - Method in class org.apache.crunch.types.avro.AvroTypeFamily
as(PType<T>) - Method in interface org.apache.crunch.types.PTypeFamily: Returns the equivalent of the given ptype for this family, if it exists.
as(PType<T>) - Method in class org.apache.crunch.types.writable.WritableTypeFamily
asCollection() - Method in class org.apache.crunch.impl.dist.collect.PCollectionImpl
asCollection() - Method in interface org.apache.crunch.PCollection
asMap() - Method in class org.apache.crunch.impl.dist.collect.PTableBase: Returns a PObject encapsulating a Map made up of the keys and values in this PTable.
asMap() - Method in interface org.apache.crunch.PTable: Returns a PObject encapsulating a Map made up of the keys and values in this PTable.
asPTable(PCollection<Pair<K, V>>) - Static method in class org.apache.crunch.lib.PTables: Convert the given PCollection<Pair<K, V>> to a PTable<K, V>.
asReadable(boolean) - Method in class org.apache.crunch.impl.dist.collect.PCollectionImpl
asReadable() - Method in interface org.apache.crunch.io.ReadableSource
asReadable() - Method in class org.apache.crunch.kafka.KafkaSource
asReadable(boolean) - Method in interface org.apache.crunch.PCollection
asSourceTarget(PType<T>) - Method in interface org.apache.crunch.Target: Attempt to create the SourceTarget type that corresponds to this Target for the given PType, if possible.
At - Class in org.apache.crunch.io: Static factory methods for creating common SourceTarget types, which may be treated as both a Source and a Target.
At() - Constructor for class org.apache.crunch.io.At
Average - Class in org.apache.crunch.lib
Average() - Constructor for class org.apache.crunch.lib.Average
AverageBytesByIP - Class in org.apache.crunch.examples
AverageBytesByIP() - Constructor for class org.apache.crunch.examples.AverageBytesByIP
AVRO_MODE_PROPERTY - Static variable in class org.apache.crunch.types.avro.AvroMode
AVRO_SHUFFLE_MODE_PROPERTY - Static variable in class org.apache.crunch.types.avro.AvroMode
AvroDerivedValueDeepCopier<T,S> - Class in org.apache.crunch.types.avro: A DeepCopier specific to Avro derived types.
AvroDerivedValueDeepCopier(MapFn<T, S>, MapFn<S, T>, AvroType<S>) - Constructor for class org.apache.crunch.types.avro.AvroDerivedValueDeepCopier
avroFile(String, Class<T>) - Static method in class org.apache.crunch.io.At: Creates a SourceTarget<T> instance from the Avro file(s) at the given path name.
avroFile(Path, Class<T>) - Static method in class org.apache.crunch.io.At: Creates a SourceTarget<T> instance from the Avro file(s) at the given Path.
avroFile(String) - Static method in class org.apache.crunch.io.At: Creates a SourceTarget<GenericData.Record> by reading the schema of the Avro file at the given path.
avroFile(Path) - Static method in class org.apache.crunch.io.At: Creates a SourceTarget<GenericData.Record> by reading the schema of the Avro file at the given path.
avroFile(Path, Configuration) - Static method in class org.apache.crunch.io.At: Creates a SourceTarget<GenericData.Record> by reading the schema of the Avro file at the given path using the FileSystem information contained in the given Configuration instance.
avroFile(String, PType<T>) - Static method in class org.apache.crunch.io.At: Creates a SourceTarget<T> instance from the Avro file(s) at the given path name.
avroFile(Path, PType<T>) - Static method in class org.apache.crunch.io.At: Creates a SourceTarget<T> instance from the Avro file(s) at the given Path.
avroFile(String, Class<T>) - Static method in class org.apache.crunch.io.From: Creates a Source<T> instance from the Avro file(s) at the given path name.
avroFile(Path, Class<T>) - Static method in class org.apache.crunch.io.From: Creates a Source<T> instance from the Avro file(s) at the given Path.
avroFile(List<Path>, Class<T>) - Static method in class org.apache.crunch.io.From: Creates a Source<T> instance from the Avro file(s) at the given Paths.
avroFile(String, PType<T>) - Static method in class org.apache.crunch.io.From: Creates a Source<T> instance from the Avro file(s) at the given path name.
avroFile(Path, PType<T>) - Static method in class org.apache.crunch.io.From: Creates a Source<T> instance from the Avro file(s) at the given Path.
avroFile(List<Path>, PType<T>) - Static method in class org.apache.crunch.io.From: Creates a Source<T> instance from the Avro file(s) at the given Paths.
avroFile(String) - Static method in class org.apache.crunch.io.From: Creates a Source<GenericData.Record> by reading the schema of the Avro file at the given path.
avroFile(Path) - Static method in class org.apache.crunch.io.From: Creates a Source<GenericData.Record> by reading the schema of the Avro file at the given path.
avroFile(List<Path>) - Static method in class org.apache.crunch.io.From: Creates a Source<GenericData.Record> by reading the schema of the Avro file at the given paths.
avroFile(Path, Configuration) - Static method in class org.apache.crunch.io.From: Creates a Source<GenericData.Record> by reading the schema of the Avro file at the given path using the FileSystem information contained in the given Configuration instance.
avroFile(List<Path>, Configuration) - Static method in class org.apache.crunch.io.From: Creates a Source<GenericData.Record> by reading the schema of the Avro file at the given paths using the FileSystem information contained in the given Configuration instance.
avroFile(String) - Static method in class org.apache.crunch.io.To: Creates a Target at the given path name that writes data to Avro files.
avroFile(Path) - Static method in class org.apache.crunch.io.To: Creates a Target at the given Path that writes data to Avro files.
AvroGenericFn(int[], Schema) - Constructor for class org.apache.crunch.lib.sort.SortFns.AvroGenericFn
AvroIndexedRecordPartitioner() - Constructor for class org.apache.crunch.lib.join.JoinUtils.AvroIndexedRecordPartitioner
AvroInputFormat<T> - Class in org.apache.crunch.types.avro: An InputFormat for Avro data files.
AvroInputFormat() - Constructor for class org.apache.crunch.types.avro.AvroInputFormat
AvroMode - Class in org.apache.crunch.types.avro: AvroMode is an immutable object used for configuring the reading and writing of Avro types.
AvroMode.ModeType - Enum in org.apache.crunch.types.avro: Internal enum which represents the various Avro data types.
AvroOutputFormat<T> - Class in org.apache.crunch.types.avro: An OutputFormat for Avro data files.
AvroOutputFormat() - Constructor for class org.apache.crunch.types.avro.AvroOutputFormat
AvroPairGroupingComparator() - Constructor for class org.apache.crunch.lib.join.JoinUtils.AvroPairGroupingComparator
AvroPathPerKeyOutputFormat<T> - Class in org.apache.crunch.types.avro: A FileOutputFormat that takes in a Utf8 and an Avro record and writes the Avro records to a sub-directory of the output path whose name is equal to the string-form of the Utf8.
AvroPathPerKeyOutputFormat() - Constructor for class org.apache.crunch.types.avro.AvroPathPerKeyOutputFormat
Avros - Class in org.apache.crunch.types.avro: Defines static methods that are analogous to the methods defined in AvroTypeFamily for convenient static importing.
AvroSerDe<T> - Class in org.apache.crunch.impl.spark.serde
AvroSerDe(AvroType<T>, Map<String, String>) - Constructor for class org.apache.crunch.impl.spark.serde.AvroSerDe
avroTableFile(Path, PTableType<K, V>) - Static method in class org.apache.crunch.io.From: Creates a TableSource<K,V> for reading an Avro key/value file at the given path.
avroTableFile(List<Path>, PTableType<K, V>) - Static method in class org.apache.crunch.io.From: Creates a TableSource<K,V> for reading an Avro key/value file at the given paths.
AvroTextOutputFormat<K,V> - Class in org.apache.crunch.types.avro
AvroTextOutputFormat() - Constructor for class org.apache.crunch.types.avro.AvroTextOutputFormat
AvroType<T> - Class in org.apache.crunch.types.avro: The implementation of the PType interface for Avro-based serialization.
AvroType(Class<T>, Schema, DeepCopier<T>, PType...) - Constructor for class org.apache.crunch.types.avro.AvroType
AvroType(Class<T>, Schema, MapFn, MapFn, DeepCopier<T>, AvroType.AvroRecordType, PType...) - Constructor for class org.apache.crunch.types.avro.AvroType
AvroType.AvroRecordType - Enum in org.apache.crunch.types.avro
AvroTypeFamily - Class in org.apache.crunch.types.avro
AvroUtf8InputFormat - Class in org.apache.crunch.types.avro: An InputFormat for text files.
AvroUtf8InputFormat() - Constructor for class org.apache.crunch.types.avro.AvroUtf8InputFormat
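
The and(...) entries for org.apache.crunch.fn.FilterFns above describe accepting an entry only if all of the given filters accept it, using short-circuit evaluation. The following is a minimal plain-Java sketch of that idea, not Crunch code: the class ShortCircuitAndSketch and its and(...) helper are hypothetical names, and java.util.function.Predicate stands in for FilterFn purely for illustration.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Predicate;

/** Plain-Java model of an "accept only if every filter accepts" combinator. Not Crunch code. */
final class ShortCircuitAndSketch {

    /** Returns a predicate that accepts a value only if all given filters accept it. */
    @SafeVarargs
    static <T> Predicate<T> and(Predicate<T>... filters) {
        return value -> {
            for (Predicate<T> filter : filters) {
                if (!filter.test(value)) {
                    // Short circuit: stop at the first rejection, later filters are not evaluated.
                    return false;
                }
            }
            return true;
        };
    }

    public static void main(String[] args) {
        AtomicInteger secondFilterCalls = new AtomicInteger();
        Predicate<String> nonEmpty = s -> !s.isEmpty();
        Predicate<String> startsWithA = s -> {
            secondFilterCalls.incrementAndGet(); // count how often the second filter runs
            return s.startsWith("a");
        };
        Predicate<String> both = and(nonEmpty, startsWithA);

        System.out.println(both.test("avro"));        // true: both filters accept
        System.out.println(both.test(""));            // false: rejected by the first filter
        System.out.println(secondFilterCalls.get());  // 1: the second filter ran only for "avro"
    }
}
```

The evaluation counter shows why short-circuiting matters when later filters are expensive: a record rejected by an early filter never reaches them.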

B

BaseDoCollection<S> - Class in org.apache.crunch.impl.dist.collect
BaseDoTable<K,V> - Class in org.apache.crunch.impl.dist.collect
BaseGroupedTable<K,V> - Class in org.apache.crunch.impl.dist.collect
BaseInputCollection<S> - Class in org.apache.crunch.impl.dist.collect
BaseInputCollection(Source<S>, DistributedPipeline) - Constructor for class org.apache.crunch.impl.dist.collect.BaseInputCollection
BaseInputCollection(Source<S>, String, DistributedPipeline, ParallelDoOptions) - Constructor for class org.apache.crunch.impl.dist.collect.BaseInputCollection
BaseInputTable<K,V> - Class in org.apache.crunch.impl.dist.collect
BaseInputTable(TableSource<K, V>, DistributedPipeline) - Constructor for class org.apache.crunch.impl.dist.collect.BaseInputTable
BaseInputTable(TableSource<K, V>, String, DistributedPipeline, ParallelDoOptions) - Constructor for class org.apache.crunch.impl.dist.collect.BaseInputTable
BaseUnionCollection<S> - Class in org.apache.crunch.impl.dist.collect
BaseUnionTable<K,V> - Class in org.apache.crunch.impl.dist.collect
bigDecimal(PTypeFamily) - Static method in class org.apache.crunch.types.PTypes: A PType for Java's BigDecimal type.
BIGDECIMAL_TO_BYTE - Static variable in class org.apache.crunch.types.PTypes
bigInt(PTypeFamily) - Static method in class org.apache.crunch.types.PTypes: A PType for Java's BigInteger type.
BIGINT_TO_BYTE - Static variable in class org.apache.crunch.types.PTypes
BinarySearchNode(K[], RawComparator<K>) - Constructor for class org.apache.crunch.lib.sort.TotalOrderPartitioner.BinarySearchNode
BloomFilterFactory - Class in org.apache.crunch.contrib.bloomfilter: Factory Class for creating BloomFilters.
BloomFilterFactory() - Constructor for class org.apache.crunch.contrib.bloomfilter.BloomFilterFactory
BloomFilterFn<S> - Class in org.apache.crunch.contrib.bloomfilter: The class is responsible for generating keys that are used in a BloomFilter.
BloomFilterFn() - Constructor for class org.apache.crunch.contrib.bloomfilter.BloomFilterFn
BloomFilterJoinStrategy<K,U,V> - Class in org.apache.crunch.lib.join: Join strategy that uses a Bloom filter that is trained on the keys of the left-side table to filter the key/value pairs of the right-side table before sending through the shuffle and reduce phase.
BloomFilterJoinStrategy(int) - Constructor for class org.apache.crunch.lib.join.BloomFilterJoinStrategy: Instantiate with the expected number of unique keys in the left table.
BloomFilterJoinStrategy(int, float) - Constructor for class org.apache.crunch.lib.join.BloomFilterJoinStrategy: Instantiate with the expected number of unique keys in the left table, and the acceptable false positive rate for the Bloom filter.
BloomFilterJoinStrategy(int, float, JoinStrategy<K, U, V>) - Constructor for class org.apache.crunch.lib.join.BloomFilterJoinStrategy: Instantiate with the expected number of unique keys in the left table, the acceptable false positive rate for the Bloom filter, and an underlying join strategy to delegate to.
booleans() - Static method in class org.apache.crunch.types.avro.Avros
booleans() - Method in class org.apache.crunch.types.avro.AvroTypeFamily
booleans() - Method in interface org.apache.crunch.types.PTypeFamily
booleans() - Static method in class org.apache.crunch.types.writable.Writables
booleans() - Method in class org.apache.crunch.types.writable.WritableTypeFamily
bottom(int) - Method in class org.apache.crunch.impl.dist.collect.PTableBase
bottom(int) - Method in interface org.apache.crunch.PTable: Returns a PTable made up of the pairs in this PTable with the smallest value field.
build() - Method in class org.apache.crunch.CachingOptions.Builder
build() - Method in class org.apache.crunch.contrib.text.TokenizerFactory.Builder: Returns a new TokenizerFactory with settings determined by this Builder instance.
build() - Method in class org.apache.crunch.GroupingOptions.Builder
build() - Method in class org.apache.crunch.kafka.offset.hdfs.Offsets.Builder: Builds an instance.
build() - Method in class org.apache.crunch.kafka.offset.hdfs.Offsets.PartitionOffset.Builder: Builds a PartitionOffset instance.
build() - Method in class org.apache.crunch.ParallelDoOptions.Builder
build() - Method in class org.apache.crunch.test.Employee.Builder
build() - Method in class org.apache.crunch.test.Person.Builder
builder() - Static method in class org.apache.crunch.CachingOptions: Creates a new CachingOptions.Builder instance to use for specifying the caching options for a particular PCollection<T>.
Builder() - Constructor for class org.apache.crunch.CachingOptions.Builder
Builder(Class<T>) - Constructor for class org.apache.crunch.contrib.io.jdbc.DataBaseSource.Builder
builder() - Static method in class org.apache.crunch.contrib.text.TokenizerFactory: Factory method for creating a TokenizerFactory.Builder instance.
Builder() - Constructor for class org.apache.crunch.contrib.text.TokenizerFactory.Builder
builder() - Static method in class org.apache.crunch.GroupingOptions
Builder() - Constructor for class org.apache.crunch.GroupingOptions.Builder
Builder() - Constructor for class org.apache.crunch.kafka.offset.hdfs.Offsets.Builder
Builder() - Constructor for class org.apache.crunch.kafka.offset.hdfs.Offsets.PartitionOffset.Builder
builder() - Static method in class org.apache.crunch.ParallelDoOptions
Builder() - Constructor for class org.apache.crunch.ParallelDoOptions.Builder
bundle - Variable in class org.apache.crunch.io.CrunchOutputs.OutputConfig
by(MapFn<S, K>, PType<K>) - Method in class org.apache.crunch.impl.dist.collect.PCollectionImpl
by(String, MapFn<S, K>, PType<K>) - Method in class org.apache.crunch.impl.dist.collect.PCollectionImpl
by(SFunction<S, K>, PType<K>) - Method in interface org.apache.crunch.lambda.LCollection: Key this LCollection by a key extracted from the element to yield a LTable mapping the key to the whole element.
by(int, Sort.Order) - Static method in class org.apache.crunch.lib.Sort.ColumnOrder
by(MapFn<S, K>, PType<K>) - Method in interface org.apache.crunch.PCollection: Apply the given map function to each element of this instance in order to create a PTable.
by(String, MapFn<S, K>, PType<K>) - Method in interface org.apache.crunch.PCollection: Apply the given map function to each element of this instance in order to create a PTable.
BYTE_TO_BIGDECIMAL - Static variable in class org.apache.crunch.types.PTypes
BYTE_TO_BIGINT - Static variable in class org.apache.crunch.types.PTypes
ByteArray - Class in org.apache.crunch.impl.spark
ByteArray(byte[], ByteArrayHelper) - Constructor for class org.apache.crunch.impl.spark.ByteArray
ByteArrayHelper - Class in org.apache.crunch.impl.spark
ByteArrayHelper() - Constructor for class org.apache.crunch.impl.spark.ByteArrayHelper
bytes() - Static method in class org.apache.crunch.types.avro.Avros
bytes() - Method in class org.apache.crunch.types.avro.AvroTypeFamily
bytes() - Method in interface org.apache.crunch.types.PTypeFamily
bytes() - Static method in class org.apache.crunch.types.writable.Writables
bytes() - Method in class org.apache.crunch.types.writable.WritableTypeFamily
BYTES_IN - Static variable in class org.apache.crunch.types.avro.Avros
BYTES_PER_REDUCE_TASK - Static variable in class org.apache.crunch.util.PartitionUtils
BytesDeserializer() - Constructor for class org.apache.crunch.kafka.KafkaSource.BytesDeserializer
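
The BloomFilterJoinStrategy entry above describes training a Bloom filter on the keys of the left-side table and using it to filter the right-side table before the shuffle and reduce phase. The sketch below models that pre-filtering step in a single process with plain Java collections. It is not the Crunch implementation: BloomPrefilterSketch, TinyBloom, and prefilter are hypothetical names, and the bit-array size and hash count are arbitrary assumptions chosen for the demo.

```java
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.BitSet;
import java.util.List;
import java.util.Map;

/** Plain-Java model of a Bloom-filter pre-filter for a join. Not Crunch code. */
final class BloomPrefilterSketch {

    /** A tiny Bloom filter over strings that derives its probes by double hashing. */
    static final class TinyBloom {
        private final BitSet bits;
        private final int size;
        private final int hashes;

        TinyBloom(int size, int hashes) {
            this.bits = new BitSet(size);
            this.size = size;
            this.hashes = hashes;
        }

        private int index(String key, int i) {
            int h1 = key.hashCode();
            int h2 = (h1 >>> 16) | 1; // odd step so successive probes differ
            return Math.floorMod(h1 + i * h2, size);
        }

        void add(String key) {
            for (int i = 0; i < hashes; i++) {
                bits.set(index(key, i));
            }
        }

        /** May return true for a key that was never added, but never false for an added key. */
        boolean mightContain(String key) {
            for (int i = 0; i < hashes; i++) {
                if (!bits.get(index(key, i))) {
                    return false;
                }
            }
            return true;
        }
    }

    /** Trains the filter on the left keys, then keeps only right rows whose key might match. */
    static <V> List<Map.Entry<String, V>> prefilter(
            Iterable<String> leftKeys, List<Map.Entry<String, V>> rightRows, TinyBloom filter) {
        for (String key : leftKeys) {
            filter.add(key);
        }
        List<Map.Entry<String, V>> kept = new ArrayList<>();
        for (Map.Entry<String, V> row : rightRows) {
            if (filter.mightContain(row.getKey())) {
                kept.add(row);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<Map.Entry<String, Integer>> right = new ArrayList<>();
        right.add(new SimpleImmutableEntry<>("a", 1));
        right.add(new SimpleImmutableEntry<>("c", 2));
        right.add(new SimpleImmutableEntry<>("b", 3));
        // Rows keyed by "a" and "b" are always kept; other rows survive only on a false positive.
        System.out.println(prefilter(Arrays.asList("a", "b"), right, new TinyBloom(1024, 3)));
    }
}
```

Because a Bloom filter has no false negatives, the pre-filter never drops a row that the real join would match. The false positive rate only controls how many non-matching rows still travel to the join.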

C

cache() - Method in class org.apache.crunch.impl.dist.collect.PCollectionImpl
cache(CachingOptions) - Method in class org.apache.crunch.impl.dist.collect.PCollectionImpl
cache() - Method in class org.apache.crunch.impl.dist.collect.PTableBase
cache(CachingOptions) - Method in class org.apache.crunch.impl.dist.collect.PTableBase
cache(PCollection<T>, CachingOptions) - Method in class org.apache.crunch.impl.mem.MemPipeline
cache(PCollection<T>, CachingOptions) - Method in class org.apache.crunch.impl.mr.MRPipeline
cache(PCollection<T>, CachingOptions) - Method in class org.apache.crunch.impl.spark.SparkPipeline
cache() - Method in interface org.apache.crunch.lambda.LCollection: Cache the underlying PCollection.
cache(CachingOptions) - Method in interface org.apache.crunch.lambda.LCollection: Cache the underlying PCollection.
cache() - Method in interface org.apache.crunch.PCollection: Marks this data as cached using the default CachingOptions.
cache(CachingOptions) - Method in interface org.apache.crunch.PCollection: Marks this data as cached using the given CachingOptions.
cache(PCollection<T>, CachingOptions) - Method in interface org.apache.crunch.Pipeline: Caches the given PCollection so that it will be processed at most once during pipeline execution.
cache() - Method in interface org.apache.crunch.PTable
cache(CachingOptions) - Method in interface org.apache.crunch.PTable
CachingOptions - Class in org.apache.crunch: Options for controlling how a PCollection<T> is cached for subsequent processing.
CachingOptions.Builder - Class in org.apache.crunch: A Builder class to use for setting the CachingOptions for a PCollection.
call(Tuple2<IntByteArray, List<byte[]>>) - Method in class org.apache.crunch.impl.spark.collect.ToByteArrayFunction
call(Iterator<Tuple2<K, V>>) - Method in class org.apache.crunch.impl.spark.fn.CombineMapsideFunction
call(Iterator<Pair<K, V>>) - Method in class org.apache.crunch.impl.spark.fn.CrunchPairTuple2
call(Integer, Iterator) - Method in class org.apache.crunch.impl.spark.fn.FlatMapIndexFn
call(Iterator<Tuple2<K, V>>) - Method in class org.apache.crunch.impl.spark.fn.FlatMapPairDoFn
call(Tuple2<K, V>) - Method in class org.apache.crunch.impl.spark.fn.InputConverterFunction
call(Object) - Method in class org.apache.crunch.impl.spark.fn.MapFunction
call(Pair<K, V>) - Method in class org.apache.crunch.impl.spark.fn.MapOutputFunction
call(S) - Method in class org.apache.crunch.impl.spark.fn.OutputConverterFunction
call(Iterator<T>) - Method in class org.apache.crunch.impl.spark.fn.PairFlatMapDoFn
call(Tuple2<K, V>) - Method in class org.apache.crunch.impl.spark.fn.PairMapFunction
call(Pair<K, List<V>>) - Method in class org.apache.crunch.impl.spark.fn.PairMapIterableFunction
call(Pair<K, V>) - Method in class org.apache.crunch.impl.spark.fn.PartitionedMapOutputFunction
call(Iterator<Tuple2<ByteArray, List<byte[]>>>) - Method in class org.apache.crunch.impl.spark.fn.ReduceGroupingFunction
call(Tuple2<ByteArray, Iterable<byte[]>>) - Method in class org.apache.crunch.impl.spark.fn.ReduceInputFunction
call(Pair<K, V>) - Method in class org.apache.crunch.impl.spark.fn.Tuple2MapFunction
CAN_COMBINE_SPECIFIC_AND_REFLECT_SCHEMAS - Static variable in class org.apache.crunch.types.avro.Avros: Older versions of Avro (i.e., before 1.7.0) do not support schemas that are composed of a mix of specific and reflection-based schemas.
Cartesian - Class in org.apache.crunch.lib: Utilities for Cartesian products of two PTable or PCollection instances.
Cartesian() - Constructor for class org.apache.crunch.lib.Cartesian
Channels - Class in org.apache.crunch.lib: Utilities for splitting Pair instances emitted by DoFn into separate PCollection instances.
Channels() - Constructor for class org.apache.crunch.lib.Channels
checkCombiningSpecificAndReflectionSchemas() - Static method in class org.apache.crunch.types.avro.Avros
checkOutputSpecs(JobContext) - Static method in class org.apache.crunch.io.CrunchOutputs
ClassloaderFallbackObjectInputStream - Class in org.apache.crunch.util: A custom ObjectInputStream that falls back to the thread context classloader if the class can't be found with the usual classloader that ObjectInputStream uses.
ClassloaderFallbackObjectInputStream(InputStream) - Constructor for class org.apache.crunch.util.ClassloaderFallbackObjectInputStream
cleanup(Emitter<Pair<String, BloomFilter>>) - Method in class org.apache.crunch.contrib.bloomfilter.BloomFilterFn
cleanup(Emitter<T>) - Method in class org.apache.crunch.DoFn: Called during the cleanup of the MapReduce job this DoFn is associated with.
cleanup(Emitter<T>) - Method in class org.apache.crunch.FilterFn
cleanup() - Method in class org.apache.crunch.FilterFn: Called during the cleanup of the MapReduce job this FilterFn is associated with.
cleanup(Emitter<T>) - Method in class org.apache.crunch.fn.CompositeMapFn
cleanup(Emitter<Pair<S, T>>) - Method in class org.apache.crunch.fn.PairMapFn
cleanup(boolean) - Method in class org.apache.crunch.impl.dist.DistributedPipeline
cleanup(boolean) - Method in class org.apache.crunch.impl.mem.MemPipeline
cleanup(Emitter<Pair<Integer, Pair<K, V>>>) - Method in class org.apache.crunch.lib.Aggregate.TopKFn
cleanup(Emitter<Pair<K, Pair<U, V>>>) - Method in class org.apache.crunch.lib.join.FullOuterJoinFn: Called during the cleanup of the MapReduce job this DoFn is associated with.
cleanup(Emitter<Pair<K, Pair<U, V>>>) - Method in class org.apache.crunch.lib.join.LeftOuterJoinFn: Called during the cleanup of the MapReduce job this DoFn is associated with.
cleanup(boolean) - Method in interface org.apache.crunch.Pipeline: Cleans up any artifacts created as a result of running the pipeline.
clear() - Method in class org.apache.crunch.types.writable.TupleWritable
clearAge() - Method in class org.apache.crunch.test.Person.Builder: Clears the value of the 'age' field.
clearCounters() - Static method in class org.apache.crunch.impl.mem.MemPipeline
clearCounters() - Static method in class org.apache.crunch.test.TestCounters
clearDepartment() - Method in class org.apache.crunch.test.Employee.Builder: Clears the value of the 'department' field.
clearName() - Method in class org.apache.crunch.test.Employee.Builder: Clears the value of the 'name' field.
clearName() - Method in class org.apache.crunch.test.Person.Builder: Clears the value of the 'name' field.
clearSalary() - Method in class org.apache.crunch.test.Employee.Builder: Clears the value of the 'salary' field.
clearSiblingnames() - Method in class org.apache.crunch.test.Person.Builder: Clears the value of the 'siblingnames' field.
close() - Method in class org.apache.crunch.io.CrunchOutputs
close() - Method in class org.apache.crunch.kafka.inputformat.KafkaRecordReader
close() - Method in class org.apache.crunch.kafka.KafkaSource.BytesDeserializer
close() - Method in class org.apache.crunch.kafka.offset.hdfs.HDFSOffsetReader
close() - Method in class org.apache.crunch.kafka.offset.hdfs.HDFSOffsetWriter
cogroup(PTable<K, U>) - Method in class org.apache.crunch.impl.dist.collect.PTableBase
cogroup(LTable<K, U>) - Method in interface org.apache.crunch.lambda.LTable: Cogroup this table with another LTable with the same key type, collecting the set of values from each side.
Cogroup - Class in org.apache.crunch.lib
Cogroup() - Constructor for class org.apache.crunch.lib.Cogroup
cogroup(PTable<K, U>, PTable<K, V>) - Static method in class org.apache.crunch.lib.Cogroup: Co-groups the two PTable arguments.
cogroup(int, PTable<K, U>, PTable<K, V>) - Static method in class org.apache.crunch.lib.Cogroup: Co-groups the two PTable arguments with a user-specified degree of parallelism (a.k.a. the number of reducers).
cogroup(PTable<K, V1>, PTable<K, V2>, PTable<K, V3>) - Static method in class org.apache.crunch.lib.Cogroup: Co-groups the three PTable arguments.
cogroup(int, PTable<K, V1>, PTable<K, V2>, PTable<K, V3>) - Static method in class org.apache.crunch.lib.Cogroup: Co-groups the three PTable arguments with a user-specified degree of parallelism (a.k.a. the number of reducers).
cogroup(PTable<K, V1>, PTable<K, V2>, PTable<K, V3>, PTable<K, V4>) - Static method in class org.apache.crunch.lib.Cogroup: Co-groups the four PTable arguments.
cogroup(int, PTable<K, V1>, PTable<K, V2>, PTable<K, V3>, PTable<K, V4>) - Static method in class org.apache.crunch.lib.Cogroup: Co-groups the four PTable arguments with a user-specified degree of parallelism (a.k.a. the number of reducers).
cogroup(PTable<K, ?>, PTable<K, ?>...) - Static method in class org.apache.crunch.lib.Cogroup: Co-groups an arbitrary number of PTable arguments.
cogroup(int, PTable<K, ?>, PTable<K, ?>...) - Static method in class org.apache.crunch.lib.Cogroup: Co-groups an arbitrary number of PTable arguments with a user-specified degree of parallelism (a.k.a. the number of reducers). The largest table should come last in the ordering.
cogroup(PTable<K, U>) - Method in interface org.apache.crunch.PTable: Co-group operation with the given table.
Collect(Collection<V1>, Collection<V2>, Collection<V3>) - Constructor for class org.apache.crunch.Tuple3.Collect
Collect(Collection<V1>, Collection<V2>, Collection<V3>, Collection<V4>) - Constructor for class org.apache.crunch.Tuple4.Collect
collectAllValues() - Method in interface org.apache.crunch.lambda.LGroupedTable: Collect all values for each key into a Collection.
CollectionDeepCopier<T> - Class in org.apache.crunch.types: Performs deep copies (based on underlying PType deep copying) of Collections.
CollectionDeepCopier(PType<T>) - Constructor for class org.apache.crunch.types.CollectionDeepCopier
collectionOf(T...) - Static method in class org.apache.crunch.impl.mem.MemPipeline
collectionOf(Iterable<T>) - Static method in class org.apache.crunch.impl.mem.MemPipeline
collections(PType<T>) - Static method in class org.apache.crunch.types.avro.Avros
collections(PType<T>) - Method in class org.apache.crunch.types.avro.AvroTypeFamily
collections(PType<T>) - Method in interface org.apache.crunch.types.PTypeFamily
collections(PType<T>) - Static method in class org.apache.crunch.types.writable.Writables
collections(PType<T>) - Method in class org.apache.crunch.types.writable.WritableTypeFamily
collectUniqueValues() - Method in interface org.apache.crunch.lambda.LGroupedTable: Collect all unique values for each key into a Collection (note that the value type must have correctly defined equals() and hashCode() methods).
collectValues() - Method in class org.apache.crunch.impl.dist.collect.PTableBase
collectValues(SSupplier<C>, SBiConsumer<C, V>, PType<C>) - Method in interface org.apache.crunch.lambda.LGroupedTable: Collect the values into an aggregate type.
collectValues(PTable<K, V>) - Static method in class org.apache.crunch.lib.Aggregate
collectValues() - Method in interface org.apache.crunch.PTable: Aggregate all of the values with the same key into a single key-value pair in the returned PTable.
column() - Method in class org.apache.crunch.lib.Sort.ColumnOrder
ColumnOrder(int, Sort.Order) - Constructor for class org.apache.crunch.lib.Sort.ColumnOrder
CombineFn<S,T> - Class in org.apache.crunch: A special DoFn implementation that converts an Iterable of values into a single value.
CombineFn() - Constructor for class org.apache.crunch.CombineFn
CombineMapsideFunction<K,V> - Class in org.apache.crunch.impl.spark.fn
CombineMapsideFunction(CombineFn<K, V>, SparkRuntimeContext) - Constructor for class org.apache.crunch.impl.spark.fn.CombineMapsideFunction
combineValues(CombineFn<K, V>, CombineFn<K, V>) - Method in class org.apache.crunch.impl.dist.collect.BaseGroupedTable
combineValues(CombineFn<K, V>) - Method in class org.apache.crunch.impl.dist.collect.BaseGroupedTable
combineValues(Aggregator<V>) - Method in class org.apache.crunch.impl.dist.collect.BaseGroupedTable
combineValues(Aggregator<V>, Aggregator<V>) - Method in class org.apache.crunch.impl.dist.collect.BaseGroupedTable
combineValues(Aggregator<V>) - Method in interface org.apache.crunch.lambda.LGroupedTable: Combine the value part of the table using the provided Crunch Aggregator.
combineValues(SSupplier<A>, SBiFunction<A, V, A>, SFunction<A, Iterable<V>>) - Method in interface org.apache.crunch.lambda.LGroupedTable: Combine the value part of the table using the given functions.
combineValues(CombineFn<K, V>) - Method in interface org.apache.crunch.PGroupedTable: Combines the values of this grouping using the given CombineFn.
combineValues(CombineFn<K, V>, CombineFn<K, V>) - Method in interface org.apache.crunch.PGroupedTable: Combines and reduces the values of this grouping using the given CombineFn instances.
combineValues(Aggregator<V>) - Method in interface org.apache.crunch.PGroupedTable: Combine the values in each group using the given Aggregator.
combineValues(Aggregator<V>, Aggregator<V>) - Method in interface org.apache.crunch.PGroupedTable: Combines and reduces the values in each group using the given Aggregator instances.
comm(PCollection<T>, PCollection<T>) - Static method in class org.apache.crunch.lib.Set: Find the elements that are common to two sets, like the Unix comm utility.
Comparator() - Constructor for class org.apache.crunch.types.writable.TupleWritable.Comparator
compare(ByteArray, ByteArray) - Method in class org.apache.crunch.impl.spark.SparkComparator
compare(Pair<K, V>, Pair<K, V>) - Method in class org.apache.crunch.lib.Aggregate.PairValueComparator
compare(AvroWrapper<T>, AvroWrapper<T>) - Method in class org.apache.crunch.lib.join.JoinUtils.AvroPairGroupingComparator
compare(byte[], int, int, byte[], int, int) - Method in class org.apache.crunch.lib.join.JoinUtils.AvroPairGroupingComparator
compare(TupleWritable, TupleWritable) - Method in class org.apache.crunch.lib.join.JoinUtils.TupleWritableComparator
compare(byte[], int, int, byte[], int, int) - Method in class org.apache.crunch.lib.join.JoinUtils.TupleWritableComparator
compare(AvroKey<T>, AvroKey<T>) - Method in class org.apache.crunch.lib.sort.ReverseAvroComparator
compare(byte[], int, int, byte[], int, int) - Method in class org.apache.crunch.lib.sort.ReverseAvroComparator
compare(byte[], int, int, byte[], int, int) - Method in class org.apache.crunch.lib.sort.ReverseWritableComparator
compare(T, T) - Method in class org.apache.crunch.lib.sort.ReverseWritableComparator
compare(WritableComparable, WritableComparable) - Method in class org.apache.crunch.lib.sort.TupleWritableComparator
compare(byte[], int, int, byte[], int, int) - Method in class org.apache.crunch.types.writable.TupleWritable.Comparator
compare(WritableComparable, WritableComparable) - Method in class org.apache.crunch.types.writable.TupleWritable.Comparator
compareTo(ByteArray) - Method in class org.apache.crunch.impl.spark.ByteArray
compareTo(Offsets.PartitionOffset) - Method in class org.apache.crunch.kafka.offset.hdfs.Offsets.PartitionOffset
compareTo(Pair<K, V>) - Method in class org.apache.crunch.Pair
compareTo(TupleWritable) - Method in class org.apache.crunch.types.writable.TupleWritable
compareTo(UnionWritable) - Method in class org.apache.crunch.types.writable.UnionWritable
CompositeMapFn<R,S,T> - Class in org.apache.crunch.fn
CompositeMapFn(MapFn<R, S>, MapFn<S, T>) - Constructor for class org.apache.crunch.fn.CompositeMapFn
CompositePathIterable<T> - Class in org.apache.crunch.io
Compress - Class in org.apache.crunch.io: Helper functions for compressing output data.
Compress() - Constructor for class org.apache.crunch.io.Compress
compress(T, Class<? extends CompressionCodec>) - Static method in class org.apache.crunch.io.Compress: Configure the given output target to be compressed using the given codec.
conf(String, String) - Method in class org.apache.crunch.GroupingOptions.Builder
conf(String, String) - Method in class org.apache.crunch.ParallelDoOptions.Builder: Specifies key-value pairs that should be added to the Configuration object associated with the Job that includes these options.
conf(String, String) - Method in interface org.apache.crunch.SourceTarget: Adds the given key-value pair to the Configuration instance(s) that are used to read and write this SourceTarget<T>.
configure(Configuration) - Method in class org.apache.crunch.DoFn: Configure this DoFn.
configure(Configuration) - Method in class org.apache.crunch.fn.CompositeMapFn
configure(Configuration) - Method in class org.apache.crunch.fn.ExtractKeyFn
configure(Configuration) - Method in class org.apache.crunch.fn.PairMapFn
configure(Job) - Method in class org.apache.crunch.GroupingOptions
configure(Configuration) - Method in class org.apache.crunch.io.FormatBundle
configure(Target, PType<?>) - Method in interface org.apache.crunch.io.OutputHandler
configure(Map<String, ?>, boolean) - Method in class org.apache.crunch.kafka.KafkaSource.BytesDeserializer
configure(Configuration) - Method in class org.apache.crunch.ParallelDoOptions: Applies the key-value pairs that were associated with this instance to the given Configuration object.
configure(Configuration) - Method in interface org.apache.crunch.ReadableData: Allows this instance to specify any additional configuration settings that may be needed by the job that it is launched in.
configure(FormatBundle) - Method in class org.apache.crunch.types.avro.AvroMode: Populates the bundle with mode specific settings for the specific FormatBundle.
configure(Configuration) - Method in class org.apache.crunch.types.avro.AvroMode: Populates the conf with mode specific settings.
configure(Configuration) - Method in class org.apache.crunch.types.avro.AvroUtf8InputFormat
configure(Configuration) - Method in class org.apache.crunch.types.PGroupedTableType.PairIterableMapFn
configure(Configuration) - Method in class org.apache.crunch.util.DelegatingReadableData
configure(Configuration) - Method in class org.apache.crunch.util.UnionReadableData
configureFactory(Configuration) - Method in class org.apache.crunch.types.avro.AvroMode: Deprecated. Use AvroMode.configure(org.apache.hadoop.conf.Configuration).
configureForMapReduce(Job, PType<?>, Path, String) - Method in interface org.apache.crunch.io.MapReduceTarget
configureOrdering(Configuration, WritableType[], Sort.ColumnOrder[]) - Static method in class org.apache.crunch.lib.sort.TupleWritableComparator
configureReflectDataFactory(Configuration) - Static method in class org.apache.crunch.types.avro.Avros: Deprecated as of 0.9.0; use AvroMode.REFLECT.configure(Configuration).
configureShuffle(Configuration) - Method in class org.apache.crunch.types.avro.AvroMode: Populates the conf with mode specific settings for use during the shuffle phase.
configureShuffle(Job, GroupingOptions) - Method in class org.apache.crunch.types.PGroupedTableType
configureSource(Job, int) - Method in class org.apache.crunch.kafka.KafkaSource
configureSource(Job, int) - Method in interface org.apache.crunch.Source: Configure the given job to use this source as an input.
CONSUMER_POLL_TIMEOUT_DEFAULT - Static variable in class org.apache.crunch.kafka.KafkaSource: Default timeout value for KafkaSource.CONSUMER_POLL_TIMEOUT_KEY of 1 second.
CONSUMER_POLL_TIMEOUT_KEY - Static variable in class org.apache.crunch.kafka.KafkaSource: Constant to indicate how long the reader waits before timing out when retrieving data from Kafka.
containers(Class<T>) - Static method in class org.apache.crunch.types.avro.Avros
containers(Class<T>) - Method in class org.apache.crunch.types.avro.AvroTypeFamily
convert(Object, ObjectInspector, ObjectInspector) - Static method in class org.apache.crunch.types.orc.OrcUtils: Convert an object from / to OrcStruct
convert(PType<T>, PTypeFamily) - Static method in class org.apache.crunch.types.PTypeUtils
Converter<K,V,S,T> - Interface in org.apache.crunch.types: Converts the input key/value from a MapReduce task into the input to a DoFn, or takes the output of a DoFn and writes it to the output key/values.
convertInput(K, V) - Method in interface org.apache.crunch.types.Converter
convertIterableInput(K, Iterable<V>) - Method in interface org.apache.crunch.types.Converter
copyResourceFile(String) - Method in class org.apache.crunch.test.TemporaryPath: Copy a classpath resource to File.
copyResourceFileName(String) - Method in class org.apache.crunch.test.TemporaryPath: Copy a classpath resource returning its absolute file name.
copyResourcePath(String) - Method in class org.apache.crunch.test.TemporaryPath: Copy a classpath resource to a Path.
count() - Method in class org.apache.crunch.impl.dist.collect.PCollectionImpl
count() - Method in interface org.apache.crunch.lambda.LCollection: Count distinct values in this LCollection, yielding an LTable mapping each value to the number of occurrences in the collection.
count(PCollection<S>) - Static method in class org.apache.crunch.lib.Aggregate: Returns a PTable that contains the unique elements of this collection mapped to a count of their occurrences.
count(PCollection<S>, int) - Static method in class org.apache.crunch.lib.Aggregate: Returns a PTable that contains the unique elements of this collection mapped to a count of their occurrences.
count - Variable in class org.apache.crunch.lib.Quantiles.Result
count() - Method in interface org.apache.crunch.PCollection: Returns a PTable instance that contains the counts of each unique element of this PCollection.
countClause - Variable in class org.apache.crunch.contrib.io.jdbc.DataBaseSource.Builder
CounterAccumulatorParam - Class in org.apache.crunch.impl.spark
CounterAccumulatorParam() - Constructor for class org.apache.crunch.impl.spark.CounterAccumulatorParam
create(String) - Method in class org.apache.crunch.contrib.text.TokenizerFactory: Return a Scanner instance that wraps the input string and uses the delimiter, skip, and locale settings for this TokenizerFactory instance.
create(Iterable<S>, PType<S>) - Method in class org.apache.crunch.impl.dist.DistributedPipeline
create(Iterable<S>, PType<S>, CreateOptions) - Method in class org.apache.crunch.impl.dist.DistributedPipeline
create(Iterable<Pair<K, V>>, PTableType<K, V>) - Method in class org.apache.crunch.impl.dist.DistributedPipeline
create(Iterable<Pair<K, V>>, PTableType<K, V>, CreateOptions) - Method in class org.apache.crunch.impl.dist.DistributedPipeline
create(Iterable<T>, PType<T>) - Method in class org.apache.crunch.impl.mem.MemPipeline
create(Iterable<T>, PType<T>, CreateOptions) - Method in class org.apache.crunch.impl.mem.MemPipeline
create(Iterable<Pair<K, V>>, PTableType<K, V>) - Method in class org.apache.crunch.impl.mem.MemPipeline
create(Iterable<Pair<K, V>>, PTableType<K, V>, CreateOptions) - Method in class org.apache.crunch.impl.mem.MemPipeline
create(PType<?>, Configuration) - Static method in class org.apache.crunch.impl.spark.serde.SerDeFactory
create(Iterable<S>, PType<S>, CreateOptions) - Method in class org.apache.crunch.impl.spark.SparkPipeline
create(Iterable<Pair<K, V>>, PTableType<K, V>, CreateOptions) - Method in class org.apache.crunch.impl.spark.SparkPipeline
create(FileSystem, Path, FileReaderFactory<S>) - Static method in class org.apache.crunch.io.CompositePathIterable
create() - Static method in class org.apache.crunch.lib.join.MapsideJoinStrategy: Create a new MapsideJoinStrategy instance that will load its left-side table into memory, and will materialize the contents of the left-side table to disk before running the in-memory join.
create(boolean) - Static method in class org.apache.crunch.lib.join.MapsideJoinStrategy: Create a new MapsideJoinStrategy instance that will load its left-side table into memory.
create(Iterable<T>, PType<T>) - Method in interface org.apache.crunch.Pipeline: Creates a PCollection containing the values found in the given Iterable using an implementation-specific distribution mechanism.
create(Iterable<T>, PType<T>, CreateOptions) - Method in interface org.apache.crunch.Pipeline: Creates a PCollection containing the values found in the given Iterable using an implementation-specific distribution mechanism.
create(Iterable<Pair<K, V>>, PTableType<K, V>) - Method in interface org.apache.crunch.Pipeline: Creates a PTable containing the values found in the given Iterable using an implementation-specific distribution mechanism.
create(Iterable<Pair<K, V>>, PTableType<K, V>, CreateOptions) - Method in interface org.apache.crunch.Pipeline: Creates a PTable containing the values found in the given Iterable using an implementation-specific distribution mechanism.
create() - Method in class org.apache.crunch.test.TemporaryPath
create() - Static method in class org.apache.crunch.types.NoOpDeepCopier: Static factory method.
create(Object...) - Method in class org.apache.crunch.types.orc.TupleObjectInspector
create(Class<T>, Class...) - Static method in class org.apache.crunch.types.TupleFactory
createBinarySerde(TypeInfo) - Static method in class org.apache.crunch.types.orc.OrcUtils: Create a binary serde for OrcStruct serialization/deserialization
CreatedCollection<T> - Class in org.apache.crunch.impl.spark.collect: Represents a Spark-based PCollection that was created from a Java Iterable of values.
CreatedCollection(SparkPipeline, Iterable<T>, PType<T>, CreateOptions) - Constructor for class org.apache.crunch.impl.spark.collect.CreatedCollection
createDoCollection(String, PCollectionImpl<S>, DoFn<S, T>, PType<T>, ParallelDoOptions) - Method in interface org.apache.crunch.impl.dist.collect.PCollectionFactory
createDoCollection(String, PCollectionImpl<S>, DoFn<S, T>, PType<T>, ParallelDoOptions) - Method in class org.apache.crunch.impl.spark.collect.SparkCollectFactory
createDoNode() - Method in interface org.apache.crunch.impl.dist.collect.MRCollection
createDoTable(String, PCollectionImpl<S>, DoFn<S, Pair<K, V>>, PTableType<K, V>, ParallelDoOptions) - Method in interface org.apache.crunch.impl.dist.collect.PCollectionFactory
createDoTable(String, PCollectionImpl<S>, CombineFn<K, V>, DoFn<S, Pair<K, V>>, PTableType<K, V>) - Method in interface org.apache.crunch.impl.dist.collect.PCollectionFactory
createDoTable(String, PCollectionImpl<S>, DoFn<S, Pair<K, V>>, PTableType<K, V>, ParallelDoOptions) - Method in class org.apache.crunch.impl.spark.collect.SparkCollectFactory
createDoTable(String, PCollectionImpl<S>, CombineFn<K, V>, DoFn<S, Pair<K, V>>, PTableType<K, V>) - Method in class org.apache.crunch.impl.spark.collect.SparkCollectFactory
CreatedTable<K,V> - Class in org.apache.crunch.impl.spark.collect: Represents a Spark-based PTable that was created from a Java Iterable of key-value pairs.
CreatedTable(SparkPipeline, Iterable<Pair<K, V>>, PTableType<K, V>, CreateOptions) - Constructor for class org.apache.crunch.impl.spark.collect.CreatedTable
createFilter(Path, BloomFilterFn<String>) - Static method in class org.apache.crunch.contrib.bloomfilter.BloomFilterFactory: Takes an input path and generates BloomFilters for all text files in that path.
createFilter(PCollection<T>, BloomFilterFn<T>) - Static method in class org.apache.crunch.contrib.bloomfilter.BloomFilterFactory
createGroupedTable(PTableBase<K, V>, GroupingOptions) - Method in interface org.apache.crunch.impl.dist.collect.PCollectionFactory
createGroupedTable(PTableBase<K, V>, GroupingOptions) - Method in class org.apache.crunch.impl.spark.collect.SparkCollectFactory
createInputCollection(Source<S>, String, DistributedPipeline, ParallelDoOptions) - Method in interface org.apache.crunch.impl.dist.collect.PCollectionFactory
createInputCollection(Source<S>, String, DistributedPipeline, ParallelDoOptions) - Method in class org.apache.crunch.impl.spark.collect.SparkCollectFactory
createInputTable(TableSource<K, V>, String, DistributedPipeline, ParallelDoOptions) - Method in interface org.apache.crunch.impl.dist.collect.PCollectionFactory
createInputTable(TableSource<K, V>, String, DistributedPipeline, ParallelDoOptions) - Method in class org.apache.crunch.impl.spark.collect.SparkCollectFactory
createIntermediateOutput(PType<T>) - Method in class org.apache.crunch.impl.dist.DistributedPipeline
CreateOptions - Class in org.apache.crunch: Additional options that can be specified when creating a new PCollection using Pipeline.create(java.lang.Iterable<T>, org.apache.crunch.types.PType<T>).
createOrcStruct(TypeInfo, Object...) - Static method in class org.apache.crunch.types.orc.OrcUtils: Create an object of OrcStruct given a type string and a list of objects
createOrderedTupleSchema(PType<S>, Sort.ColumnOrder[]) - Static method in class org.apache.crunch.lib.sort.SortFns: Constructs an Avro schema for the given PType<S> that respects the given column orderings.
createPut(PTable<String, String>) - Method in class org.apache.crunch.examples.WordAggregationHBase: Create puts in order to insert them into HBase.
createRecordReader(InputSplit, TaskAttemptContext) - Method in class org.apache.crunch.kafka.inputformat.KafkaInputFormat
createRecordReader(InputSplit, TaskAttemptContext) - Method in class org.apache.crunch.types.avro.AvroInputFormat
createRecordReader(InputSplit, TaskAttemptContext) - Method in class org.apache.crunch.types.avro.AvroUtf8InputFormat
createSourceTarget(Configuration, Path, Iterable<T>, int) - Method in class org.apache.crunch.types.avro.AvroType
createSourceTarget(Configuration, Path, Iterable<T>, int) - Method in interface org.apache.crunch.types.PType: Returns a ReadableSource that contains the data in the given Iterable.
createSourceTarget(Configuration, Path, Iterable<T>, int) - Method in class org.apache.crunch.types.writable.WritableType
createTempPath() - Method in class org.apache.crunch.impl.dist.DistributedPipeline
createUnionCollection(List<? extends PCollectionImpl<S>>) - Method in interface org.apache.crunch.impl.dist.collect.PCollectionFactory
createUnionCollection(List<? extends PCollectionImpl<S>>) - Method in class org.apache.crunch.impl.spark.collect.SparkCollectFactory
createUnionTable(List<PTableBase<K, V>>) - Method in interface org.apache.crunch.impl.dist.collect.PCollectionFactory
createUnionTable(List<PTableBase<K, V>>) - Method in class org.apache.crunch.impl.spark.collect.SparkCollectFactory
cross(PTable<K1, U>, PTable<K2, V>) - Static method in class org.apache.crunch.lib.Cartesian: Performs a full cross join on the specified PTables (using the same strategy as Pig's CROSS operator).
cross(PTable<K1, U>, PTable<K2, V>, int) - Static method in class org.apache.crunch.lib.Cartesian: Performs a full cross join on the specified PTables (using the same strategy as Pig's CROSS operator).
cross(PCollection<U>, PCollection<V>) - Static method in class org.apache.crunch.lib.Cartesian: Performs a full cross join on the specified PCollections (using the same strategy as Pig's CROSS operator).
cross(PCollection<U>, PCollection<V>, int) - Static method in class org.apache.crunch.lib.Cartesian: Performs a full cross join on the specified PCollections (using the same strategy as Pig's CROSS operator).
CRUNCH_DISABLE_OUTPUT_COUNTERS - Static variable in class org.apache.crunch.io.CrunchOutputs
CRUNCH_FILTER_NAME - Static variable in class org.apache.crunch.contrib.bloomfilter.BloomFilterFn
CRUNCH_FILTER_SIZE - Static variable in class org.apache.crunch.contrib.bloomfilter.BloomFilterFn
CRUNCH_INPUTS - Static variable in class org.apache.crunch.io.CrunchInputs
CRUNCH_OUTPUTS - Static variable in class org.apache.crunch.io.CrunchOutputs
CrunchInputs - Class in org.apache.crunch.io: Helper functions for configuring multiple InputFormat instances within a single Crunch MapReduce job.
CrunchInputs() - Constructor for class org.apache.crunch.io.CrunchInputs
CrunchIterable<S,T> - Class in org.apache.crunch.impl.spark.fn
CrunchIterable(DoFn<S, T>, Iterator<S>) - Constructor for class org.apache.crunch.impl.spark.fn.CrunchIterable
CrunchOutputs<K,V> - Class in org.apache.crunch.io: An analogue of CrunchInputs for handling multiple OutputFormat instances writing to multiple files within a single MapReduce job.
CrunchOutputs(TaskInputOutputContext<?, ?, K, V>) - Constructor for class org.apache.crunch.io.CrunchOutputs: Creates and initializes multiple outputs support; it should be instantiated in the Mapper/Reducer setup method.
CrunchOutputs(Configuration) - Constructor for class org.apache.crunch.io.CrunchOutputs
CrunchOutputs.OutputConfig<K,V> - Class in org.apache.crunch.io
CrunchPairTuple2<K,V> - Class in org.apache.crunch.impl.spark.fn
CrunchPairTuple2() - Constructor for class org.apache.crunch.impl.spark.fn.CrunchPairTuple2
CrunchRuntimeException - Exception in org.apache.crunch: A RuntimeException implementation that includes some additional options for the Crunch execution engine to track reporting status.
CrunchRuntimeException(String) - Constructor for exception org.apache.crunch.CrunchRuntimeException
CrunchRuntimeException(Exception) - Constructor for exception org.apache.crunch.CrunchRuntimeException
CrunchRuntimeException(String, Exception) - Constructor for exception org.apache.crunch.CrunchRuntimeException
CrunchTestSupport - Class in org.apache.crunch.test: A temporary workaround for Scala tests to use when working with Rule annotations until it gets fixed in JUnit 4.11.
CrunchTestSupport() - Constructor for class org.apache.crunch.test.CrunchTestSupport
CrunchTool - Class in org.apache.crunch.util: An extension of the Tool interface that creates a Pipeline instance and provides methods for working with the Pipeline from inside of the Tool's run method.
CrunchTool() - Constructor for class org.apache.crunch.util.CrunchTool
CrunchTool(boolean) - Constructor for class org.apache.crunch.util.CrunchTool

D

DataBaseSource<T extends org.apache.hadoop.mapreduce.lib.db.DBWritable & org.apache.hadoop.io.Writable> - Class in org.apache.crunch.contrib.io.jdbc: Source for reading from a database via a JDBC connection.
DataBaseSource.Builder<T extends org.apache.hadoop.mapreduce.lib.db.DBWritable & org.apache.hadoop.io.Writable> - Class in org.apache.crunch.contrib.io.jdbc
DebugLogging - Class in org.apache.crunch.test: Allows direct manipulation of the Hadoop log4j settings to aid in unit testing.
DeepCopier<T> - Interface in org.apache.crunch.types: Performs deep copies of values.
deepCopy(Object) - Method in class org.apache.crunch.types.avro.AvroDerivedValueDeepCopier
deepCopy(Collection<T>) - Method in class org.apache.crunch.types.CollectionDeepCopier
deepCopy(T) - Method in interface org.apache.crunch.types.DeepCopier: Create a deep copy of a value.
deepCopy(Map<String, T>) - Method in class org.apache.crunch.types.MapDeepCopier
deepCopy(T) - Method in class org.apache.crunch.types.NoOpDeepCopier
deepCopy(T) - Method in class org.apache.crunch.types.TupleDeepCopier
deepCopy(Union) - Method in class org.apache.crunch.types.UnionDeepCopier
deepCopy(T) - Method in class org.apache.crunch.types.writable.WritableDeepCopier
DEFAULT - Static variable in class org.apache.crunch.CachingOptions: An instance of CachingOptions with the default caching settings.
DEFAULT_BYTES_PER_REDUCE_TASK - Static variable in class org.apache.crunch.util.PartitionUtils
DEFAULT_MAX_REDUCERS - Static variable in class org.apache.crunch.util.PartitionUtils
DEFAULT_PATH - Static variable in class org.apache.crunch.lib.sort.TotalOrderPartitioner
DefaultJoinStrategy<K,U,V> - Class in org.apache.crunch.lib.join: Default join strategy that simply sends all data through the map, shuffle, and reduce phase.
DefaultJoinStrategy() - Constructor for class org.apache.crunch.lib.join.DefaultJoinStrategy
DefaultJoinStrategy(int) - Constructor for class org.apache.crunch.lib.join.DefaultJoinStrategy
DelegatingReadableData<S,T> - Class in org.apache.crunch.util: Implements the ReadableData<T> interface by delegating to a ReadableData<S> instance and passing its contents through a DoFn<S, T>.
DelegatingReadableData(ReadableData<S>, DoFn<S, T>) - Constructor for class org.apache.crunch.util.DelegatingReadableData
delete() - Method in class org.apache.crunch.test.TemporaryPath
delimiter(String) - Method in class org.apache.crunch.contrib.text.TokenizerFactory.Builder: Sets the delimiter used by the TokenizerFactory instances constructed by this instance.
department - Variable in class org.apache.crunch.test.Employee: Deprecated.
dependsOn(String, Target) - Method in class org.apache.crunch.PipelineCallable: Requires that the given Target exists before this instance may be executed.
dependsOn(String, PCollection<?>) - Method in class org.apache.crunch.PipelineCallable: Requires that the given PCollection be materialized to disk before this instance may be executed.
derived(PType<V1>, PType<V2>, PType<V3>) - Static method in class org.apache.crunch.Tuple3.Collect
derived(PType<V1>, PType<V2>, PType<V3>, PType<V4>) - Static method in class org.apache.crunch.Tuple4.Collect
derived(Class<T>, MapFn<S, T>, MapFn<T, S>, PType<S>) - Static method in class org.apache.crunch.types.avro.Avros
derived(Class<T>, MapFn<S, T>, MapFn<T, S>, PType<S>) - Method in class org.apache.crunch.types.avro.AvroTypeFamily
derived(Class<T>, MapFn<S, T>, MapFn<T, S>, PType<S>) - Method in interface org.apache.crunch.types.PTypeFamily
derived(Class<T>, MapFn<S, T>, MapFn<T, S>, PType<S>) - Static method in class org.apache.crunch.types.writable.Writables
derived(Class<T>, MapFn<S, T>, MapFn<T, S>, PType<S>) - Method in class org.apache.crunch.types.writable.WritableTypeFamily
derivedImmutable(Class<T>, MapFn<S, T>, MapFn<T, S>, PType<S>) - Static method in class org.apache.crunch.types.avro.Avros
derivedImmutable(Class<T>, MapFn<S, T>, MapFn<T, S>, PType<S>) - Method in class org.apache.crunch.types.avro.AvroTypeFamily
derivedImmutable(Class<T>, MapFn<S, T>, MapFn<T, S>, PType<S>) - Method in interface org.apache.crunch.types.PTypeFamily: A derived type whose values are immutable.
derivedImmutable(Class<T>, MapFn<S, T>, MapFn<T, S>, PType<S>) - Static method in class org.apache.crunch.types.writable.Writables
derivedImmutable(Class<T>, MapFn<S, T>, MapFn<T, S>, PType<S>) - Method in class org.apache.crunch.types.writable.WritableTypeFamily
deserialize(String, byte[]) - Method in class org.apache.crunch.kafka.KafkaSource.BytesDeserializer
deserialized(boolean) - Method in class org.apache.crunch.CachingOptions.Builder
deserialized() - Method in class org.apache.crunch.CachingOptions: Whether the data should remain deserialized in the cache, which trades off CPU processing time for additional storage overhead.
detach(DoFn<Pair<K, Iterable<V>>, T>, PType<V>) - Static method in class org.apache.crunch.lib.DoFns: "Reduce" DoFn wrapper which detaches the values in the iterable, preventing the unexpected behaviour related to object reuse often observed when using Avro.
difference(PCollection<T>, PCollection<T>) - Static method in class org.apache.crunch.lib.Set: Compute the set difference between two sets of elements.
disableDeepCopy() - Method in class org.apache.crunch.DoFn: By default, Crunch will do a defensive deep copy of the outputs of a DoFn when there are multiple downstream consumers of that item, in order to prevent the downstream functions from making concurrent modifications to data objects.
DIST_CACHE_REPLICATION - Static variable in class org.apache.crunch.util.DistCache: Configuration key for setting the replication factor for files distributed using the Crunch DistCache helper class.
DistCache - Class in org.apache.crunch.util: Provides functions for working with Hadoop's distributed cache.
DistCache() - Constructor for class org.apache.crunch.util.DistCache
Distinct - Class in org.apache.crunch.lib: Functions for computing the distinct elements of a PCollection.
distinct(PCollection<S>) - Static method in class org.apache.crunch.lib.Distinct: Construct a new PCollection that contains the unique elements of a given input PCollection.
distinct(PTable<K, V>) - Static method in class org.apache.crunch.lib.Distinct: A PTable<K, V> analogue of the distinct function.
distinct(PCollection<S>, int) - Static method in class org.apache.crunch.lib.Distinct: A distinct operation that gives the client more control over how frequently elements are flushed to disk in order to allow control over performance or memory consumption.
distinct(PTable<K, V>, int) - Static method in class org.apache.crunch.lib.Distinct: A PTable<K, V> analogue of the distinct function.
distributed(PTable<K, V>, double, double...) - Static method in class org.apache.crunch.lib.Quantiles: Calculate a set of quantiles for each key in a numerically-valued table.
DistributedPipeline - Class in org.apache.crunch.impl.dist
DistributedPipeline(String, Configuration, PCollectionFactory) - Constructor for class org.apache.crunch.impl.dist.DistributedPipeline: Instantiate with a custom name and configuration.
DoCollection<S> - Class in org.apache.crunch.impl.spark.collect
DoFn<S,T> - Class in org.apache.crunch: Base class for all data processing functions in Crunch.
DoFn() - Constructor for class org.apache.crunch.DoFn
DoFnIterator<S,T> - Class in org.apache.crunch.util: An Iterator<T> that combines a delegate Iterator<S> and a DoFn<S, T>, generating data by passing the contents of the iterator through the function.
DoFnIterator(Iterator<S>, DoFn<S, T>) - Constructor for class org.apache.crunch.util.DoFnIterator
DoFns - Class in org.apache.crunch.lib
DoFns() - Constructor for class org.apache.crunch.lib.DoFns
done() - Method in class org.apache.crunch.impl.dist.DistributedPipeline
done() - Method in class org.apache.crunch.impl.mem.MemPipeline
done() - Method in class org.apache.crunch.impl.spark.SparkPipeline
done() - Method in interface org.apache.crunch.Pipeline: Run any remaining jobs required to generate outputs and then clean up any intermediate data files that were created in this run or previous calls to run.
DONE - Static variable in class org.apache.crunch.PipelineResult
done() - Method in class org.apache.crunch.util.CrunchTool
DoTable<K,V> - Class in org.apache.crunch.impl.spark.collect
doubles() - Static method in class org.apache.crunch.types.avro.Avros
doubles() - Method in class org.apache.crunch.types.avro.AvroTypeFamily
doubles() - Method in interface org.apache.crunch.types.PTypeFamily
doubles() - Static method in class org.apache.crunch.types.writable.Writables
doubles() - Method in class org.apache.crunch.types.writable.WritableTypeFamily
drop(Integer...) - Method in class org.apache.crunch.contrib.text.TokenizerFactory.Builder: Drop the specified fields found by the input scanner, counting from zero.

E

element() - Method in interface org.apache.crunch.lambda.LDoFnContext: Get the input element
emit(T) - Method in interface org.apache.crunch.Emitter: Write the emitted value to the next stage of the pipeline.
emit(T) - Method in interface org.apache.crunch.lambda.LDoFnContext: Emit t to the output
Emitter<T> - Interface in org.apache.crunch: Interface for writing outputs from a DoFn.
Employee - Class in org.apache.crunch.test
Employee() - Constructor for class org.apache.crunch.test.Employee: Default constructor.
Employee(CharSequence, Integer, CharSequence) - Constructor for class org.apache.crunch.test.Employee: All-args constructor.
Employee.Builder - Class in org.apache.crunch.test: RecordBuilder for Employee instances.
EMPTY - Static variable in class org.apache.crunch.PipelineResult
EmptyPCollection<T> - Class in org.apache.crunch.impl.dist.collect
EmptyPCollection(DistributedPipeline, PType<T>) - Constructor for class org.apache.crunch.impl.dist.collect.EmptyPCollection
emptyPCollection(PType<S>) - Method in class org.apache.crunch.impl.dist.DistributedPipeline
emptyPCollection(PType<T>) - Method in class org.apache.crunch.impl.mem.MemPipeline
EmptyPCollection<T> - Class in org.apache.crunch.impl.spark.collect
EmptyPCollection(DistributedPipeline, PType<T>) - Constructor for class org.apache.crunch.impl.spark.collect.EmptyPCollection
emptyPCollection(PType<S>) - Method in class org.apache.crunch.impl.spark.SparkPipeline
emptyPCollection(PType<T>) - Method in interface org.apache.crunch.Pipeline: Creates an empty PCollection of the given PType.
EmptyPTable<K,V> - Class in org.apache.crunch.impl.dist.collect
EmptyPTable(DistributedPipeline, PTableType<K, V>) - Constructor for class org.apache.crunch.impl.dist.collect.EmptyPTable
emptyPTable(PTableType<K, V>) - Method in class org.apache.crunch.impl.dist.DistributedPipeline
emptyPTable(PTableType<K, V>) - Method in class org.apache.crunch.impl.mem.MemPipeline
EmptyPTable<K,V> - Class in org.apache.crunch.impl.spark.collect
EmptyPTable(DistributedPipeline, PTableType<K, V>) - Constructor for class org.apache.crunch.impl.spark.collect.EmptyPTable
emptyPTable(PTableType<K, V>) - Method in class org.apache.crunch.impl.spark.SparkPipeline
emptyPTable(PTableType<K, V>) - Method in interface org.apache.crunch.Pipeline: Creates an empty PTable of the given PTableType.
enable(Level) - Static method in class org.apache.crunch.test.DebugLogging: Enables logging Hadoop output to the console using the pattern '%-4r [%t] %-5p %c %x - %m%n' at the specified Level.
enable(Level, Appender) - Static method in class org.apache.crunch.test.DebugLogging: Enables logging to the given Appender at the specified Level.
enableDebug() - Method in class org.apache.crunch.impl.dist.DistributedPipeline
enableDebug() - Method in class org.apache.crunch.impl.mem.MemPipeline
enableDebug() - Method in interface org.apache.crunch.Pipeline: Turn on debug logging for jobs that are run from this pipeline.
enableDebug() - Method in class org.apache.crunch.util.CrunchTool
enums(Class<T>, PTypeFamily) - Static method in class org.apache.crunch.types.PTypes: Constructs a PType for a Java Enum type.
equals(Object) - Method in class org.apache.crunch.impl.dist.collect.BaseInputCollection
equals(Object) - Method in class org.apache.crunch.impl.dist.collect.BaseInputTable
equals(Object) - Method in class org.apache.crunch.impl.spark.ByteArray
equals(Object) - Method in class org.apache.crunch.impl.spark.IntByteArray
equals(Object) - Method in class org.apache.crunch.io.FormatBundle
equals(Object) - Method in class org.apache.crunch.kafka.offset.hdfs.Offsets
equals(Object) - Method in class org.apache.crunch.kafka.offset.hdfs.Offsets.PartitionOffset
equals(Object) - Method in class org.apache.crunch.lib.Quantiles.Result
equals(Object) - Method in class org.apache.crunch.Pair
equals(Object) - Method in class org.apache.crunch.Tuple3
equals(Object) - Method in class org.apache.crunch.Tuple4
equals(Object) - Method in class org.apache.crunch.TupleN
equals(Object) - Method in class org.apache.crunch.types.avro.AvroMode
equals(Object) - Method in class org.apache.crunch.types.avro.AvroType
equals(Object) - Method in class org.apache.crunch.types.writable.TupleWritable
equals(Object) - Method in class org.apache.crunch.types.writable.WritableType
equals(Object) - Method in class org.apache.crunch.Union
errorOnLastRecord() - Method in class org.apache.crunch.contrib.text.AbstractCompositeExtractor
errorOnLastRecord() - Method in class org.apache.crunch.contrib.text.AbstractSimpleExtractor
errorOnLastRecord() - Method in interface org.apache.crunch.contrib.text.Extractor: Returns true if the last call to extract on this instance threw an exception that was handled.
execute() - Method in class org.apache.crunch.impl.spark.SparkRuntime
extract(String) - Method in class org.apache.crunch.contrib.text.AbstractCompositeExtractor
extract(String) - Method in class org.apache.crunch.contrib.text.AbstractSimpleExtractor
extract(String) - Method in interface org.apache.crunch.contrib.text.Extractor: Extract a value with the type of this instance.
extractKey(String) - Static method in class org.apache.crunch.types.Protos
ExtractKeyFn<K,V> - Class in org.apache.crunch.fn: Wrapper function for converting a key-from-value extractor MapFn<V, K> into a key-value pair extractor that is used to convert from a PCollection<V> to a PTable<K, V>.
ExtractKeyFn(MapFn<V, K>) - Constructor for class org.apache.crunch.fn.ExtractKeyFn
Extractor<T> - Interface in org.apache.crunch.contrib.text: An interface for extracting a specific data type from a text string that is being processed by a Scanner object.
Extractors - Class in org.apache.crunch.contrib.text: Factory methods for constructing common Extractor types.
Extractors() - Constructor for class org.apache.crunch.contrib.text.Extractors
ExtractorStats - Class in org.apache.crunch.contrib.text: Records the number and kind of errors that an Extractor encountered when parsing input data.
ExtractorStats(int) - Constructor for class org.apache.crunch.contrib.text.ExtractorStats
ExtractorStats(int, List<Integer>) - Constructor for class org.apache.crunch.contrib.text.ExtractorStats
extractText(PTable<ImmutableBytesWritable, Result>) - Method in class org.apache.crunch.examples.WordAggregationHBase: Extract information from HBase.
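
The ExtractKeyFn entry above describes wrapping a key-from-value extractor so that each value becomes a key-value pair. The wrapping step can be sketched in plain Java as follows. This is an illustration only: ExtractKeySketch and keyed are hypothetical names, java.util.function.Function stands in for MapFn, and Map.Entry stands in for Crunch's Pair.

```java
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.Map;
import java.util.function.Function;

// Hypothetical stand-in for the idea behind ExtractKeyFn; not Crunch code.
final class ExtractKeySketch {
    private ExtractKeySketch() {
    }

    // Turns a function V -> K into a function V -> (K, V):
    // the key is computed from the value and the value is kept unchanged.
    static <K, V> Function<V, Map.Entry<K, V>> keyed(Function<V, K> keyFn) {
        return value -> new SimpleImmutableEntry<>(keyFn.apply(value), value);
    }
}
```

For example, keying strings by their length maps "crunch" to the pair (6, "crunch").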

F

factory() - Method in interface org.apache.crunch.lambda.LCollection: Get the LCollectionFactory which can be used to create new Ltype instances
FILE_FORMAT_EXTENSION - Static variable in class org.apache.crunch.kafka.offset.hdfs.HDFSOffsetWriter: File extension for storing the offsets.
FILE_FORMATTER - Static variable in class org.apache.crunch.kafka.offset.hdfs.HDFSOffsetWriter: Formatter to use when creating the file names in a URI compliant format.
fileNameToPersistenceTime(String) - Static method in class org.apache.crunch.kafka.offset.hdfs.HDFSOffsetWriter: Converts a fileName into the time the offsets were persisted.
FileNamingScheme - Interface in org.apache.crunch.io: Encapsulates rules for naming output files.
FileReaderFactory<T> - Interface in org.apache.crunch.io
filter(FilterFn<S>) - Method in class org.apache.crunch.impl.dist.collect.PCollectionImpl
filter(String, FilterFn<S>) - Method in class org.apache.crunch.impl.dist.collect.PCollectionImpl
filter(FilterFn<Pair<K, V>>) - Method in class org.apache.crunch.impl.dist.collect.PTableBase
filter(String, FilterFn<Pair<K, V>>) - Method in class org.apache.crunch.impl.dist.collect.PTableBase
filter(SPredicate<S>) - Method in interface org.apache.crunch.lambda.LCollection: Filter the collection using the supplied predicate.
filter(SPredicate<Pair<K, V>>) - Method in interface org.apache.crunch.lambda.LTable: Filter the rows of the table using the supplied predicate.
filter(FilterFn<S>) - Method in interface org.apache.crunch.PCollection: Apply the given filter function to this instance and return the resulting PCollection.
filter(String, FilterFn<S>) - Method in interface org.apache.crunch.PCollection: Apply the given filter function to this instance and return the resulting PCollection.
filter(FilterFn<Pair<K, V>>) - Method in interface org.apache.crunch.PTable: Apply the given filter function to this instance and return the resulting PTable.
filter(String, FilterFn<Pair<K, V>>) - Method in interface org.apache.crunch.PTable: Apply the given filter function to this instance and return the resulting PTable.
filterByKey(SPredicate<K>) - Method in interface org.apache.crunch.lambda.LTable: Filter the rows of the table using the supplied predicate applied to the key part of each record.
filterByValue(SPredicate<V>) - Method in interface org.apache.crunch.lambda.LTable: Filter the rows of the table using the supplied predicate applied to the value part of each record.
filterConnectionProperties(Properties) - Static method in class org.apache.crunch.kafka.inputformat.KafkaInputFormat: Filters out Kafka connection properties that were tagged using generateConnectionPropertyKey.
FilterFn<T> - Class in org.apache.crunch: A DoFn for the common case of filtering the members of a PCollection based on a boolean condition.
FilterFn() - Constructor for class org.apache.crunch.FilterFn
FilterFns - Class in org.apache.crunch.fn: A collection of pre-defined FilterFn implementations.
filterMap(SFunction<S, Optional<T>>, PType<T>) - Method in interface org.apache.crunch.lambda.LCollection: Combination of a filter and map operation by using a function with Optional return type.
filterMap(SFunction<S, Optional<Pair<K, V>>>, PTableType<K, V>) - Method in interface org.apache.crunch.lambda.LCollection: Combination of a filter and map operation by using a function with Optional return type.
findContainingJar(Class<?>) - Static method in class org.apache.crunch.util.DistCache: Finds the path to a jar that contains the class provided, if any.
findCounter(Enum<?>) - Method in class org.apache.crunch.PipelineResult.StageResult: Deprecated. The Counter class changed incompatibly between Hadoop 1 and 2 (from a class to an interface), so user programs should avoid this method and use PipelineResult.StageResult.getCounterValue(Enum) and/or PipelineResult.StageResult.getCounterDisplayName(Enum).
findPartition(K) - Method in class org.apache.crunch.lib.sort.TotalOrderPartitioner.BinarySearchNode
findPartition(T) - Method in interface org.apache.crunch.lib.sort.TotalOrderPartitioner.Node: Locate the partition in keyset K such that [Ki..Ki+1) defines a partition, with implicit K0 = -inf, Kn = +inf, and |K| = #partitions - 1.
first() - Method in class org.apache.crunch.impl.dist.collect.PCollectionImpl
first() - Method in class org.apache.crunch.Pair
first() - Method in interface org.apache.crunch.PCollection
first() - Method in class org.apache.crunch.Tuple3
first() - Method in class org.apache.crunch.Tuple4
FIRST_N(int) - Static method in class org.apache.crunch.fn.Aggregators: Return the first n values (or fewer if there are fewer values than n).
flatMap(SFunction<S, Stream<T>>, PType<T>) - Method in interface org.apache.crunch.lambda.LCollection: Map each element to zero or more output elements using the provided stream-returning function.
flatMap(SFunction<S, Stream<Pair<K, V>>>, PTableType<K, V>) - Method in interface org.apache.crunch.lambda.LCollection: Map each element to zero or more output elements using the provided stream-returning function to yield an LTable.
FlatMapIndexFn<S,T> - Class in org.apache.crunch.impl.spark.fn
FlatMapIndexFn(DoFn<S, T>, boolean, SparkRuntimeContext) - Constructor for class org.apache.crunch.impl.spark.fn.FlatMapIndexFn
FlatMapPairDoFn<K,V,T> - Class in org.apache.crunch.impl.spark.fn
FlatMapPairDoFn(DoFn<Pair<K, V>, T>, SparkRuntimeContext) - Constructor for class org.apache.crunch.impl.spark.fn.FlatMapPairDoFn
floats() - Static method in class org.apache.crunch.types.avro.Avros
floats() - Method in class org.apache.crunch.types.avro.AvroTypeFamily
floats() - Method in interface org.apache.crunch.types.PTypeFamily
floats() - Static method in class org.apache.crunch.types.writable.Writables
floats() - Method in class org.apache.crunch.types.writable.WritableTypeFamily
flush() - Method in interface org.apache.crunch.Emitter: Flushes any values cached by this emitter.
forAvroSchema(Schema) - Static method in class org.apache.crunch.impl.spark.ByteArrayHelper
forInput(Class<T>) - Static method in class org.apache.crunch.io.FormatBundle
FormatBundle<K> - Class in org.apache.crunch.io: A combination of an InputFormat or OutputFormat and any extra configuration information that format class needs to run.
FormatBundle() - Constructor for class org.apache.crunch.io.FormatBundle
formattedFile(String, Class<? extends FileInputFormat<K, V>>, Class<K>, Class<V>) - Static method in class org.apache.crunch.io.From: Creates a TableSource<K, V> for reading data from files that have custom FileInputFormat<K, V> implementations not covered by the provided TableSource and Source factory methods.
formattedFile(Path, Class<? extends FileInputFormat<K, V>>, Class<K>, Class<V>) - Static method in class org.apache.crunch.io.From: Creates a TableSource<K, V> for reading data from files that have custom FileInputFormat<K, V> implementations not covered by the provided TableSource and Source factory methods.
formattedFile(List<Path>, Class<? extends FileInputFormat<K, V>>, Class<K>, Class<V>) - Static method in class org.apache.crunch.io.From: Creates a TableSource<K, V> for reading data from files that have custom FileInputFormat<K, V> implementations not covered by the provided TableSource and Source factory methods.
formattedFile(String, Class<? extends FileInputFormat<?, ?>>, PType<K>, PType<V>) - Static method in class org.apache.crunch.io.From: Creates a TableSource<K, V> for reading data from files that have custom FileInputFormat implementations not covered by the provided TableSource and Source factory methods.
formattedFile(Path, Class<? extends FileInputFormat<?, ?>>, PType<K>, PType<V>) - Static method in class org.apache.crunch.io.From: Creates a TableSource<K, V> for reading data from files that have custom FileInputFormat implementations not covered by the provided TableSource and Source factory methods.
formattedFile(List<Path>, Class<? extends FileInputFormat<?, ?>>, PType<K>, PType<V>) - Static method in class org.apache.crunch.io.From: Creates a TableSource<K, V> for reading data from files that have custom FileInputFormat implementations not covered by the provided TableSource and Source factory methods.
formattedFile(String, Class<? extends FileOutputFormat<K, V>>) - Static method in class org.apache.crunch.io.To: Creates a Target at the given path name that writes data to a custom FileOutputFormat.
formattedFile(Path, Class<? extends FileOutputFormat<K, V>>) - Static method in class org.apache.crunch.io.To: Creates a Target at the given Path that writes data to a custom FileOutputFormat.
forOutput(Class<T>) - Static method in class org.apache.crunch.io.FormatBundle
fourth() - Method in class org.apache.crunch.Tuple4
From - Class in org.apache.crunch.io: Static factory methods for creating common Source types.
From() - Constructor for class org.apache.crunch.io.From
fromBytes(byte[]) - Method in class org.apache.crunch.impl.spark.serde.AvroSerDe
fromBytes(byte[]) - Method in interface org.apache.crunch.impl.spark.serde.SerDe
fromBytes(byte[]) - Method in class org.apache.crunch.impl.spark.serde.WritableSerDe
fromBytesFunction() - Method in class org.apache.crunch.impl.spark.serde.AvroSerDe
fromBytesFunction() - Method in interface org.apache.crunch.impl.spark.serde.SerDe
fromBytesFunction() - Method in class org.apache.crunch.impl.spark.serde.WritableSerDe
fromConfiguration(Configuration) - Static method in class org.apache.crunch.types.avro.AvroMode: Creates an AvroMode based on the AvroMode.AVRO_MODE_PROPERTY property in the conf.
fromSerialized(String, Configuration) - Static method in class org.apache.crunch.io.FormatBundle
fromShuffleConfiguration(Configuration) - Static method in class org.apache.crunch.types.avro.AvroMode: Creates an AvroMode based on the AvroMode.AVRO_SHUFFLE_MODE_PROPERTY property in the conf.
fromType(AvroType<?>) - Static method in class org.apache.crunch.types.avro.AvroMode: Creates an AvroMode based upon the specified type.
fullJoin(PTable<K, U>, PTable<K, V>) - Static method in class org.apache.crunch.lib.Join: Performs a full outer join on the specified PTables.
FullOuterJoinFn<K,U,V> - Class in org.apache.crunch.lib.join: Used to perform the last step of a full outer join.
FullOuterJoinFn(PType<K>, PType) - Constructor for class org.apache.crunch.lib.join.FullOuterJoinFn
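
The findPartition entry above describes locating a key among |K| = #partitions - 1 split keys, where [Ki..Ki+1) defines a partition and the outer bounds are implicitly -inf and +inf. One way to realize that lookup is a binary search over the sorted split keys, sketched below in plain Java. PartitionLookupSketch and its int key type are assumptions made for illustration. This is not the Crunch TotalOrderPartitioner code, and it assumes the split keys are sorted in ascending order and distinct.

```java
import java.util.Arrays;

// Hypothetical illustration of partition lookup by binary search; not Crunch code.
final class PartitionLookupSketch {
    private PartitionLookupSketch() {
    }

    // splits holds the explicit split keys K1..K(n-1) in ascending order,
    // so there are splits.length + 1 partitions, numbered from 0.
    // Returns i such that Ki <= key < K(i+1), with K0 = -inf and Kn = +inf.
    static int findPartition(int[] splits, int key) {
        int pos = Arrays.binarySearch(splits, key);
        if (pos >= 0) {
            // Exact match on a split key: the key opens the partition to its right.
            return pos + 1;
        }
        // Not found: binarySearch returns -(insertionPoint) - 1, and the
        // insertion point equals the number of split keys smaller than key.
        return -pos - 1;
    }
}
```

With split keys {10, 20, 30} there are four partitions: a key of 5 falls in partition 0, 10 and 15 in partition 1, 20 in partition 2, and 30 or anything larger in partition 3.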

G

generateKeys(S) - Method in class org.apache.crunch.contrib.bloomfilter.BloomFilterFn
generateOutput(Pipeline) - Method in class org.apache.crunch.PipelineCallable: Called by the Pipeline when this instance is registered with Pipeline#sequentialDo.
GENERIC - Static variable in class org.apache.crunch.types.avro.AvroMode: Default mode to use for reading and writing Generic types.
generics(Schema) - Static method in class org.apache.crunch.types.avro.Avros
generics(Schema) - Method in class org.apache.crunch.types.avro.AvroTypeFamily
get() - Method in class org.apache.crunch.impl.spark.SparkRuntime
get(long, TimeUnit) - Method in class org.apache.crunch.impl.spark.SparkRuntime
get(int) - Method in class org.apache.crunch.Pair
get(int) - Method in class org.apache.crunch.test.Employee
get(int) - Method in class org.apache.crunch.test.Person
get(int) - Method in interface org.apache.crunch.Tuple: Returns the Object at the given index.
get(int) - Method in class org.apache.crunch.Tuple3
get(int) - Method in class org.apache.crunch.Tuple4
get(int) - Method in class org.apache.crunch.TupleN
get(int) - Method in class org.apache.crunch.types.writable.TupleWritable: Get ith Writable from Tuple.
getAge() - Method in class org.apache.crunch.test.Person.Builder: Gets the value of the 'age' field.
getAge() - Method in class org.apache.crunch.test.Person: Gets the value of the 'age' field.
getAllPCollections() - Method in class org.apache.crunch.PipelineCallable: Returns the mapping of labels to PCollection dependencies for this instance.
getAllStructFieldRefs() - Method in class org.apache.crunch.types.orc.TupleObjectInspector
getAllTargets() - Method in class org.apache.crunch.PipelineCallable: Returns the mapping of labels to Target dependencies for this instance.
getAsOfTime() - Method in class org.apache.crunch.kafka.offset.hdfs.Offsets: Returns the time in milliseconds since epoch that the offset information was retrieved or valid as of.
getBrokerOffsets(Properties, long, String...) - Static method in class org.apache.crunch.kafka.KafkaUtils: Retrieves the offset values for an array of topics at the specified time.
getByFn() - Method in class org.apache.crunch.lib.sort.SortFns.KeyExtraction
getCategory() - Method in class org.apache.crunch.types.orc.TupleObjectInspector
getClassSchema() - Static method in class org.apache.crunch.test.Employee
getClassSchema() - Static method in class org.apache.crunch.test.Person
getCombineFn() - Method in class org.apache.crunch.impl.spark.SparkRuntime
getCompletionHooks() - Method in class org.apache.crunch.impl.mr.MRPipeline
getConf() - Method in class org.apache.crunch.io.FormatBundle
getConf() - Method in class org.apache.crunch.kafka.inputformat.KafkaInputFormat
getConf() - Method in class org.apache.crunch.lib.sort.TotalOrderPartitioner
getConf() - Method in class org.apache.crunch.lib.sort.TupleWritableComparator
getConf() - Method in class org.apache.crunch.types.writable.TupleWritable.Comparator
getConf() - Method in class org.apache.crunch.util.CrunchTool
getConfiguration() - Method in class org.apache.crunch.impl.dist.DistributedPipeline
getConfiguration() - Method in class org.apache.crunch.impl.mem.MemPipeline
getConfiguration() - Method in class org.apache.crunch.impl.spark.SparkRuntime
getConfiguration() - Method in class org.apache.crunch.impl.spark.SparkRuntimeContext
getConfiguration() - Method in interface org.apache.crunch.lambda.LDoFnContext: Get the current Hadoop Configuration
getConfiguration() - Method in interface org.apache.crunch.Pipeline: Returns the Configuration instance associated with this pipeline.
getContext() - Method in interface org.apache.crunch.lambda.LDoFnContext: Get the underlying TaskInputOutputContext (for special cases)
getConverter() - Method in class org.apache.crunch.kafka.KafkaSource
getConverter() - Method in interface org.apache.crunch.Source: Returns the Converter used for mapping the inputs from this instance into PCollection or PTable values.
getConverter(PType<?>) - Method in interface org.apache.crunch.Target: Returns the Converter to use for mapping from the output PCollection into the output values expected by this instance.
getConverter() - Method in class org.apache.crunch.types.avro.AvroType
getConverter() - Method in class org.apache.crunch.types.PGroupedTableType
getConverter() - Method in interface org.apache.crunch.types.PType
getConverter() - Method in class org.apache.crunch.types.writable.WritableType
getCounter(Enum<?>) - Static method in class org.apache.crunch.test.TestCounters
getCounter(String, String) - Static method in class org.apache.crunch.test.TestCounters
getCounter() - Method in class org.apache.hadoop.mapred.SparkCounter
getCounterDisplayName(String, String) - Method in class org.apache.crunch.PipelineResult.StageResult
getCounterDisplayName(Enum<?>) - Method in class org.apache.crunch.PipelineResult.StageResult
getCounterNames() - Method in class org.apache.crunch.PipelineResult.StageResult
getCounters() - Static method in class org.apache.crunch.impl.mem.MemPipeline
getCounters() - Method in class org.apache.crunch.PipelineResult.StageResult: Deprecated. The Counter class changed incompatibly between Hadoop 1 and 2 (from a class to an interface), so user programs should avoid this method and use PipelineResult.StageResult.getCounterNames().
getCounterValue(String, String) - Method in class org.apache.crunch.PipelineResult.StageResult
getCounterValue(Enum<?>) - Method in class org.apache.crunch.PipelineResult.StageResult
getCurrentKey() - Method in class org.apache.crunch.kafka.inputformat.KafkaRecordReader
getCurrentValue() - Method in class org.apache.crunch.kafka.inputformat.KafkaRecordReader
getData() - Method in class org.apache.crunch.types.avro.AvroMode: Returns a GenericData instance based on the mode type.
getData() - Method in interface org.apache.crunch.types.avro.ReaderWriterFactory
getData() - Method in class org.apache.crunch.types.avro.ReflectDataFactory
getDataFileWriter(Path, Configuration) - Static method in class org.apache.crunch.types.avro.AvroOutputFormat
getDefaultConfiguration() - Method in class org.apache.crunch.test.TemporaryPath
getDefaultFileSource(Path) - Method in class org.apache.crunch.types.avro.AvroType
getDefaultFileSource(Path) - Method in class org.apache.crunch.types.PGroupedTableType
getDefaultFileSource(Path) - Method in interface org.apache.crunch.types.PType: Returns a SourceTarget that is able to read/write data using the serialization format specified by this PType.
getDefaultFileSource(Path) - Method in class org.apache.crunch.types.writable.WritableType
getDefaultInstance() - Static method in class org.apache.crunch.contrib.text.TokenizerFactory: Returns a default TokenizerFactory that uses whitespace as a delimiter and does not skip any input fields.
getDefaultInstance(Class<M>) - Static method in class org.apache.crunch.types.Protos: Utility function for creating a default PB Message from a Class object that works with both protoc 2.3.0 and 2.4.x.
getDefaultValue() - Method in class org.apache.crunch.contrib.text.AbstractSimpleExtractor
getDefaultValue() - Method in interface org.apache.crunch.contrib.text.Extractor: Returns the default value for this Extractor in case of an error.
getDepartment() - Method in class org.apache.crunch.test.Employee.Builder: Gets the value of the 'department' field.
getDepartment() - Method in class org.apache.crunch.test.Employee: Gets the value of the 'department' field.
getDependentJobs() - Method in interface org.apache.crunch.impl.mr.MRJob
getDepth() - Method in class org.apache.crunch.impl.dist.collect.PCollectionImpl
getDetachedValue(PTableType<K, V>, Pair<K, V>) - Static method in class org.apache.crunch.lib.PTables: Create a detached value for a table Pair.
getDetachedValue(T) - Method in class org.apache.crunch.types.avro.AvroType
getDetachedValue(T) - Method in interface org.apache.crunch.types.PType: Returns a copy of a value (or the value itself) that can safely be retained.
getDetachedValue(T) - Method in class org.apache.crunch.types.writable.WritableType
getDisplayName() - Method in class org.apache.hadoop.mapred.SparkCounter
getEndingOffset() - Method in class org.apache.crunch.kafka.inputformat.KafkaInputSplit: Returns the ending offset for the split
getEndTimeMsec() - Method in class org.apache.crunch.PipelineResult.StageResult
getErrorCount() - Method in class org.apache.crunch.contrib.text.ExtractorStats: The overall number of records that had some kind of parsing error.
getFactory() - Method in class org.apache.crunch.impl.dist.DistributedPipeline
getFactory() - Method in class org.apache.crunch.types.avro.AvroMode: Returns the factory that will be used for the mode.
getFamily() - Method in class org.apache.crunch.types.avro.AvroType
getFamily() - Method in class org.apache.crunch.types.PGroupedTableType
getFamily() - Method in interface org.apache.crunch.types.PType: Returns the PTypeFamily that this PType belongs to.
getFamily() - Method in class org.apache.crunch.types.writable.WritableType
getFieldErrors() - Method in class org.apache.crunch.contrib.text.ExtractorStats: Returns the number of errors that occurred when parsing the individual fields of a composite record type, like a Pair or TupleN.
getFile(String) - Method in class org.apache.crunch.test.TemporaryPath: Get a File below the temporary directory.
getFileName(String) - Method in class org.apache.crunch.test.TemporaryPath: Get an absolute file name below the temporary directory.
getFileNamingScheme() - Method in interface org.apache.crunch.io.PathTarget: Get the naming scheme to be used for outputs being written to an output path.
getFirst() - Method in class org.apache.crunch.fn.CompositeMapFn
getFormatClass() - Method in class org.apache.crunch.io.FormatBundle
getFormatNodeMap(JobContext) - Static method in class org.apache.crunch.io.CrunchInputs
getGroupedDetachedValue(PGroupedTableType<K, V>, Pair<K, Iterable<V>>) - Static method in class org.apache.crunch.lib.PTables: Create a detached value for a PGroupedTable value.
getGroupedTableType() - Method in class org.apache.crunch.impl.dist.collect.BaseGroupedTable
getGroupedTableType() - Method in interface org.apache.crunch.PGroupedTable: Return the PGroupedTableType containing serialization information for this PGroupedTable.
getGroupedTableType() - Method in interface org.apache.crunch.types.PTableType: Returns the grouped table version of this type.
getGroupingComparator(PTypeFamily) - Static method in class org.apache.crunch.lib.join.JoinUtils
getGroupingComparatorClass() - Method in class org.apache.crunch.GroupingOptions
getGroupingConverter() - Method in class org.apache.crunch.types.PGroupedTableType
getIndex() - Method in class org.apache.crunch.types.writable.UnionWritable
getIndex() - Method in class org.apache.crunch.Union: Returns the index of the original data source for this union type.
getInputMapFn() - Method in class org.apache.crunch.types.avro.AvroType
getInputMapFn() - Method in interface org.apache.crunch.types.PType
getInputMapFn() - Method in class org.apache.crunch.types.writable.WritableType
getInstance() - Static method in class org.apache.crunch.fn.IdentityFn
getInstance() - Static method in class org.apache.crunch.impl.mem.MemPipeline
getInstance() - Static method in class org.apache.crunch.io.SequentialFileNamingScheme
getInstance() - Static method in class org.apache.crunch.types.avro.AvroTypeFamily
getInstance() - Static method in class org.apache.crunch.types.writable.TupleWritable.Comparator
getInstance() - Static method in class org.apache.crunch.types.writable.WritableTypeFamily
getJavaRDDLike(SparkRuntime) - Method in class org.apache.crunch.impl.spark.collect.CreatedCollection
getJavaRDDLike(SparkRuntime) - Method in class org.apache.crunch.impl.spark.collect.CreatedTable
getJavaRDDLike(SparkRuntime) - Method in class org.apache.crunch.impl.spark.collect.DoCollection
getJavaRDDLike(SparkRuntime) - Method in class org.apache.crunch.impl.spark.collect.DoTable
getJavaRDDLike(SparkRuntime) - Method in class org.apache.crunch.impl.spark.collect.EmptyPCollection
getJavaRDDLike(SparkRuntime) - Method in class org.apache.crunch.impl.spark.collect.EmptyPTable
getJavaRDDLike(SparkRuntime) - Method in class org.apache.crunch.impl.spark.collect.InputCollection
getJavaRDDLike(SparkRuntime) - Method in class org.apache.crunch.impl.spark.collect.InputTable
getJavaRDDLike(SparkRuntime) - Method in class org.apache.crunch.impl.spark.collect.PGroupedTableImpl
getJavaRDDLike(SparkRuntime) - Method in class org.apache.crunch.impl.spark.collect.UnionCollection
getJavaRDDLike(SparkRuntime) - Method in class org.apache.crunch.impl.spark.collect.UnionTable
getJavaRDDLike(SparkRuntime) - Method in interface org.apache.crunch.impl.spark.SparkCollection
getJob() - Method in interface org.apache.crunch.impl.mr.MRJob
getJobEndTimeMsec() - Method in class org.apache.crunch.PipelineResult.StageResult
getJobID() - Method in interface org.apache.crunch.impl.mr.MRJob
getJobs() - Method in interface org.apache.crunch.impl.mr.MRPipelineExecution
getJobStartTimeMsec() - Method in class org.apache.crunch.PipelineResult.StageResult
getJobState() - Method in interface org.apache.crunch.impl.mr.MRJob
getJoinType() - Method in class org.apache.crunch.lib.join.FullOuterJoinFn
getJoinType() - Method in class org.apache.crunch.lib.join.InnerJoinFn
getJoinType() - Method in class org.apache.crunch.lib.join.JoinFn
getJoinType() - Method in class org.apache.crunch.lib.join.LeftOuterJoinFn
getJoinType() - Method in class org.apache.crunch.lib.join.RightOuterJoinFn
getKafkaConnectionProperties(Configuration) - Static method in class org.apache.crunch.kafka.KafkaUtils: Converts the provided config into a Properties object to connect with Kafka.
getKeyClass() - Method in interface org.apache.crunch.types.Converter
getKeyType() - Method in class org.apache.crunch.impl.dist.collect.PTableBase
getKeyType() - Method in class org.apache.crunch.lib.sort.SortFns.KeyExtraction
getKeyType() - Method in interface org.apache.crunch.PTable: Returns the PType of the key.
getKeyType() - Method in interface org.apache.crunch.types.PTableType: Returns the key type for the table.
getLastModifiedAt(Configuration) - Method in class org.apache.crunch.contrib.io.jdbc.DataBaseSource
getLastModifiedAt() - Method in class org.apache.crunch.impl.dist.collect.BaseDoCollection
getLastModifiedAt() - Method in class org.apache.crunch.impl.dist.collect.BaseDoTable
getLastModifiedAt() - Method in class org.apache.crunch.impl.dist.collect.BaseGroupedTable
getLastModifiedAt() - Method in class org.apache.crunch.impl.dist.collect.BaseInputCollection
getLastModifiedAt() - Method in class org.apache.crunch.impl.dist.collect.BaseInputTable
getLastModifiedAt() - Method in class org.apache.crunch.impl.dist.collect.BaseUnionCollection
getLastModifiedAt() - Method in class org.apache.crunch.impl.dist.collect.BaseUnionTable
getLastModifiedAt() - Method in class org.apache.crunch.impl.dist.collect.EmptyPCollection
getLastModifiedAt() - Method in class org.apache.crunch.impl.dist.collect.EmptyPTable
getLastModifiedAt() - Method in class org.apache.crunch.impl.dist.collect.PCollectionImpl: The time of the most recent modification to one of the input sources to the collection.
getLastModifiedAt() - Method in class org.apache.crunch.impl.spark.collect.CreatedCollection
getLastModifiedAt() - Method in class org.apache.crunch.impl.spark.collect.CreatedTable
getLastModifiedAt(FileSystem, Path) - Static method in class org.apache.crunch.io.SourceTargetHelper
getLastModifiedAt(Configuration) - Method in class org.apache.crunch.kafka.KafkaSource
getLastModifiedAt(Configuration) - Method in interface org.apache.crunch.Source: Returns the time (in milliseconds) that this Source was most recently modified (e.g., because an input file was edited or new files were added to a directory).
getLength() - Method in class org.apache.crunch.kafka.inputformat.KafkaInputSplit
getLocations() - Method in class org.apache.crunch.kafka.inputformat.KafkaInputSplit
getMapOutputName(Configuration, Path) - Method in interface org.apache.crunch.io.FileNamingScheme: Get the output file name for a map task.
getMapOutputName(Configuration, Path) - Method in class org.apache.crunch.io.SequentialFileNamingScheme
getMaterializedAt() - Method in class org.apache.crunch.impl.dist.collect.PCollectionImpl
getMaterializeSourceTarget(PCollection<T>) - Method in class org.apache.crunch.impl.dist.DistributedPipeline: Retrieve a ReadableSourceTarget that provides access to the contents of a PCollection.
getMessage() - Method in class org.apache.crunch.PipelineCallable: Returns a message associated with this callable's execution, especially in case of errors.
getModeProperties() - Method in class org.apache.crunch.types.avro.AvroMode: Returns the entries that a Configuration instance needs to enable this AvroMode as a serializable map of key-value pairs.
getName() - Method in class org.apache.crunch.CreateOptions
getName() - Method in class org.apache.crunch.impl.dist.collect.PCollectionImpl
getName() - Method in class org.apache.crunch.impl.dist.DistributedPipeline
getName() - Method in class org.apache.crunch.impl.mem.MemPipeline
getName() - Method in class org.apache.crunch.io.FormatBundle
getName() - Method in interface org.apache.crunch.PCollection: Returns a shorthand name for this PCollection.
getName() - Method in interface org.apache.crunch.Pipeline: Returns the name of this pipeline.
getName() - Method in class org.apache.crunch.PipelineCallable: Returns the name of this instance.
getName() - Method in class org.apache.crunch.test.Employee.Builder: Gets the value of the 'name' field
getName() - Method in class org.apache.crunch.test.Employee: Gets the value of the 'name' field.
getName() - Method in class org.apache.crunch.test.Person.Builder: Gets the value of the 'name' field
getName() - Method in class org.apache.crunch.test.Person: Gets the value of the 'name' field.
getName() - Method in class org.apache.hadoop.mapred.SparkCounter
getNamedDotFiles() - Method in class org.apache.crunch.impl.spark.SparkRuntime
getNamedDotFiles() - Method in interface org.apache.crunch.PipelineExecution: Returns all .dot files that allow a client to graph the Crunch execution plan internals.
getNamedOutputs(Configuration) - Static method in class org.apache.crunch.io.CrunchOutputs
getNextAnonymousStageId() - Method in class org.apache.crunch.impl.dist.DistributedPipeline
getNumReducers() - Method in class org.apache.crunch.GroupingOptions
getNumShards(K) - Method in interface org.apache.crunch.lib.join.ShardedJoinStrategy.ShardingStrategy: Retrieve the number of shards over which the given key should be split.
getOffset() - Method in class org.apache.crunch.kafka.offset.hdfs.Offsets.PartitionOffset: Returns the offset
getOffsets(Configuration) - Static method in class org.apache.crunch.kafka.inputformat.KafkaInputFormat: Reads the configuration to determine which topics, partitions, and offsets should be used for reading data.
getOffsets() - Method in class org.apache.crunch.kafka.offset.hdfs.Offsets: The collection of offset information for specific topics and partitions.
getOnlyParent() - Method in class org.apache.crunch.impl.dist.collect.PCollectionImpl
getOutputCommitter(TaskAttemptContext) - Static method in class org.apache.crunch.io.CrunchOutputs
getOutputMapFn() - Method in class org.apache.crunch.types.avro.AvroType
getOutputMapFn() - Method in interface org.apache.crunch.types.PType
getOutputMapFn() - Method in class org.apache.crunch.types.writable.WritableType
getParallelDoOptions() - Method in class org.apache.crunch.impl.dist.collect.PCollectionImpl
getParallelism() - Method in class org.apache.crunch.CreateOptions
getParents() - Method in class org.apache.crunch.impl.dist.collect.BaseDoCollection
getParents() - Method in class org.apache.crunch.impl.dist.collect.BaseDoTable
getParents() - Method in class org.apache.crunch.impl.dist.collect.BaseGroupedTable
getParents() - Method in class org.apache.crunch.impl.dist.collect.BaseInputCollection
getParents() - Method in class org.apache.crunch.impl.dist.collect.BaseInputTable
getParents() - Method in class org.apache.crunch.impl.dist.collect.BaseUnionCollection
getParents() - Method in class org.apache.crunch.impl.dist.collect.BaseUnionTable
getParents() - Method in class org.apache.crunch.impl.dist.collect.EmptyPCollection
getParents() - Method in class org.apache.crunch.impl.dist.collect.EmptyPTable
getParents() - Method in class org.apache.crunch.impl.dist.collect.PCollectionImpl
getParents() - Method in class org.apache.crunch.impl.spark.collect.CreatedCollection
getParents() - Method in class org.apache.crunch.impl.spark.collect.CreatedTable
getPartition(Object) - Method in class org.apache.crunch.impl.spark.SparkPartitioner
getPartition() - Method in class org.apache.crunch.kafka.offset.hdfs.Offsets.PartitionOffset: Returns the partition
getPartition(Object, Object, int) - Method in class org.apache.crunch.lib.join.JoinUtils.AvroIndexedRecordPartitioner
getPartition(TupleWritable, Writable, int) - Method in class org.apache.crunch.lib.join.JoinUtils.TupleWritablePartitioner
getPartition(K, V, int) - Method in class org.apache.crunch.lib.sort.TotalOrderPartitioner
getPartitionerClass() - Method in class org.apache.crunch.GroupingOptions
getPartitionerClass(PTypeFamily) - Static method in class org.apache.crunch.lib.join.JoinUtils
getPartitionFile(Configuration) - Static method in class org.apache.crunch.lib.sort.TotalOrderPartitioner
getPath() - Method in interface org.apache.crunch.io.PathTarget
getPath(String) - Method in class org.apache.crunch.test.TemporaryPath: Get a Path below the temporary directory.
getPathSize(Configuration, Path) - Static method in class org.apache.crunch.io.SourceTargetHelper
getPathSize(FileSystem, Path) - Static method in class org.apache.crunch.io.SourceTargetHelper
getPathToCacheFile(Path, Configuration) - Static method in class org.apache.crunch.util.DistCache
getPersistedTimeStoragePath(Path, long) - Static method in class org.apache.crunch.kafka.offset.hdfs.HDFSOffsetWriter: Creates a Path for storing the offsets for a specified persistedTime.
getPipeline() - Method in class org.apache.crunch.impl.dist.collect.PCollectionImpl
getPipeline() - Method in interface org.apache.crunch.PCollection: Returns the Pipeline associated with this PCollection.
getPlanDotFile() - Method in class org.apache.crunch.impl.spark.SparkRuntime
getPlanDotFile() - Method in interface org.apache.crunch.PipelineExecution: Returns the .dot file that allows a client to graph the Crunch execution plan for this pipeline.
getPrepareHooks() - Method in class org.apache.crunch.impl.mr.MRPipeline
getPrimitiveType(Class<T>) - Static method in class org.apache.crunch.types.avro.Avros
getPrimitiveType(Class<T>) - Static method in class org.apache.crunch.types.writable.Writables
getProgress() - Method in class org.apache.crunch.kafka.inputformat.KafkaRecordReader
getPTableType() - Method in class org.apache.crunch.impl.dist.collect.BaseDoTable
getPTableType() - Method in class org.apache.crunch.impl.dist.collect.BaseInputTable
getPTableType() - Method in class org.apache.crunch.impl.dist.collect.BaseUnionTable
getPTableType() - Method in class org.apache.crunch.impl.dist.collect.EmptyPTable
getPTableType() - Method in class org.apache.crunch.impl.spark.collect.CreatedTable
getPTableType() - Method in interface org.apache.crunch.PTable: Returns the PTableType of this PTable.
getPType(PTypeFamily) - Method in interface org.apache.crunch.contrib.text.Extractor: Returns the PType associated with this data type for the given PTypeFamily.
getPType() - Method in class org.apache.crunch.impl.dist.collect.BaseDoCollection
getPType() - Method in class org.apache.crunch.impl.dist.collect.BaseDoTable
getPType() - Method in class org.apache.crunch.impl.dist.collect.BaseGroupedTable
getPType() - Method in class org.apache.crunch.impl.dist.collect.BaseInputCollection
getPType() - Method in class org.apache.crunch.impl.dist.collect.BaseInputTable
getPType() - Method in class org.apache.crunch.impl.dist.collect.BaseUnionCollection
getPType() - Method in class org.apache.crunch.impl.dist.collect.BaseUnionTable
getPType() - Method in class org.apache.crunch.impl.dist.collect.EmptyPCollection
getPType() - Method in class org.apache.crunch.impl.dist.collect.EmptyPTable
getPType() - Method in class org.apache.crunch.impl.spark.collect.CreatedCollection
getPType() - Method in class org.apache.crunch.impl.spark.collect.CreatedTable
getPType() - Method in interface org.apache.crunch.PCollection: Returns the PType of this PCollection.
getReader(Schema) - Method in class org.apache.crunch.types.avro.AvroMode: Creates a DatumReader based on the schema.
getReader(Schema) - Method in interface org.apache.crunch.types.avro.ReaderWriterFactory
getReader(Schema) - Method in class org.apache.crunch.types.avro.ReflectDataFactory
getRecommendedPartitions(PCollection<T>) - Static method in class org.apache.crunch.util.PartitionUtils
getRecommendedPartitions(PCollection<T>, Configuration) - Static method in class org.apache.crunch.util.PartitionUtils
getRecordType() - Method in class org.apache.crunch.types.avro.AvroType
getRecordWriter(TaskAttemptContext) - Method in class org.apache.crunch.types.avro.AvroOutputFormat
getRecordWriter(TaskAttemptContext) - Method in class org.apache.crunch.types.avro.AvroPathPerKeyOutputFormat
getRecordWriter(TaskAttemptContext) - Method in class org.apache.crunch.types.avro.AvroTextOutputFormat
getReduceOutputName(Configuration, Path, int) - Method in interface org.apache.crunch.io.FileNamingScheme: Get the output file name for a reduce task.
getReduceOutputName(Configuration, Path, int) - Method in class org.apache.crunch.io.SequentialFileNamingScheme
getReflectDataFactory(Configuration) - Static method in class org.apache.crunch.types.avro.Avros: Deprecated as of 0.9.0; use AvroMode.fromConfiguration(conf).
getResult() - Method in class org.apache.crunch.impl.spark.SparkRuntime
getResult() - Method in interface org.apache.crunch.PipelineExecution: Retrieve the result of a pipeline if it has been completed, otherwise null.
getRootFile() - Method in class org.apache.crunch.test.TemporaryPath: Get the root directory which will be deleted automatically.
getRootFileName() - Method in class org.apache.crunch.test.TemporaryPath: Get the root directory as an absolute file name.
getRootPath() - Method in class org.apache.crunch.test.TemporaryPath: Get the root directory as a Path.
getRuntimeContext() - Method in class org.apache.crunch.impl.spark.SparkRuntime
getSalary() - Method in class org.apache.crunch.test.Employee.Builder: Gets the value of the 'salary' field
getSalary() - Method in class org.apache.crunch.test.Employee: Gets the value of the 'salary' field.
getSchema() - Method in class org.apache.crunch.test.Employee
getSchema() - Method in class org.apache.crunch.test.Person
getSchema() - Method in class org.apache.crunch.types.avro.AvroType
getSecond() - Method in class org.apache.crunch.fn.CompositeMapFn
getSerializationClass() - Method in class org.apache.crunch.types.writable.WritableType
getSiblingnames() - Method in class org.apache.crunch.test.Person.Builder: Gets the value of the 'siblingnames' field
getSiblingnames() - Method in class org.apache.crunch.test.Person: Gets the value of the 'siblingnames' field.
getSize(Configuration) - Method in class org.apache.crunch.contrib.io.jdbc.DataBaseSource
getSize() - Method in class org.apache.crunch.impl.dist.collect.PCollectionImpl
getSize(Configuration) - Method in class org.apache.crunch.kafka.KafkaSource
getSize() - Method in interface org.apache.crunch.PCollection: Returns the size of the data represented by this PCollection in bytes.
getSize(Configuration) - Method in interface org.apache.crunch.Source: Returns the number of bytes in this Source.
getSortComparatorClass() - Method in class org.apache.crunch.GroupingOptions
getSource() - Method in class org.apache.crunch.impl.dist.collect.BaseInputCollection
getSource() - Method in class org.apache.crunch.impl.dist.collect.BaseInputTable
getSourceTargets() - Method in class org.apache.crunch.GroupingOptions
getSourceTargets() - Method in class org.apache.crunch.ParallelDoOptions: Deprecated.
getSourceTargets() - Method in interface org.apache.crunch.ReadableData
getSourceTargets() - Method in class org.apache.crunch.util.DelegatingReadableData
getSourceTargets() - Method in class org.apache.crunch.util.UnionReadableData
getSparkContext() - Method in class org.apache.crunch.impl.spark.SparkRuntime
getSpecificClassLoader() - Static method in class org.apache.crunch.types.avro.AvroMode: Get the configured ClassLoader to be used for loading Avro org.apache.avro.specific.SpecificRecord and reflection implementation classes.
getSplits(JobContext) - Method in class org.apache.crunch.kafka.inputformat.KafkaInputFormat
getStageId() - Method in class org.apache.crunch.PipelineResult.StageResult
getStageName() - Method in class org.apache.crunch.PipelineResult.StageResult
getStageResults() - Method in class org.apache.crunch.PipelineResult
getStartingOffset() - Method in class org.apache.crunch.kafka.inputformat.KafkaInputSplit: Returns the starting offset for the split
getStartTimeMsec() - Method in class org.apache.crunch.PipelineResult.StageResult
getStats() - Method in class org.apache.crunch.contrib.text.AbstractCompositeExtractor
getStats() - Method in class org.apache.crunch.contrib.text.AbstractSimpleExtractor
getStats() - Method in interface org.apache.crunch.contrib.text.Extractor: Return statistics about how many errors this Extractor instance encountered while parsing input data.
getStatus() - Method in class org.apache.crunch.impl.spark.SparkRuntime
getStatus() - Method in interface org.apache.crunch.PipelineExecution
getStorageLevel(PCollection<?>) - Method in class org.apache.crunch.impl.spark.SparkRuntime
getStoredOffsetPersistenceTimes() - Method in class org.apache.crunch.kafka.offset.AbstractOffsetReader
getStoredOffsetPersistenceTimes() - Method in class org.apache.crunch.kafka.offset.hdfs.HDFSOffsetReader
getStoredOffsetPersistenceTimes() - Method in interface org.apache.crunch.kafka.offset.OffsetReader: Returns the list of available persistence times at which offsets have been written to the underlying storage mechanism.
getStructFieldData(Object, StructField) - Method in class org.apache.crunch.types.orc.TupleObjectInspector
getStructFieldRef(String) - Method in class org.apache.crunch.types.orc.TupleObjectInspector
getStructFieldsDataAsList(Object) - Method in class org.apache.crunch.types.orc.TupleObjectInspector
getSubTypes() - Method in class org.apache.crunch.types.avro.AvroType
getSubTypes() - Method in class org.apache.crunch.types.PGroupedTableType
getSubTypes() - Method in interface org.apache.crunch.types.PType: Returns the sub-types that make up this PType if it is a composite instance, such as a tuple.
getSubTypes() - Method in class org.apache.crunch.types.writable.WritableType
getTableType() - Method in class org.apache.crunch.kafka.KafkaSource
getTableType() - Method in interface org.apache.crunch.TableSource
getTableType() - Method in class org.apache.crunch.types.PGroupedTableType
getTargetDependencies() - Method in class org.apache.crunch.impl.dist.collect.BaseGroupedTable
getTargetDependencies() - Method in class org.apache.crunch.impl.dist.collect.PCollectionImpl
getTargets() - Method in class org.apache.crunch.ParallelDoOptions
getTestContext(Configuration) - Static method in class org.apache.crunch.test.CrunchTestSupport: The method creates a TaskInputOutputContext which can be used in unit tests.
getTopic() - Method in class org.apache.crunch.kafka.offset.hdfs.Offsets.PartitionOffset: Returns the topic
getTopicPartition() - Method in class org.apache.crunch.kafka.inputformat.KafkaInputSplit: Returns the topic and partition for the split
getTupleFactory(Class<T>) - Static method in class org.apache.crunch.types.TupleFactory: Get the TupleFactory for a given Tuple implementation.
getType() - Method in class org.apache.crunch.kafka.KafkaSource
getType() - Method in interface org.apache.crunch.Source: Returns the PType for this source.
getTypeClass() - Method in class org.apache.crunch.types.avro.AvroType
getTypeClass() - Method in interface org.apache.crunch.types.PType: Returns the Java type represented by this PType.
getTypeClass() - Method in class org.apache.crunch.types.writable.WritableType
getTypeFamily() - Method in class org.apache.crunch.impl.dist.collect.PCollectionImpl
getTypeFamily() - Method in interface org.apache.crunch.PCollection: Returns the PTypeFamily of this PCollection.
getTypeInfo(Class<?>) - Static method in class org.apache.crunch.types.orc.OrcUtils: Generate TypeInfo for a given java class based on reflection
getTypeName() - Method in class org.apache.crunch.types.orc.TupleObjectInspector
getValue() - Method in interface org.apache.crunch.PObject: Gets the value associated with this PObject.
getValue() - Method in class org.apache.crunch.types.writable.UnionWritable
getValue() - Method in class org.apache.crunch.Union: Returns the underlying object value of the record.
getValue() - Method in class org.apache.hadoop.mapred.SparkCounter
getValueClass() - Method in interface org.apache.crunch.types.Converter
getValues() - Method in class org.apache.crunch.TupleN
getValueType() - Method in class org.apache.crunch.impl.dist.collect.PTableBase
getValueType() - Method in interface org.apache.crunch.PTable: Returns the PType of the value.
getValueType() - Method in interface org.apache.crunch.types.PTableType: Returns the value type for the table.
getWriter(Schema) - Method in class org.apache.crunch.types.avro.AvroMode: Creates a DatumWriter based on the schema.
getWriter(Schema) - Method in interface org.apache.crunch.types.avro.ReaderWriterFactory
getWriter(Schema) - Method in class org.apache.crunch.types.avro.ReflectDataFactory
globalToplist(PCollection<X>) - Static method in class org.apache.crunch.lib.TopList: Create a list of unique items in the input collection with their count, sorted descending by their frequency.
groupByKey() - Method in class org.apache.crunch.impl.dist.collect.PTableBase
groupByKey(int) - Method in class org.apache.crunch.impl.dist.collect.PTableBase
groupByKey(GroupingOptions) - Method in class org.apache.crunch.impl.dist.collect.PTableBase
groupByKey() - Method in interface org.apache.crunch.lambda.LTable: Group this table by key to yield an LGroupedTable
groupByKey(int) - Method in interface org.apache.crunch.lambda.LTable: Group this table by key to yield an LGroupedTable
groupByKey(GroupingOptions) - Method in interface org.apache.crunch.lambda.LTable: Group this table by key to yield an LGroupedTable
groupByKey() - Method in interface org.apache.crunch.PTable: Performs a grouping operation on the keys of this table.
groupByKey(int) - Method in interface org.apache.crunch.PTable: Performs a grouping operation on the keys of this table, using the given number of partitions.
groupByKey(GroupingOptions) - Method in interface org.apache.crunch.PTable: Performs a grouping operation on the keys of this table, using the additional GroupingOptions to control how the grouping is executed.
groupedWeightedReservoirSample(PTable<Integer, Pair<T, N>>, int[]) - Static method in class org.apache.crunch.lib.Sample: The most general-purpose of the weighted reservoir sampling patterns, which allows us to choose a random sample of elements for each of N input groups.
groupedWeightedReservoirSample(PTable<Integer, Pair<T, N>>, int[], Long) - Static method in class org.apache.crunch.lib.Sample: Same as the other groupedWeightedReservoirSample method, but includes a seed for testing purposes.
groupingComparatorClass(Class<? extends RawComparator>) - Method in class org.apache.crunch.GroupingOptions.Builder
GroupingOptions - Class in org.apache.crunch: Options that can be passed to a groupByKey operation in order to exercise finer control over how the partitioning, grouping, and sorting of keys is performed.
GroupingOptions.Builder - Class in org.apache.crunch: Builder class for creating GroupingOptions instances.
GuavaUtils - Class in org.apache.crunch.impl.spark
GuavaUtils() - Constructor for class org.apache.crunch.impl.spark.GuavaUtils
gzip(T) - Static method in class org.apache.crunch.io.Compress: Configure the given output target to be compressed using Gzip.

H

handleExisting(Target.WriteMode, long, Configuration) - Method in interface org.apache.crunch.Target: Apply the given WriteMode to this Target instance.
handleOutputs(Configuration, Path, int) - Method in interface org.apache.crunch.io.PathTarget: Handles moving the output data for this target from a temporary location on the filesystem to its target path at the end of a MapReduce job.
has(int) - Method in class org.apache.crunch.types.writable.TupleWritable: Return true if tuple has an element at the position provided.
hasAge() - Method in class org.apache.crunch.test.Person.Builder: Checks whether the 'age' field has been set
hasDepartment() - Method in class org.apache.crunch.test.Employee.Builder: Checks whether the 'department' field has been set
hashCode() - Method in class org.apache.crunch.impl.dist.collect.BaseInputCollection
hashCode() - Method in class org.apache.crunch.impl.dist.collect.BaseInputTable
hashCode() - Method in class org.apache.crunch.impl.spark.ByteArray
hashCode() - Method in class org.apache.crunch.impl.spark.IntByteArray
hashCode() - Method in class org.apache.crunch.io.FormatBundle
hashCode() - Method in class org.apache.crunch.kafka.offset.hdfs.Offsets
hashCode() - Method in class org.apache.crunch.kafka.offset.hdfs.Offsets.PartitionOffset
hashCode() - Method in class org.apache.crunch.lib.Quantiles.Result
hashCode() - Method in class org.apache.crunch.Pair
hashCode() - Method in class org.apache.crunch.Tuple3
hashCode() - Method in class org.apache.crunch.Tuple4
hashCode() - Method in class org.apache.crunch.TupleN
hashCode() - Method in class org.apache.crunch.types.avro.AvroMode
hashCode() - Method in class org.apache.crunch.types.avro.AvroType
hashCode() - Method in class org.apache.crunch.types.writable.TupleWritable
hashCode() - Method in class org.apache.crunch.types.writable.WritableType
hashCode() - Method in class org.apache.crunch.Union
HashUtil - Class in org.apache.crunch.util: Utility methods for working with hash codes.
HashUtil() - Constructor for class org.apache.crunch.util.HashUtil
hasName() - Method in class org.apache.crunch.test.Employee.Builder: Checks whether the 'name' field has been set
hasName() - Method in class org.apache.crunch.test.Person.Builder: Checks whether the 'name' field has been set
hasNext() - Method in class org.apache.crunch.contrib.text.Tokenizer: Returns true if the underlying Scanner has any tokens remaining.
hasNext() - Method in class org.apache.crunch.util.DoFnIterator
hasReflect() - Method in class org.apache.crunch.types.avro.AvroType: Determine if the wrapped type is a reflection-based avro type or wraps one.
hasSalary() - Method in class org.apache.crunch.test.Employee.Builder: Checks whether the 'salary' field has been set
hasSiblingnames() - Method in class org.apache.crunch.test.Person.Builder: Checks whether the 'siblingnames' field has been set
hasSpecific() - Method in class org.apache.crunch.types.avro.AvroType: Determine if the wrapped type is a specific data avro type or wraps one.
HDFSOffsetReader - Class in org.apache.crunch.kafka.offset.hdfs: Reader implementation that reads offset information from HDFS.
HDFSOffsetReader(Configuration, Path) - Constructor for class org.apache.crunch.kafka.offset.hdfs.HDFSOffsetReader: Creates a reader instance for interacting with the storage specified by the config and with the base storage path of baseStoragePath.
HDFSOffsetWriter - Class in org.apache.crunch.kafka.offset.hdfs: Offset writer implementation that stores the offsets in HDFS.
HDFSOffsetWriter(Configuration, Path) - Constructor for class org.apache.crunch.kafka.offset.hdfs.HDFSOffsetWriter: Creates a writer instance for interacting with the storage specified by the config and with the base storage path of baseStoragePath.

I

id - Variable in class org.apache.crunch.contrib.io.jdbc.IdentifiableName
IdentifiableName - Class in org.apache.crunch.contrib.io.jdbc
IdentifiableName() - Constructor for class org.apache.crunch.contrib.io.jdbc.IdentifiableName
IdentityFn<T> - Class in org.apache.crunch.fn
immutableType(Class<T>, Class<W>, MapFn<W, T>, MapFn<T, W>, PType...) - Static method in class org.apache.crunch.types.writable.WritableType: Factory method for a new WritableType instance whose type class is immutable.
increment(Enum<?>) - Method in interface org.apache.crunch.lambda.LCollection: Increment a counter for every element in the collection
increment(String, String) - Method in interface org.apache.crunch.lambda.LCollection: Increment a counter for every element in the collection
increment(String, String) - Method in interface org.apache.crunch.lambda.LDoFnContext: Increment a counter by 1
increment(String, String, long) - Method in interface org.apache.crunch.lambda.LDoFnContext: Increment a counter by value
increment(Enum<?>) - Method in interface org.apache.crunch.lambda.LDoFnContext: Increment a counter by 1
increment(Enum<?>, long) - Method in interface org.apache.crunch.lambda.LDoFnContext: Increment a counter by value
increment(Enum<?>) - Method in interface org.apache.crunch.lambda.LTable: Increment a counter for every element in the collection
increment(String, String) - Method in interface org.apache.crunch.lambda.LTable: Increment a counter for every element in the collection
increment(long) - Method in class org.apache.hadoop.mapred.SparkCounter
incrementIf(Enum<?>, SPredicate<S>) - Method in interface org.apache.crunch.lambda.LCollection: Increment a counter for every element satisfying the conditional predicate supplied.
incrementIf(String, String, SPredicate<S>) - Method in interface org.apache.crunch.lambda.LCollection: Increment a counter for every element satisfying the conditional predicate supplied.
incrementIf(Enum<?>, SPredicate<Pair<K, V>>) - Method in interface org.apache.crunch.lambda.LTable: Increment a counter for every element satisfying the conditional predicate supplied.
incrementIf(String, String, SPredicate<Pair<K, V>>) - Method in interface org.apache.crunch.lambda.LTable: Increment a counter for every element satisfying the conditional predicate supplied.
initialize(Configuration) - Method in interface org.apache.crunch.Aggregator: Perform any setup of this instance that is required prior to processing inputs.
initialize() - Method in class org.apache.crunch.contrib.bloomfilter.BloomFilterFn
initialize() - Method in class org.apache.crunch.contrib.text.AbstractCompositeExtractor
initialize() - Method in class org.apache.crunch.contrib.text.AbstractSimpleExtractor
initialize() - Method in interface org.apache.crunch.contrib.text.Extractor: Perform any initialization required by this Extractor during the start of a map or reduce task.
initialize() - Method in class org.apache.crunch.DoFn: Initialize this DoFn.
initialize(Configuration) - Method in class org.apache.crunch.fn.Aggregators.SimpleAggregator
initialize() - Method in class org.apache.crunch.fn.CompositeMapFn
initialize() - Method in class org.apache.crunch.fn.ExtractKeyFn
initialize() - Method in class org.apache.crunch.fn.PairMapFn
initialize(DoFn<?, ?>, Integer) - Method in class org.apache.crunch.impl.spark.SparkRuntimeContext
initialize(InputSplit, TaskAttemptContext) - Method in class org.apache.crunch.kafka.inputformat.KafkaRecordReader
initialize() - Method in class org.apache.crunch.lib.Aggregate.TopKCombineFn
initialize() - Method in class org.apache.crunch.lib.Aggregate.TopKFn
initialize() - Method in class org.apache.crunch.lib.join.FullOuterJoinFn: Initialize this DoFn.
initialize() - Method in class org.apache.crunch.lib.join.InnerJoinFn
initialize() - Method in class org.apache.crunch.lib.join.JoinFn
initialize() - Method in class org.apache.crunch.lib.join.LeftOuterJoinFn: Initialize this DoFn.
initialize() - Method in class org.apache.crunch.lib.join.RightOuterJoinFn: Initialize this DoFn.
initialize() - Method in class org.apache.crunch.lib.sort.SortFns.AvroGenericFn
initialize(Configuration) - Method in class org.apache.crunch.types.avro.AvroDerivedValueDeepCopier
initialize(Configuration) - Method in class org.apache.crunch.types.avro.AvroType
initialize(Configuration) - Method in class org.apache.crunch.types.CollectionDeepCopier
initialize(Configuration) - Method in interface org.apache.crunch.types.DeepCopier: Initialize the deep copier with a job-specific configuration
initialize(Configuration) - Method in class org.apache.crunch.types.MapDeepCopier
initialize(Configuration) - Method in class org.apache.crunch.types.NoOpDeepCopier
initialize() - Method in class org.apache.crunch.types.PGroupedTableType.PairIterableMapFn
initialize(Configuration) - Method in interface org.apache.crunch.types.PType: Initialize this PType for use within a DoFn.
initialize(Configuration) - Method in class org.apache.crunch.types.TupleDeepCopier
initialize() - Method in class org.apache.crunch.types.TupleFactory
initialize(Configuration) - Method in class org.apache.crunch.types.UnionDeepCopier
initialize(Configuration) - Method in class org.apache.crunch.types.writable.WritableDeepCopier
initialize(Configuration) - Method in class org.apache.crunch.types.writable.WritableType
inMemory(PTable<K, V>, double, double...) - Static method in class org.apache.crunch.lib.Quantiles: Calculate a set of quantiles for each key in a numerically-valued table.
innerJoin(PTable<K, U>, PTable<K, V>) - Static method in class org.apache.crunch.lib.Join: Performs an inner join on the specified PTables.
InnerJoinFn<K,U,V> - Class in org.apache.crunch.lib.join: Used to perform the last step of an inner join.
InnerJoinFn(PType<K>, PType) - Constructor for class org.apache.crunch.lib.join.InnerJoinFn
InputCollection<S> - Class in org.apache.crunch.impl.spark.collect
inputConf(String, String) - Method in class org.apache.crunch.kafka.KafkaSource
inputConf(String, String) - Method in interface org.apache.crunch.Source: Adds the given key-value pair to the Configuration instance that is used to read this Source<T>.
InputConverterFunction<K,V,S> - Class in org.apache.crunch.impl.spark.fn
InputConverterFunction(Converter<K, V, S, ?>) - Constructor for class org.apache.crunch.impl.spark.fn.InputConverterFunction
InputTable<K,V> - Class in org.apache.crunch.impl.spark.collect
InputTable(TableSource<K, V>, String, DistributedPipeline, ParallelDoOptions) - Constructor for class org.apache.crunch.impl.spark.collect.InputTable
IntByteArray - Class in org.apache.crunch.impl.spark
IntByteArray(int, ByteArray) - Constructor for class org.apache.crunch.impl.spark.IntByteArray
intersection(PCollection<T>, PCollection<T>) - Static method in class org.apache.crunch.lib.Set: Compute the intersection of two sets of elements.
ints() - Static method in class org.apache.crunch.types.avro.Avros
ints() - Method in class org.apache.crunch.types.avro.AvroTypeFamily
ints() - Method in interface org.apache.crunch.types.PTypeFamily
ints() - Static method in class org.apache.crunch.types.writable.Writables
ints() - Method in class org.apache.crunch.types.writable.WritableTypeFamily
isBreakpoint() - Method in class org.apache.crunch.impl.dist.collect.PCollectionImpl
isCompatibleWith(GroupingOptions) - Method in class org.apache.crunch.GroupingOptions
isGeneric() - Method in class org.apache.crunch.types.avro.AvroType: Determine if the wrapped type is a generic data avro type.
isValid(JavaRDDLike<?, ?>) - Method in class org.apache.crunch.impl.spark.SparkRuntime
iterator() - Method in class org.apache.crunch.impl.SingleUseIterable
iterator() - Method in class org.apache.crunch.impl.spark.fn.CrunchIterable
iterator() - Method in class org.apache.crunch.io.CompositePathIterable
iterator() - Method in class org.apache.crunch.util.Tuples.PairIterable
iterator() - Method in class org.apache.crunch.util.Tuples.QuadIterable
iterator() - Method in class org.apache.crunch.util.Tuples.TripIterable
iterator() - Method in class org.apache.crunch.util.Tuples.TupleNIterable

J

join(PTable<K, U>) - Method in class org.apache.crunch.impl.dist.collect.PTableBase
join(LTable<K, U>, JoinType, JoinStrategy<K, V, U>) - Method in interface org.apache.crunch.lambda.LTable: Join this table to another LTable which has the same key type using the provided JoinType and JoinStrategy
join(LTable<K, U>, JoinType) - Method in interface org.apache.crunch.lambda.LTable: Join this table to another LTable which has the same key type using the provided JoinType and the DefaultJoinStrategy (reduce-side join).
join(LTable<K, U>) - Method in interface org.apache.crunch.lambda.LTable: Inner join this table to another LTable which has the same key type using a reduce-side join
Join - Class in org.apache.crunch.lib: Utilities for joining multiple PTable instances based on a common key.
Join() - Constructor for class org.apache.crunch.lib.Join
join(PTable<K, U>, PTable<K, V>, JoinType) - Method in class org.apache.crunch.lib.join.BloomFilterJoinStrategy
join(PTable<K, U>, PTable<K, V>, JoinType) - Method in class org.apache.crunch.lib.join.DefaultJoinStrategy
join(PTable<K, U>, PTable<K, V>, JoinFn<K, U, V>) - Method in class org.apache.crunch.lib.join.DefaultJoinStrategy: Perform a default join on the given PTable instances using a user-specified JoinFn.
join(K, int, Iterable<Pair<U, V>>, Emitter<Pair<K, Pair<U, V>>>) - Method in class org.apache.crunch.lib.join.FullOuterJoinFn: Performs the actual joining.
join(K, int, Iterable<Pair<U, V>>, Emitter<Pair<K, Pair<U, V>>>) - Method in class org.apache.crunch.lib.join.InnerJoinFn
join(PTable<K, U>, PTable<K, V>) - Static method in class org.apache.crunch.lib.Join: Performs an inner join on the specified PTables.
join(K, int, Iterable<Pair<U, V>>, Emitter<Pair<K, Pair<U, V>>>) - Method in class org.apache.crunch.lib.join.JoinFn: Performs the actual joining.
join(PTable<K, U>, PTable<K, V>, JoinType) - Method in interface org.apache.crunch.lib.join.JoinStrategy: Join two tables with the given join type.
join(K, int, Iterable<Pair<U, V>>, Emitter<Pair<K, Pair<U, V>>>) - Method in class org.apache.crunch.lib.join.LeftOuterJoinFn: Performs the actual joining.
join(PTable<K, U>, PTable<K, V>, JoinType) - Method in class org.apache.crunch.lib.join.MapsideJoinStrategy
join(K, int, Iterable<Pair<U, V>>, Emitter<Pair<K, Pair<U, V>>>) - Method in class org.apache.crunch.lib.join.RightOuterJoinFn: Performs the actual joining.
join(PTable<K, U>, PTable<K, V>, JoinType) - Method in class org.apache.crunch.lib.join.ShardedJoinStrategy
join(PTable<K, U>) - Method in interface org.apache.crunch.PTable: Perform an inner join on this table and the one passed in as an argument on their common keys.
JoinFn<K,U,V> - Class in org.apache.crunch.lib.join: Represents a DoFn for performing joins.
JoinFn(PType<K>, PType<U>) - Constructor for class org.apache.crunch.lib.join.JoinFn: Instantiate with the PType of the value of the left side of the join (used for creating deep copies of values).
JoinStrategy<K,U,V> - Interface in org.apache.crunch.lib.join: Defines a strategy for joining two PTables together on a common key.
JoinType - Enum in org.apache.crunch.lib.join: Specifies how a join should be performed in terms of requiring matching keys on both sides of the join.
JoinUtils - Class in org.apache.crunch.lib.join: Utilities that are useful in joining multiple data sets via a MapReduce.
JoinUtils() - Constructor for class org.apache.crunch.lib.join.JoinUtils
JoinUtils.AvroIndexedRecordPartitioner - Class in org.apache.crunch.lib.join
JoinUtils.AvroPairGroupingComparator<T> - Class in org.apache.crunch.lib.join
JoinUtils.TupleWritableComparator - Class in org.apache.crunch.lib.join
JoinUtils.TupleWritablePartitioner - Class in org.apache.crunch.lib.join
jsons(Class<T>) - Static method in class org.apache.crunch.types.avro.Avros
jsons(Class<T>) - Static method in class org.apache.crunch.types.writable.Writables
jsonString(Class<T>, PTypeFamily) - Static method in class org.apache.crunch.types.PTypes: Constructs a PType for reading a Java type from a JSON string using Jackson's ObjectMapper.
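
The join entries above differ mainly in whether a key must be present on both sides. The following is a minimal plain-Java sketch of that distinction for the inner and left outer cases, assuming at most one value per key on each side. It uses java.util maps instead of PTable instances, and the class and method names (JoinSketch, innerJoin, leftJoin) are hypothetical illustrations, not part of the Crunch API.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration only; none of these names belong to the Crunch API.
class JoinSketch {

    // Inner join: a key is emitted only if it is present on both sides.
    static List<String> innerJoin(Map<String, Integer> left, Map<String, Integer> right) {
        List<String> out = new ArrayList<>();
        // Copying into a TreeMap gives a deterministic, sorted key order.
        for (Map.Entry<String, Integer> e : new TreeMap<>(left).entrySet()) {
            if (right.containsKey(e.getKey())) {
                out.add(e.getKey() + "=(" + e.getValue() + "," + right.get(e.getKey()) + ")");
            }
        }
        return out;
    }

    // Left outer join: every left key is emitted; a missing right value appears as null.
    static List<String> leftJoin(Map<String, Integer> left, Map<String, Integer> right) {
        List<String> out = new ArrayList<>();
        for (Map.Entry<String, Integer> e : new TreeMap<>(left).entrySet()) {
            out.add(e.getKey() + "=(" + e.getValue() + "," + right.get(e.getKey()) + ")");
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, Integer> left = new HashMap<>();
        left.put("a", 1);
        left.put("b", 2);
        Map<String, Integer> right = new HashMap<>();
        right.put("b", 20);
        right.put("c", 30);
        System.out.println(innerJoin(left, right)); // prints [b=(2,20)]
        System.out.println(leftJoin(left, right));  // prints [a=(1,null), b=(2,20)]
    }
}
```

Key "c" appears only on the right side, so neither result contains it; a right outer or full outer join would keep it.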

K

KAFKA_EMPTY_RETRY_ATTEMPTS_KEY - Static variable in class org.apache.crunch.kafka.KafkaUtils: Configuration property for the number of retry attempts that will be made to Kafka in the event of getting empty responses.
KAFKA_RETRY_ATTEMPTS_DEFAULT - Static variable in class org.apache.crunch.kafka.KafkaUtils: Default number of retry attempts.
KAFKA_RETRY_ATTEMPTS_DEFAULT_STRING - Static variable in class org.apache.crunch.kafka.KafkaUtils
KAFKA_RETRY_ATTEMPTS_KEY - Static variable in class org.apache.crunch.kafka.KafkaUtils: Configuration property for the number of retry attempts that will be made to Kafka.
KAFKA_RETRY_EMPTY_ATTEMPTS_DEFAULT - Static variable in class org.apache.crunch.kafka.KafkaUtils: Default number of empty retry attempts.
KAFKA_RETRY_EMPTY_ATTEMPTS_DEFAULT_STRING - Static variable in class org.apache.crunch.kafka.KafkaUtils
KafkaInputFormat - Class in org.apache.crunch.kafka.inputformat: Basic input format for reading data from Kafka.
KafkaInputFormat() - Constructor for class org.apache.crunch.kafka.inputformat.KafkaInputFormat
KafkaInputSplit - Class in org.apache.crunch.kafka.inputformat: InputSplit that represents retrieving data from a single TopicPartition between the specified start and end offsets.
KafkaInputSplit() - Constructor for class org.apache.crunch.kafka.inputformat.KafkaInputSplit: Nullary Constructor for creating the instance inside the Mapper instance.
KafkaInputSplit(String, int, long, long) - Constructor for class org.apache.crunch.kafka.inputformat.KafkaInputSplit: Constructs an input split for the provided topic and partition restricting data to be between the startingOffset and endingOffset
KafkaRecordReader<K,V> - Class in org.apache.crunch.kafka.inputformat: A RecordReader for pulling data from Kafka.
KafkaRecordReader() - Constructor for class org.apache.crunch.kafka.inputformat.KafkaRecordReader
KafkaSource - Class in org.apache.crunch.kafka: A Crunch Source that will retrieve events from Kafka given start and end offsets.
KafkaSource(Properties, Map<TopicPartition, Pair<Long, Long>>) - Constructor for class org.apache.crunch.kafka.KafkaSource: Constructs a Kafka source that will read data from the Kafka cluster identified by the kafkaConnectionProperties and from the specific topics and partitions identified in the offsets
KafkaSource.BytesDeserializer - Class in org.apache.crunch.kafka: Basic Deserializer which simply wraps the payload as a BytesWritable.
KafkaUtils - Class in org.apache.crunch.kafka: Simple utilities for retrieving offset and Kafka information to assist in setting up and configuring a KafkaSource instance.
KafkaUtils() - Constructor for class org.apache.crunch.kafka.KafkaUtils
keep(Integer...) - Method in class org.apache.crunch.contrib.text.TokenizerFactory.Builder: Keep only the specified fields found by the input scanner, counting from zero.
keyClass - Variable in class org.apache.crunch.io.CrunchOutputs.OutputConfig
KeyExtraction(PType<V>, Sort.ColumnOrder[]) - Constructor for class org.apache.crunch.lib.sort.SortFns.KeyExtraction
keys() - Method in class org.apache.crunch.impl.dist.collect.PTableBase
keys() - Method in interface org.apache.crunch.lambda.LTable: Get an LCollection containing just the keys from this table
keys(PTable<K, V>) - Static method in class org.apache.crunch.lib.PTables: Extract the keys from the given PTable<K, V> as a PCollection<K>.
keys() - Method in interface org.apache.crunch.PTable: Returns a PCollection made up of the keys in this PTable.
keyType() - Method in interface org.apache.crunch.lambda.LGroupedTable: Get a PType which can be used to serialize the key part of this grouped table
keyType() - Method in interface org.apache.crunch.lambda.LTable: Get a PType which can be used to serialize the key part of this table
keyValueTableOf(PType<K>, PType<V>) - Static method in class org.apache.crunch.types.avro.Avros: A table type with an Avro type as key and value.
kill() - Method in class org.apache.crunch.impl.spark.SparkRuntime
kill() - Method in interface org.apache.crunch.PipelineExecution: Kills the pipeline if it is running, no-op otherwise.

L

LAggregator<V,A> - Class in org.apache.crunch.lambda: Crunch Aggregator expressed as a composition of functional interface implementations
LAggregator(SSupplier<A>, SBiFunction<A, V, A>, SFunction<A, Iterable<V>>) - Constructor for class org.apache.crunch.lambda.LAggregator
Lambda - Class in org.apache.crunch.lambda: Entry point for the crunch-lambda API.
Lambda() - Constructor for class org.apache.crunch.lambda.Lambda
LAST_N(int) - Static method in class org.apache.crunch.fn.Aggregators: Return the last n values (or fewer if there are fewer values than n).
LCollection<S> - Interface in org.apache.crunch.lambda: Java 8 friendly version of the PCollection interface, allowing distributed operations to be expressed in terms of lambda expressions and method references, instead of creating a new class implementation for each operation.
LCollectionFactory - Interface in org.apache.crunch.lambda: Factory for creating LCollection, LTable and LGroupedTable objects from their corresponding PCollection, PTable and PGroupedTable types.
LDoFn<S,T> - Interface in org.apache.crunch.lambda: A Java lambdas friendly version of the DoFn class.
LDoFnContext<S,T> - Interface in org.apache.crunch.lambda: Context object for implementing distributed operations in terms of Lambda expressions.
leftJoin(PTable<K, U>, PTable<K, V>) - Static method in class org.apache.crunch.lib.Join: Performs a left outer join on the specified PTables.
LeftOuterJoinFn<K,U,V> - Class in org.apache.crunch.lib.join: Used to perform the last step of a left outer join.
LeftOuterJoinFn(PType<K>, PType<U>) - Constructor for class org.apache.crunch.lib.join.LeftOuterJoinFn
length() - Method in class org.apache.crunch.impl.dist.collect.PCollectionImpl
length(PCollection<S>) - Static method in class org.apache.crunch.lib.Aggregate: Returns the number of elements in the provided PCollection.
length() - Method in interface org.apache.crunch.PCollection: Returns the number of elements represented by this PCollection.
LGroupedTable<K,V> - Interface in org.apache.crunch.lambda: Java 8 friendly version of the PGroupedTable interface, allowing distributed operations to be expressed in terms of lambda expressions and method references, instead of creating a new class implementation for each operation.
lineParser(String, Class<M>) - Static method in class org.apache.crunch.types.Protos
locale(Locale) - Method in class org.apache.crunch.contrib.text.TokenizerFactory.Builder: Sets the Locale to use with the TokenizerFactory returned by this Builder instance.
longs() - Static method in class org.apache.crunch.types.avro.Avros
longs() - Method in class org.apache.crunch.types.avro.AvroTypeFamily
longs() - Method in interface org.apache.crunch.types.PTypeFamily
longs() - Static method in class org.apache.crunch.types.writable.Writables
longs() - Method in class org.apache.crunch.types.writable.WritableTypeFamily
LTable<K,V> - Interface in org.apache.crunch.lambda: Java 8 friendly version of the PTable interface, allowing distributed operations to be expressed in terms of lambda expressions and method references, instead of creating a new class implementation for each operation.

M

main(String[]) - Static method in class org.apache.crunch.examples.AverageBytesByIP
main(String[]) - Static method in class org.apache.crunch.examples.SecondarySortExample
main(String[]) - Static method in class org.apache.crunch.examples.SortExample
main(String[]) - Static method in class org.apache.crunch.examples.TotalBytesByIP
main(String[]) - Static method in class org.apache.crunch.examples.TotalWordCount
main(String[]) - Static method in class org.apache.crunch.examples.WordAggregationHBase
main(String[]) - Static method in class org.apache.crunch.examples.WordCount
makeTuple(Object...) - Method in class org.apache.crunch.types.TupleFactory
map(R) - Method in class org.apache.crunch.fn.CompositeMapFn
map(V) - Method in class org.apache.crunch.fn.ExtractKeyFn
map(T) - Method in class org.apache.crunch.fn.IdentityFn
map(Pair<K, V>) - Method in class org.apache.crunch.fn.PairMapFn
map(T) - Method in class org.apache.crunch.fn.SDoubleFunction
map(T) - Method in class org.apache.crunch.fn.SFunction
map(Pair<K, V>) - Method in class org.apache.crunch.fn.SFunction2
map(T) - Method in class org.apache.crunch.fn.SPairFunction
map(Pair<V1, V2>) - Method in class org.apache.crunch.fn.SwapFn
map(SFunction<S, T>, PType<T>) - Method in interface org.apache.crunch.lambda.LCollection: Map the elements of this collection 1-1 through the supplied function.
map(SFunction<S, Pair<K, V>>, PTableType<K, V>) - Method in interface org.apache.crunch.lambda.LCollection: Map the elements of this collection 1-1 through the supplied function to yield an LTable
map(PTable<K1, V1>, Class<? extends Mapper<K1, V1, K2, V2>>, Class<K2>, Class<V2>) - Static method in class org.apache.crunch.lib.Mapred
map(PTable<K1, V1>, Class<? extends Mapper<K1, V1, K2, V2>>, Class<K2>, Class<V2>) - Static method in class org.apache.crunch.lib.Mapreduce
map(V) - Method in class org.apache.crunch.lib.sort.SortFns.AvroGenericFn
map(V) - Method in class org.apache.crunch.lib.sort.SortFns.SingleKeyFn
map(V) - Method in class org.apache.crunch.lib.sort.SortFns.TupleKeyFn
map(S) - Method in class org.apache.crunch.MapFn: Maps the given input into an instance of the output type.
map(Pair<Object, Iterable<Object>>) - Method in class org.apache.crunch.types.PGroupedTableType.PairIterableMapFn
MapDeepCopier<T> - Class in org.apache.crunch.types
MapDeepCopier(PType<T>) - Constructor for class org.apache.crunch.types.MapDeepCopier
MapFn<S,T> - Class in org.apache.crunch: A DoFn for the common case of emitting exactly one value for each input record.
MapFn() - Constructor for class org.apache.crunch.MapFn
MapFunction - Class in org.apache.crunch.impl.spark.fn
MapFunction(MapFn, SparkRuntimeContext) - Constructor for class org.apache.crunch.impl.spark.fn.MapFunction
mapKeys(MapFn<K, K2>, PType<K2>) - Method in class org.apache.crunch.impl.dist.collect.PTableBase
mapKeys(String, MapFn<K, K2>, PType<K2>) - Method in class org.apache.crunch.impl.dist.collect.PTableBase
mapKeys(SFunction<K, T>, PType<T>) - Method in interface org.apache.crunch.lambda.LTable: Transform the keys of this table using the given function
mapKeys(PTable<K1, V>, MapFn<K1, K2>, PType<K2>) - Static method in class org.apache.crunch.lib.PTables: Maps a PTable<K1, V> to a PTable<K2, V> using the given MapFn<K1, K2> on the keys of the PTable.
mapKeys(String, PTable<K1, V>, MapFn<K1, K2>, PType<K2>) - Static method in class org.apache.crunch.lib.PTables: Maps a PTable<K1, V> to a PTable<K2, V> using the given MapFn<K1, K2> on the keys of the PTable.
mapKeys(MapFn<K, K2>, PType<K2>) - Method in interface org.apache.crunch.PTable: Returns a PTable that has the same values as this instance, but uses the given function to map the keys.
mapKeys(String, MapFn<K, K2>, PType<K2>) - Method in interface org.apache.crunch.PTable: Returns a PTable that has the same values as this instance, but uses the given function to map the keys.
MapOutputFunction<K,V> - Class in org.apache.crunch.impl.spark.fn
MapOutputFunction(SerDe, SerDe) - Constructor for class org.apache.crunch.impl.spark.fn.MapOutputFunction
Mapred - Class in org.apache.crunch.lib: Static functions for working with legacy Mappers and Reducers that live under the org.apache.hadoop.mapred.* package as part of Crunch pipelines.
Mapred() - Constructor for class org.apache.crunch.lib.Mapred
Mapreduce - Class in org.apache.crunch.lib: Static functions for working with legacy Mappers and Reducers that live under the org.apache.hadoop.mapreduce.* package as part of Crunch pipelines.
Mapreduce() - Constructor for class org.apache.crunch.lib.Mapreduce
MapReduceTarget - Interface in org.apache.crunch.io
maps(PType<T>) - Static method in class org.apache.crunch.types.avro.Avros
maps(PType<T>) - Method in class org.apache.crunch.types.avro.AvroTypeFamily
maps(PType<T>) - Method in interface org.apache.crunch.types.PTypeFamily
maps(PType<T>) - Static method in class org.apache.crunch.types.writable.Writables
maps(PType<T>) - Method in class org.apache.crunch.types.writable.WritableTypeFamily
MapsideJoinStrategy<K,U,V> - Class in org.apache.crunch.lib.join: Utility for doing map side joins on a common key between two PTables.
MapsideJoinStrategy() - Constructor for class org.apache.crunch.lib.join.MapsideJoinStrategy: Deprecated. Use the MapsideJoinStrategy.create() factory method instead.
MapsideJoinStrategy(boolean) - Constructor for class org.apache.crunch.lib.join.MapsideJoinStrategy: Deprecated. Use the MapsideJoinStrategy.create(boolean) factory method instead.
mapValues(MapFn<Iterable<V>, U>, PType<U>) - Method in class org.apache.crunch.impl.dist.collect.BaseGroupedTable
mapValues(String, MapFn<Iterable<V>, U>, PType<U>) - Method in class org.apache.crunch.impl.dist.collect.BaseGroupedTable
mapValues(MapFn<V, U>, PType<U>) - Method in class org.apache.crunch.impl.dist.collect.PTableBase
mapValues(String, MapFn<V, U>, PType<U>) - Method in class org.apache.crunch.impl.dist.collect.PTableBase
mapValues(SFunction<Stream<V>, T>, PType<T>) - Method in interface org.apache.crunch.lambda.LGroupedTable: Map the values in this LGroupedTable using a custom function.
mapValues(SFunction<V, T>, PType<T>) - Method in interface org.apache.crunch.lambda.LTable: Transform the values of this table using the given function
mapValues(PTable<K, U>, MapFn<U, V>, PType<V>) - Static method in class org.apache.crunch.lib.PTables: Maps a PTable<K, U> to a PTable<K, V> using the given MapFn<U, V> on the values of the PTable.
mapValues(String, PTable<K, U>, MapFn<U, V>, PType<V>) - Static method in class org.apache.crunch.lib.PTables: Maps a PTable<K, U> to a PTable<K, V> using the given MapFn<U, V> on the values of the PTable.
mapValues(PGroupedTable<K, U>, MapFn<Iterable<U>, V>, PType<V>) - Static method in class org.apache.crunch.lib.PTables: An analogue of the mapValues function for PGroupedTable<K, U> collections.
mapValues(String, PGroupedTable<K, U>, MapFn<Iterable<U>, V>, PType<V>) - Static method in class org.apache.crunch.lib.PTables: An analogue of the mapValues function for PGroupedTable<K, U> collections.
mapValues(MapFn<Iterable<V>, U>, PType<U>) - Method in interface org.apache.crunch.PGroupedTable: Maps the Iterable<V> elements of each record to a new type.
mapValues(String, MapFn<Iterable<V>, U>, PType<U>) - Method in interface org.apache.crunch.PGroupedTable: Maps the Iterable<V> elements of each record to a new type.
mapValues(MapFn<V, U>, PType<U>) - Method in interface org.apache.crunch.PTable: Returns a PTable that has the same keys as this instance, but uses the given function to map the values.
mapValues(String, MapFn<V, U>, PType<U>) - Method in interface org.apache.crunch.PTable: Returns a PTable that has the same keys as this instance, but uses the given function to map the values.
markLogged() - Method in exception org.apache.crunch.CrunchRuntimeException: Indicate that this exception has been written to the debug logs.
materialize() - Method in class org.apache.crunch.impl.dist.collect.PCollectionImpl
materialize(PCollection<T>) - Method in class org.apache.crunch.impl.mem.MemPipeline
materialize(PCollection<T>) - Method in class org.apache.crunch.impl.mr.MRPipeline
materialize(PCollection<T>) - Method in class org.apache.crunch.impl.spark.SparkPipeline
materialize() - Method in interface org.apache.crunch.lambda.LCollection: Obtain the contents of this LCollection as a Stream that can be processed locally.
materialize() - Method in interface org.apache.crunch.PCollection: Returns a reference to the data set represented by this PCollection that may be used by the client to read the data locally.
materialize(PCollection<T>) - Method in interface org.apache.crunch.Pipeline: Create the given PCollection and read the data it contains into the returned Collection instance for client use.
materialize(PCollection<T>) - Method in class org.apache.crunch.util.CrunchTool
materializeAt(SourceTarget<S>) - Method in class org.apache.crunch.impl.dist.collect.PCollectionImpl
materializeToMap() - Method in class org.apache.crunch.impl.dist.collect.PTableBase: Returns a Map made up of the keys and values in this PTable.
materializeToMap() - Method in interface org.apache.crunch.PTable: Returns a Map made up of the keys and values in this PTable.
max() - Method in class org.apache.crunch.impl.dist.collect.PCollectionImpl
max(PCollection<S>) - Static method in class org.apache.crunch.lib.Aggregate: Returns the largest numerical element from the input collection.
max() - Method in interface org.apache.crunch.PCollection: Returns a PObject of the maximum element of this instance.
MAX_BIGDECIMALS() - Static method in class org.apache.crunch.fn.Aggregators: Return the maximum of all given BigDecimal values.
MAX_BIGDECIMALS(int) - Static method in class org.apache.crunch.fn.Aggregators: Return the n largest BigDecimal values (or fewer if there are fewer values than n).
MAX_BIGINTS() - Static method in class org.apache.crunch.fn.Aggregators: Return the maximum of all given BigInteger values.
MAX_BIGINTS(int) - Static method in class org.apache.crunch.fn.Aggregators: Return the n largest BigInteger values (or fewer if there are fewer values than n).
MAX_COMPARABLES() - Static method in class org.apache.crunch.fn.Aggregators: Return the maximum of all given Comparable values.
MAX_DOUBLES() - Static method in class org.apache.crunch.fn.Aggregators: Return the maximum of all given double values.
MAX_DOUBLES(int) - Static method in class org.apache.crunch.fn.Aggregators: Return the n largest double values (or fewer if there are fewer values than n).
MAX_FLOATS() - Static method in class org.apache.crunch.fn.Aggregators: Return the maximum of all given float values.
MAX_FLOATS(int) - Static method in class org.apache.crunch.fn.Aggregators: Return the n largest float values (or fewer if there are fewer values than n).
MAX_INTS() - Static method in class org.apache.crunch.fn.Aggregators: Return the maximum of all given int values.
MAX_INTS(int) - Static method in class org.apache.crunch.fn.Aggregators: Return the n largest int values (or fewer if there are fewer values than n).
MAX_LONGS() - Static method in class org.apache.crunch.fn.Aggregators: Return the maximum of all given long values.
MAX_LONGS(int) - Static method in class org.apache.crunch.fn.Aggregators: Return the n largest long values (or fewer if there are fewer values than n).
MAX_N(int, Class<V>) - Static method in class org.apache.crunch.fn.Aggregators: Return the n largest values (or fewer if there are fewer values than n).
MAX_REDUCERS - Static variable in class org.apache.crunch.util.PartitionUtils: Set an upper limit on the number of reducers the Crunch planner will set for an MR job when it tries to determine how many reducers to use based on the input size.
MAX_UNIQUE_N(int, Class<V>) - Static method in class org.apache.crunch.fn.Aggregators: Return the n largest unique values (or fewer if there are fewer values than n).
meanValue(PTable<K, V>) - Static method in class org.apache.crunch.lib.Average: Calculate the mean value by key for a table with numeric values.
MemPipeline - Class in org.apache.crunch.impl.mem
min() - Method in class org.apache.crunch.impl.dist.collect.PCollectionImpl
min(PCollection<S>) - Static method in class org.apache.crunch.lib.Aggregate: Returns the smallest numerical element from the input collection.
min() - Method in interface org.apache.crunch.PCollection: Returns a PObject of the minimum element of this instance.
MIN_BIGDECIMALS() - Static method in class org.apache.crunch.fn.Aggregators: Return the minimum of all given BigDecimal values.
MIN_BIGDECIMALS(int) - Static method in class org.apache.crunch.fn.Aggregators: Return the n smallest BigDecimal values (or fewer if there are fewer values than n).
MIN_BIGINTS() - Static method in class org.apache.crunch.fn.Aggregators: Return the minimum of all given BigInteger values.
MIN_BIGINTS(int) - Static method in class org.apache.crunch.fn.Aggregators: Return the n smallest BigInteger values (or fewer if there are fewer values than n).
MIN_COMPARABLES() - Static method in class org.apache.crunch.fn.Aggregators: Return the minimum of all given Comparable values.
MIN_DOUBLES() - Static method in class org.apache.crunch.fn.Aggregators: Return the minimum of all given double values.
MIN_DOUBLES(int) - Static method in class org.apache.crunch.fn.Aggregators: Return the n smallest double values (or fewer if there are fewer values than n).
MIN_FLOATS() - Static method in class org.apache.crunch.fn.Aggregators: Return the minimum of all given float values.
MIN_FLOATS(int) - Static method in class org.apache.crunch.fn.Aggregators: Return the n smallest float values (or fewer if there are fewer values than n).
MIN_INTS() - Static method in class org.apache.crunch.fn.Aggregators: Return the minimum of all given int values.
MIN_INTS(int) - Static method in class org.apache.crunch.fn.Aggregators: Return the n smallest int values (or fewer if there are fewer values than n).
MIN_LONGS() - Static method in class org.apache.crunch.fn.Aggregators: Return the minimum of all given long values.
MIN_LONGS(int) - Static method in class org.apache.crunch.fn.Aggregators: Return the n smallest long values (or fewer if there are fewer values than n).
MIN_N(int, Class<V>) - Static method in class org.apache.crunch.fn.Aggregators: Return the n smallest values (or fewer if there are fewer values than n).
MIN_UNIQUE_N(int, Class<V>) - Static method in class org.apache.crunch.fn.Aggregators: Return the n smallest unique values (or fewer if there are fewer unique values than n).
MRCollection - Interface in org.apache.crunch.impl.dist.collect
MRJob - Interface in org.apache.crunch.impl.mr: A Hadoop MapReduce job managed by Crunch.
MRJob.State - Enum in org.apache.crunch.impl.mr: A job will be in one of the following states.
MRPipeline - Class in org.apache.crunch.impl.mr: Pipeline implementation that is executed within Hadoop MapReduce.
MRPipeline(Class<?>) - Constructor for class org.apache.crunch.impl.mr.MRPipeline: Instantiate with a default Configuration and name.
MRPipeline(Class<?>, String) - Constructor for class org.apache.crunch.impl.mr.MRPipeline: Instantiate with a custom pipeline name.
MRPipeline(Class<?>, Configuration) - Constructor for class org.apache.crunch.impl.mr.MRPipeline: Instantiate with a custom configuration and default naming.
MRPipeline(Class<?>, String, Configuration) - Constructor for class org.apache.crunch.impl.mr.MRPipeline: Instantiate with a custom name and configuration.
MRPipelineExecution - Interface in org.apache.crunch.impl.mr
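
The MAX_N and MAX_UNIQUE_N entries above both promise "the n largest values (or fewer if there are fewer values than n)". The following plain-Java sketch shows one way to read that contract and how the unique variant differs. It assumes n is non-negative, works on an in-memory List rather than a Crunch Aggregator, and makes no claim about the order in which Crunch itself emits results; TopNSketch, maxN and maxUniqueN are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.TreeSet;

// Hypothetical illustration only; none of these names belong to the Crunch API.
class TopNSketch {

    // The n largest values, duplicates included, largest first.
    // Assumes n >= 0; returns fewer than n values when the input is shorter.
    static <V extends Comparable<V>> List<V> maxN(List<V> values, int n) {
        List<V> sorted = new ArrayList<>(values);
        sorted.sort(Collections.reverseOrder());
        return new ArrayList<>(sorted.subList(0, Math.min(n, sorted.size())));
    }

    // The n largest distinct values, largest first.
    // Assumes n >= 0; returns fewer than n values when there are fewer distinct ones.
    static <V extends Comparable<V>> List<V> maxUniqueN(List<V> values, int n) {
        List<V> unique = new ArrayList<>(new TreeSet<>(values).descendingSet());
        return new ArrayList<>(unique.subList(0, Math.min(n, unique.size())));
    }

    public static void main(String[] args) {
        List<Integer> values = List.of(5, 5, 4, 1);
        System.out.println(maxN(values, 2));       // prints [5, 5]
        System.out.println(maxUniqueN(values, 2)); // prints [5, 4]
        System.out.println(maxN(List.of(3), 2));   // prints [3]
    }
}
```

The MIN_N and MIN_UNIQUE_N entries read the same way with the ordering reversed.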

N

name - Variable in class org.apache.crunch.contrib.io.jdbc.IdentifiableName
name(String) - Static method in class org.apache.crunch.CreateOptions
name - Variable in class org.apache.crunch.test.Employee: Deprecated.
name - Variable in class org.apache.crunch.test.Person: Deprecated.
nameAndParallelism(String, int) - Static method in class org.apache.crunch.CreateOptions
named(String) - Method in class org.apache.crunch.PipelineCallable: Use the given name to identify this instance in the logs.
namedTuples(String, String[], PType[]) - Static method in class org.apache.crunch.types.avro.Avros
negateCounts(PTable<K, Long>) - Static method in class org.apache.crunch.lib.TopList: When creating toplists, it is often required to sort by count descending.
newBuilder() - Static method in class org.apache.crunch.kafka.offset.hdfs.Offsets.Builder: Creates a new Builder instance.
newBuilder() - Static method in class org.apache.crunch.kafka.offset.hdfs.Offsets.PartitionOffset.Builder: Creates a new builder instance.
newBuilder() - Static method in class org.apache.crunch.test.Employee: Creates a new Employee RecordBuilder
newBuilder(Employee.Builder) - Static method in class org.apache.crunch.test.Employee: Creates a new Employee RecordBuilder by copying an existing Builder
newBuilder(Employee) - Static method in class org.apache.crunch.test.Employee: Creates a new Employee RecordBuilder by copying an existing Employee instance
newBuilder() - Static method in class org.apache.crunch.test.Person: Creates a new Person RecordBuilder
newBuilder(Person.Builder) - Static method in class org.apache.crunch.test.Person: Creates a new Person RecordBuilder by copying an existing Builder
newBuilder(Person) - Static method in class org.apache.crunch.test.Person: Creates a new Person RecordBuilder by copying an existing Person instance
newReader(Schema) - Static method in class org.apache.crunch.types.avro.Avros
newReader(AvroType<T>) - Static method in class org.apache.crunch.types.avro.Avros
newWriter(Schema) - Static method in class org.apache.crunch.types.avro.Avros
newWriter(AvroType<T>) - Static method in class org.apache.crunch.types.avro.Avros
next() - Method in class org.apache.crunch.contrib.text.Tokenizer: Advance this Tokenizer and return the next String from the Scanner.
next() - Method in class org.apache.crunch.util.DoFnIterator
nextBoolean() - Method in class org.apache.crunch.contrib.text.Tokenizer: Advance this Tokenizer and return the next Boolean from the Scanner.
nextDouble() - Method in class org.apache.crunch.contrib.text.Tokenizer: Advance this Tokenizer and return the next Double from the Scanner.
nextFloat() - Method in class org.apache.crunch.contrib.text.Tokenizer: Advance this Tokenizer and return the next Float from the Scanner.
nextInt() - Method in class org.apache.crunch.contrib.text.Tokenizer: Advance this Tokenizer and return the next Integer from the Scanner.
nextKeyValue() - Method in class org.apache.crunch.kafka.inputformat.KafkaRecordReader
nextLong() - Method in class org.apache.crunch.contrib.text.Tokenizer: Advance this Tokenizer and return the next Long from the Scanner.
none() - Static method in class org.apache.crunch.CreateOptions
NoOpDeepCopier<T> - Class in org.apache.crunch.types: A DeepCopier that does nothing, and just returns the input value without copying anything.
not(FilterFn<S>) - Static method in class org.apache.crunch.fn.FilterFns: Accept an entry if the given filter does not accept it.
nulls() - Static method in class org.apache.crunch.types.avro.Avros
nulls() - Method in class org.apache.crunch.types.avro.AvroTypeFamily
nulls() - Method in interface org.apache.crunch.types.PTypeFamily
nulls() - Static method in class org.apache.crunch.types.writable.Writables
nulls() - Method in class org.apache.crunch.types.writable.WritableTypeFamily
numPartitions() - Method in class org.apache.crunch.impl.spark.SparkPartitioner
numReducers(int) - Method in class org.apache.crunch.GroupingOptions.Builder

O

of(T, U) - Static method in class org.apache.crunch.Pair
of(A, B, C) - Static method in class org.apache.crunch.Tuple3
of(A, B, C, D) - Static method in class org.apache.crunch.Tuple4
of(Object...) - Static method in class org.apache.crunch.TupleN
OffsetReader - Interface in org.apache.crunch.kafka.offset: Reader API that supports reading offset information from an underlying storage mechanism.
Offsets - Class in org.apache.crunch.kafka.offset.hdfs: Simple object to represent a collection of Kafka Topic and Partition offset information to make storing this information easier.
Offsets.Builder - Class in org.apache.crunch.kafka.offset.hdfs: Builder for the Offsets.
Offsets.PartitionOffset - Class in org.apache.crunch.kafka.offset.hdfs: Simple object that represents a specific topic, partition, and its offset value.
Offsets.PartitionOffset.Builder - Class in org.apache.crunch.kafka.offset.hdfs: Builder for Offsets.PartitionOffset
OffsetWriter - Interface in org.apache.crunch.kafka.offset: Writer for persisting offset information.
OneToManyJoin - Class in org.apache.crunch.lib.join: Optimized join for situations where exactly one value is being joined with any other number of values based on a common key.
OneToManyJoin() - Constructor for class org.apache.crunch.lib.join.OneToManyJoin
oneToManyJoin(PTable<K, U>, PTable<K, V>, DoFn<Pair<U, Iterable<V>>, T>, PType<T>) - Static method in class org.apache.crunch.lib.join.OneToManyJoin: Performs a join on two tables, where the left table only contains a single value per key.
oneToManyJoin(PTable<K, U>, PTable<K, V>, DoFn<Pair<U, Iterable<V>>, T>, PType<T>, int) - Static method in class org.apache.crunch.lib.join.OneToManyJoin: Supports a user-specified number of reducers for the one-to-many join.
or(FilterFn<S>, FilterFn<S>) - Static method in class org.apache.crunch.fn.FilterFns: Accept an entry if at least one of the given filters accepts it, using short-circuit evaluation.
or(FilterFn<S>...) - Static method in class org.apache.crunch.fn.FilterFns: Accept an entry if at least one of the given filters accepts it, using short-circuit evaluation.
Orcs - Class in org.apache.crunch.types.orc: Utilities to create PTypes for ORC serialization / deserialization
Orcs() - Constructor for class org.apache.crunch.types.orc.Orcs
orcs(TypeInfo) - Static method in class org.apache.crunch.types.orc.Orcs: Create a PType to directly use OrcStruct as the deserialized format.
OrcUtils - Class in org.apache.crunch.types.orc
OrcUtils() - Constructor for class org.apache.crunch.types.orc.OrcUtils
order() - Method in class org.apache.crunch.lib.Sort.ColumnOrder
org.apache.crunch - package org.apache.crunch: Client-facing API and core abstractions.
org.apache.crunch.contrib - package org.apache.crunch.contrib: User contributions that may be interesting for special applications.
org.apache.crunch.contrib.bloomfilter - package org.apache.crunch.contrib.bloomfilter: Support for creating Bloom Filters.
org.apache.crunch.contrib.io.jdbc - package org.apache.crunch.contrib.io.jdbc: Support for reading data from RDBMS using JDBC
org.apache.crunch.contrib.text - package org.apache.crunch.contrib.text
org.apache.crunch.examples - package org.apache.crunch.examples: Example applications demonstrating various aspects of Crunch.
org.apache.crunch.fn - package org.apache.crunch.fn: Commonly used functions for manipulating collections.
org.apache.crunch.impl - package org.apache.crunch.impl
org.apache.crunch.impl.dist - package org.apache.crunch.impl.dist
org.apache.crunch.impl.dist.collect - package org.apache.crunch.impl.dist.collect
org.apache.crunch.impl.mem - package org.apache.crunch.impl.mem: In-memory Pipeline implementation for rapid prototyping and testing.
org.apache.crunch.impl.mr - package org.apache.crunch.impl.mr: A Pipeline implementation that runs on Hadoop MapReduce.
org.apache.crunch.impl.spark - package org.apache.crunch.impl.spark
org.apache.crunch.impl.spark.collect - package org.apache.crunch.impl.spark.collect
org.apache.crunch.impl.spark.fn - package org.apache.crunch.impl.spark.fn
org.apache.crunch.impl.spark.serde - package org.apache.crunch.impl.spark.serde
org.apache.crunch.io - package org.apache.crunch.io: Data input and output for Pipelines.
org.apache.crunch.kafka - package org.apache.crunch.kafka
org.apache.crunch.kafka.inputformat - package org.apache.crunch.kafka.inputformat
org.apache.crunch.kafka.offset - package org.apache.crunch.kafka.offset
org.apache.crunch.kafka.offset.hdfs - package org.apache.crunch.kafka.offset.hdfs
org.apache.crunch.lambda - package org.apache.crunch.lambda: Alternative Crunch API using Java 8 features to allow construction of pipelines using lambda functions and method references.
org.apache.crunch.lambda.fn - package org.apache.crunch.lambda.fn: Serializable versions of the functional interfaces that ship with Java 8
org.apache.crunch.lib - package org.apache.crunch.lib: Joining, sorting, aggregating, and other commonly used functionality.
org.apache.crunch.lib.join - package org.apache.crunch.lib.join: Inner and outer joins on collections.
org.apache.crunch.lib.sort - package org.apache.crunch.lib.sort
org.apache.crunch.test - package org.apache.crunch.test: Utilities for testing Crunch-based applications.
org.apache.crunch.types - package org.apache.crunch.types: Common functionality for business object serialization.
org.apache.crunch.types.avro - package org.apache.crunch.types.avro: Business object serialization using Apache Avro.
org.apache.crunch.types.orc - package org.apache.crunch.types.orc
org.apache.crunch.types.writable - package org.apache.crunch.types.writable: Business object serialization using Hadoop's Writables framework.
org.apache.crunch.util - package org.apache.crunch.util: An assorted set of utilities.
org.apache.hadoop.mapred - package org.apache.hadoop.mapred
outputConf(String, String) - Method in interface org.apache.crunch.Target: Adds the given key-value pair to the Configuration instance that is used to write this Target.
OutputConfig(FormatBundle<OutputFormat<K, V>>, Class<K>, Class<V>) - Constructor for class org.apache.crunch.io.CrunchOutputs.OutputConfig
OutputConverterFunction<K,V,S> - Class in org.apache.crunch.impl.spark.fn
OutputConverterFunction(Converter<K, V, S, ?>) - Constructor for class org.apache.crunch.impl.spark.fn.OutputConverterFunction
OutputHandler - Interface in org.apache.crunch.io
outputKey(S) - Method in interface org.apache.crunch.types.Converter
outputValue(S) - Method in interface org.apache.crunch.types.Converter
override(ReaderWriterFactory) - Method in class org.apache.crunch.types.avro.AvroMode: Deprecated. Use AvroMode.withFactory(ReaderWriterFactory) instead.
overridePathProperties(Configuration) - Method in class org.apache.crunch.test.TemporaryPath: Set all keys specified in the constructor to temporary directories.
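
The two FilterFns.or entries above describe accepting an entry when at least one filter accepts it, with short-circuit evaluation. The sketch below shows what short-circuiting means here, using java.util.function.Predicate as a stand-in for FilterFn. ShortCircuitOrSketch and its or method are hypothetical and are not the Crunch implementation.

```java
import java.util.function.Predicate;

// Illustration only: Predicate stands in for org.apache.crunch.FilterFn.
final class ShortCircuitOrSketch {

    /**
     * Returns a predicate that accepts a value as soon as any filter accepts it.
     * Filters after the first accepting one are not evaluated (short-circuit).
     */
    @SafeVarargs
    static <S> Predicate<S> or(Predicate<S>... filters) {
        return value -> {
            for (Predicate<S> filter : filters) {
                if (filter.test(value)) {
                    return true; // stop here; later filters are skipped
                }
            }
            return false; // no filter accepted the value
        };
    }
}
```

Because evaluation stops at the first accepting filter, placing the cheapest or most frequently accepting filter first avoids unnecessary work.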

P

Pair<K,V> - Class in org.apache.crunch: A convenience class for two-element Tuples.
Pair(K, V) - Constructor for class org.apache.crunch.Pair
PAIR - Static variable in class org.apache.crunch.types.TupleFactory
pair2tupleFunc() - Static method in class org.apache.crunch.impl.spark.GuavaUtils
pairAggregator(Aggregator<V1>, Aggregator<V2>) - Static method in class org.apache.crunch.fn.Aggregators: Apply separate aggregators to each component of a Pair.
PairFlatMapDoFn<T,K,V> - Class in org.apache.crunch.impl.spark.fn
PairFlatMapDoFn(DoFn<T, Pair<K, V>>, SparkRuntimeContext) - Constructor for class org.apache.crunch.impl.spark.fn.PairFlatMapDoFn
PairIterable(Iterable<S>, Iterable<T>) - Constructor for class org.apache.crunch.util.Tuples.PairIterable
PairIterableMapFn(MapFn<Object, K>, MapFn<Object, V>) - Constructor for class org.apache.crunch.types.PGroupedTableType.PairIterableMapFn
PairMapFn<K,V,S,T> - Class in org.apache.crunch.fn
PairMapFn(MapFn<K, S>, MapFn<V, T>) - Constructor for class org.apache.crunch.fn.PairMapFn
PairMapFunction<K,V,S> - Class in org.apache.crunch.impl.spark.fn
PairMapFunction(MapFn<Pair<K, V>, S>, SparkRuntimeContext) - Constructor for class org.apache.crunch.impl.spark.fn.PairMapFunction
PairMapIterableFunction<K,V,S,T> - Class in org.apache.crunch.impl.spark.fn
PairMapIterableFunction(MapFn<Pair<K, List<V>>, Pair<S, Iterable<T>>>, SparkRuntimeContext) - Constructor for class org.apache.crunch.impl.spark.fn.PairMapIterableFunction
pairs(PType<V1>, PType<V2>) - Static method in class org.apache.crunch.types.avro.Avros
pairs(PType<V1>, PType<V2>) - Method in class org.apache.crunch.types.avro.AvroTypeFamily
pairs(PType<V1>, PType<V2>) - Method in interface org.apache.crunch.types.PTypeFamily
pairs(PType<V1>, PType<V2>) - Static method in class org.apache.crunch.types.writable.Writables
pairs(PType<V1>, PType<V2>) - Method in class org.apache.crunch.types.writable.WritableTypeFamily
PairValueComparator(boolean) - Constructor for class org.apache.crunch.lib.Aggregate.PairValueComparator
parallelDo(DoFn<S, T>, PType<T>) - Method in class org.apache.crunch.impl.dist.collect.PCollectionImpl
parallelDo(String, DoFn<S, T>, PType<T>) - Method in class org.apache.crunch.impl.dist.collect.PCollectionImpl
parallelDo(String, DoFn<S, T>, PType<T>, ParallelDoOptions) - Method in class org.apache.crunch.impl.dist.collect.PCollectionImpl
parallelDo(DoFn<S, Pair<K, V>>, PTableType<K, V>) - Method in class org.apache.crunch.impl.dist.collect.PCollectionImpl
parallelDo(String, DoFn<S, Pair<K, V>>, PTableType<K, V>) - Method in class org.apache.crunch.impl.dist.collect.PCollectionImpl
parallelDo(String, DoFn<S, Pair<K, V>>, PTableType<K, V>, ParallelDoOptions) - Method in class org.apache.crunch.impl.dist.collect.PCollectionImpl
parallelDo(DoFn<S, T>, PType<T>) - Method in interface org.apache.crunch.lambda.LCollection: Transform this LCollection using a standard Crunch DoFn
parallelDo(DoFn<S, Pair<K, V>>, PTableType<K, V>) - Method in interface org.apache.crunch.lambda.LCollection: Transform this LCollection to an LTable using a standard Crunch DoFn
parallelDo(LDoFn<S, T>, PType<T>) - Method in interface org.apache.crunch.lambda.LCollection: Transform this LCollection using a Lambda-friendly LDoFn.
parallelDo(LDoFn<S, Pair<K, V>>, PTableType<K, V>) - Method in interface org.apache.crunch.lambda.LCollection: Transform this LCollection using a Lambda-friendly LDoFn.
parallelDo(DoFn<S, T>, PType<T>) - Method in interface org.apache.crunch.PCollection: Applies the given doFn to the elements of this PCollection and returns a new PCollection that is the output of this processing.
parallelDo(String, DoFn<S, T>, PType<T>) - Method in interface org.apache.crunch.PCollection: Applies the given doFn to the elements of this PCollection and returns a new PCollection that is the output of this processing.
parallelDo(String, DoFn<S, T>, PType<T>, ParallelDoOptions) - Method in interface org.apache.crunch.PCollection: Applies the given doFn to the elements of this PCollection and returns a new PCollection that is the output of this processing.
parallelDo(DoFn<S, Pair<K, V>>, PTableType<K, V>) - Method in interface org.apache.crunch.PCollection: Similar to the other parallelDo instance, but returns a PTable instance instead of a PCollection.
parallelDo(String, DoFn<S, Pair<K, V>>, PTableType<K, V>) - Method in interface org.apache.crunch.PCollection: Similar to the other parallelDo instance, but returns a PTable instance instead of a PCollection.
parallelDo(String, DoFn<S, Pair<K, V>>, PTableType<K, V>, ParallelDoOptions) - Method in interface org.apache.crunch.PCollection: Similar to the other parallelDo instance, but returns a PTable instance instead of a PCollection.
ParallelDoOptions - Class in org.apache.crunch: Container class that includes optional information about a parallelDo operation applied to a PCollection.
ParallelDoOptions.Builder - Class in org.apache.crunch
parallelism(int) - Static method in class org.apache.crunch.CreateOptions
Parse - Class in org.apache.crunch.contrib.text: Methods for parsing instances of PCollection<String> into PCollections of strongly-typed tuples.
parse(String, PCollection<String>, Extractor<T>) - Static method in class org.apache.crunch.contrib.text.Parse: Parses the lines of the input PCollection<String> and returns a PCollection<T> using the given Extractor<T>.
parse(String, PCollection<String>, PTypeFamily, Extractor<T>) - Static method in class org.apache.crunch.contrib.text.Parse: Parses the lines of the input PCollection<String> and returns a PCollection<T> using the given Extractor<T> that uses the given PTypeFamily.
parseTable(String, PCollection<String>, Extractor<Pair<K, V>>) - Static method in class org.apache.crunch.contrib.text.Parse: Parses the lines of the input PCollection<String> and returns a PTable<K, V> using the given Extractor<Pair<K, V>>.
parseTable(String, PCollection<String>, PTypeFamily, Extractor<Pair<K, V>>) - Static method in class org.apache.crunch.contrib.text.Parse: Parses the lines of the input PCollection<String> and returns a PTable<K, V> using the given Extractor<Pair<K, V>> that uses the given PTypeFamily.
partition - Variable in class org.apache.crunch.impl.spark.IntByteArray
PartitionedMapOutputFunction<K,V> - Class in org.apache.crunch.impl.spark.fn
PartitionedMapOutputFunction(SerDe<K>, SerDe<V>, PGroupedTableType<K, V>, int, GroupingOptions, SparkRuntimeContext) - Constructor for class org.apache.crunch.impl.spark.fn.PartitionedMapOutputFunction
PARTITIONER_PATH - Static variable in class org.apache.crunch.lib.sort.TotalOrderPartitioner
partitionerClass(Class<? extends Partitioner>) - Method in class org.apache.crunch.GroupingOptions.Builder
PartitionUtils - Class in org.apache.crunch.util: Helper functions and settings for determining the number of reducers to use in a pipeline job created by the Crunch planner.
PartitionUtils() - Constructor for class org.apache.crunch.util.PartitionUtils
PathTarget - Interface in org.apache.crunch.io: A target whose output goes to a given path on a file system.
PCollection<S> - Interface in org.apache.crunch: A representation of an immutable, distributed collection of elements that is the fundamental target of computations in Crunch.
PCollectionFactory - Interface in org.apache.crunch.impl.dist.collect
PCollectionImpl<S> - Class in org.apache.crunch.impl.dist.collect
PCollectionImpl(String, DistributedPipeline) - Constructor for class org.apache.crunch.impl.dist.collect.PCollectionImpl
PCollectionImpl(String, DistributedPipeline, ParallelDoOptions) - Constructor for class org.apache.crunch.impl.dist.collect.PCollectionImpl
PCollectionImpl.Visitor - Interface in org.apache.crunch.impl.dist.collect
PERSIST_TIME_FORMAT - Static variable in class org.apache.crunch.kafka.offset.hdfs.HDFSOffsetWriter: Custom formatter for translating the times into valid file names.
persistenceTimeToFileName(long) - Static method in class org.apache.crunch.kafka.offset.hdfs.HDFSOffsetWriter: Converts a persistedTime into a file name for persisting the offsets.
Person - Class in org.apache.crunch.test
Person() - Constructor for class org.apache.crunch.test.Person: Default constructor.
Person(CharSequence, Integer, List<CharSequence>) - Constructor for class org.apache.crunch.test.Person: All-args constructor.
Person.Builder - Class in org.apache.crunch.test: RecordBuilder for Person instances.
PGroupedTable<K,V> - Interface in org.apache.crunch: The Crunch representation of a grouped PTable, which corresponds to the output of the shuffle phase of a MapReduce job.
PGroupedTableImpl<K,V> - Class in org.apache.crunch.impl.spark.collect
PGroupedTableType<K,V> - Class in org.apache.crunch.types: The PType instance for PGroupedTable instances.
PGroupedTableType(PTableType<K, V>) - Constructor for class org.apache.crunch.types.PGroupedTableType
PGroupedTableType.PairIterableMapFn<K,V> - Class in org.apache.crunch.types
Pipeline - Interface in org.apache.crunch: Manages the state of a pipeline execution.
PipelineCallable<Output> - Class in org.apache.crunch: A specialization of Callable that executes some sequential logic on the client machine as part of an overall Crunch pipeline in order to generate zero or more outputs, some of which may be PCollection instances that are processed by other jobs in the pipeline.
PipelineCallable() - Constructor for class org.apache.crunch.PipelineCallable
PipelineCallable.Status - Enum in org.apache.crunch
PipelineExecution - Interface in org.apache.crunch: A handle to allow clients to control a Crunch pipeline as it runs.
PipelineExecution.Status - Enum in org.apache.crunch
PipelineResult - Class in org.apache.crunch: Container for the results of a call to run or done on the Pipeline interface that includes details and statistics about the component stages of the data pipeline.
PipelineResult(List<PipelineResult.StageResult>, PipelineExecution.Status) - Constructor for class org.apache.crunch.PipelineResult
PipelineResult.StageResult - Class in org.apache.crunch
plan() - Method in class org.apache.crunch.impl.mr.MRPipeline
PObject<T> - Interface in org.apache.crunch: A PObject represents a singleton object value that results from a distributed computation.
process(S, Emitter<Pair<String, BloomFilter>>) - Method in class org.apache.crunch.contrib.bloomfilter.BloomFilterFn
process(S, Emitter<T>) - Method in class org.apache.crunch.DoFn: Processes the records from a PCollection.
process(T, Emitter<T>) - Method in class org.apache.crunch.FilterFn
process(T, Emitter<Double>) - Method in class org.apache.crunch.fn.SDoubleFlatMapFunction
process(T, Emitter<R>) - Method in class org.apache.crunch.fn.SFlatMapFunction
process(Pair<K, V>, Emitter<R>) - Method in class org.apache.crunch.fn.SFlatMapFunction2
process(T, Emitter<Pair<K, V>>) - Method in class org.apache.crunch.fn.SPairFlatMapFunction
process(LDoFnContext<S, T>) - Method in interface org.apache.crunch.lambda.LDoFn
process(Pair<Integer, Iterable<Pair<K, V>>>, Emitter<Pair<Integer, Pair<K, V>>>) - Method in class org.apache.crunch.lib.Aggregate.TopKCombineFn
process(Pair<K, V>, Emitter<Pair<Integer, Pair<K, V>>>) - Method in class org.apache.crunch.lib.Aggregate.TopKFn
process(Pair<Pair<K, Integer>, Iterable<Pair<U, V>>>, Emitter<Pair<K, Pair<U, V>>>) - Method in class org.apache.crunch.lib.join.JoinFn: Split up the input record to make coding a bit more manageable.
process(S, Emitter<T>) - Method in class org.apache.crunch.MapFn
Protos - Class in org.apache.crunch.types: Utility functions for working with protocol buffers in Crunch.
Protos() - Constructor for class org.apache.crunch.types.Protos
protos(Class<T>, PTypeFamily) - Static method in class org.apache.crunch.types.PTypes: Constructs a PType for the given protocol buffer.
protos(Class<T>, PTypeFamily, SerializableSupplier<ExtensionRegistry>) - Static method in class org.apache.crunch.types.PTypes: Constructs a PType for a protocol buffer, using the given SerializableSupplier to provide an ExtensionRegistry to use in reading the given protobuf.
PTable<K,V> - Interface in org.apache.crunch: A sub-interface of PCollection that represents an immutable, distributed multi-map of keys and values.
PTableBase<K,V> - Class in org.apache.crunch.impl.dist.collect
PTableBase(String, DistributedPipeline) - Constructor for class org.apache.crunch.impl.dist.collect.PTableBase
PTableBase(String, DistributedPipeline, ParallelDoOptions) - Constructor for class org.apache.crunch.impl.dist.collect.PTableBase
PTables - Class in org.apache.crunch.lib: Methods for performing common operations on PTables.
PTables() - Constructor for class org.apache.crunch.lib.PTables
PTableType<K,V> - Interface in org.apache.crunch.types: An extension of PType specifically for PTable objects.
ptf() - Method in interface org.apache.crunch.lambda.LCollection: Get the PTypeFamily representing how elements of this collection may be serialized.
ptype(PType<Pair<V1, V2>>) - Static method in class org.apache.crunch.fn.SwapFn
pType() - Method in interface org.apache.crunch.lambda.LCollection: Get the PType representing how elements of this collection may be serialized.
pType() - Method in interface org.apache.crunch.lambda.LTable: Get the underlying PTableType used to serialize key/value pairs in this table
pType(PType<V>) - Static method in class org.apache.crunch.lib.Quantiles.Result: Create a PType for the result type, to be stored as a derived type from Crunch primitives
PType<T> - Interface in org.apache.crunch.types: A PType defines a mapping between a data type that is used in a Crunch pipeline and a serialization and storage format that is used to read/write data from/to HDFS.
PTypeFamily - Interface in org.apache.crunch.types: An abstract factory for creating PType instances that have the same serialization/storage backing format.
PTypes - Class in org.apache.crunch.types: Utility functions for creating common types of derived PTypes, e.g., for JSON data, protocol buffers, and Thrift records.
PTypes() - Constructor for class org.apache.crunch.types.PTypes
PTypeUtils - Class in org.apache.crunch.types: Utilities for converting between PTypes from different PTypeFamily implementations.
put(int, Object) - Method in class org.apache.crunch.test.Employee
put(int, Object) - Method in class org.apache.crunch.test.Person

Q

quadAggregator(Aggregator<V1>, Aggregator<V2>, Aggregator<V3>, Aggregator<V4>) - Static method in class org.apache.crunch.fn.Aggregators: Apply separate aggregators to each component of a Tuple4.
QuadIterable(Iterable<A>, Iterable<B>, Iterable<C>, Iterable<D>) - Constructor for class org.apache.crunch.util.Tuples.QuadIterable
quads(PType<V1>, PType<V2>, PType<V3>, PType<V4>) - Static method in class org.apache.crunch.types.avro.Avros
quads(PType<V1>, PType<V2>, PType<V3>, PType<V4>) - Method in class org.apache.crunch.types.avro.AvroTypeFamily
quads(PType<V1>, PType<V2>, PType<V3>, PType<V4>) - Method in interface org.apache.crunch.types.PTypeFamily
quads(PType<V1>, PType<V2>, PType<V3>, PType<V4>) - Static method in class org.apache.crunch.types.writable.Writables
quads(PType<V1>, PType<V2>, PType<V3>, PType<V4>) - Method in class org.apache.crunch.types.writable.WritableTypeFamily
Quantiles - Class in org.apache.crunch.lib
Quantiles() - Constructor for class org.apache.crunch.lib.Quantiles
quantiles - Variable in class org.apache.crunch.lib.Quantiles.Result
Quantiles.Result<V> - Class in org.apache.crunch.lib: Output type for storing the results of a Quantiles computation

R

read(Source<S>) - Method in class org.apache.crunch.impl.dist.DistributedPipeline
read(Source<S>, String) - Method in class org.apache.crunch.impl.dist.DistributedPipeline
read(TableSource<K, V>) - Method in class org.apache.crunch.impl.dist.DistributedPipeline
read(TableSource<K, V>, String) - Method in class org.apache.crunch.impl.dist.DistributedPipeline
read(Source<T>) - Method in class org.apache.crunch.impl.mem.MemPipeline
read(Source<T>, String) - Method in class org.apache.crunch.impl.mem.MemPipeline
read(TableSource<K, V>) - Method in class org.apache.crunch.impl.mem.MemPipeline
read(TableSource<K, V>, String) - Method in class org.apache.crunch.impl.mem.MemPipeline
read(FileSystem, Path) - Method in interface org.apache.crunch.io.FileReaderFactory
read(Configuration) - Method in interface org.apache.crunch.io.ReadableSource: Returns an Iterable that contains the contents of this source.
read(Configuration) - Method in class org.apache.crunch.kafka.KafkaSource
read(Source<T>) - Method in interface org.apache.crunch.Pipeline: Converts the given Source into a PCollection that is available to jobs run using this Pipeline instance.
read(Source<T>, String) - Method in interface org.apache.crunch.Pipeline: Converts the given Source into a PCollection that is available to jobs run using this Pipeline instance.
read(TableSource<K, V>) - Method in interface org.apache.crunch.Pipeline: A version of the read method for TableSource instances that map to PTables.
read(TableSource<K, V>, String) - Method in interface org.apache.crunch.Pipeline: A version of the read method for TableSource instances that map to PTables.
read(TaskInputOutputContext<?, ?, ?, ?>) - Method in interface org.apache.crunch.ReadableData: Read the data referenced by this instance within the given context.
read(Source<T>) - Method in class org.apache.crunch.util.CrunchTool
read(TableSource<K, V>) - Method in class org.apache.crunch.util.CrunchTool
read(TaskInputOutputContext<?, ?, ?, ?>) - Method in class org.apache.crunch.util.DelegatingReadableData
read(Configuration, Path) - Static method in class org.apache.crunch.util.DistCache
read(TaskInputOutputContext<?, ?, ?, ?>) - Method in class org.apache.crunch.util.UnionReadableData
ReadableData<T> - Interface in org.apache.crunch: Represents the contents of a data source that can be read on the cluster from within one of the tasks running as part of a Crunch pipeline.
ReadableSource<T> - Interface in org.apache.crunch.io: An extension of the Source interface that indicates that a Source instance may be read as a series of records by the client code.
ReadableSourceTarget<T> - Interface in org.apache.crunch.io: An interface that indicates that a SourceTarget instance can be read into the local client.
ReaderWriterFactory - Interface in org.apache.crunch.types.avro: Interface for accessing DatumReader, DatumWriter, and Data classes.
readFields(DataInput) - Method in class org.apache.crunch.contrib.io.jdbc.IdentifiableName
readFields(ResultSet) - Method in class org.apache.crunch.contrib.io.jdbc.IdentifiableName
readFields(DataInput) - Method in class org.apache.crunch.io.FormatBundle
readFields(DataInput) - Method in class org.apache.crunch.kafka.inputformat.KafkaInputSplit
readFields(DataInput) - Method in class org.apache.crunch.types.writable.TupleWritable
readFields(DataInput) - Method in class org.apache.crunch.types.writable.UnionWritable
readLatestOffsets() - Method in class org.apache.crunch.kafka.offset.hdfs.HDFSOffsetReader
readLatestOffsets() - Method in interface org.apache.crunch.kafka.offset.OffsetReader: Reads the last stored offsets.
readOffsets(long) - Method in class org.apache.crunch.kafka.offset.AbstractOffsetReader
readOffsets(long) - Method in class org.apache.crunch.kafka.offset.hdfs.HDFSOffsetReader
readOffsets(long) - Method in interface org.apache.crunch.kafka.offset.OffsetReader: Reads the offsets for a given persistedOffsetTime.
readTextFile(String) - Method in class org.apache.crunch.impl.dist.DistributedPipeline
readTextFile(String) - Method in class org.apache.crunch.impl.mem.MemPipeline
readTextFile(String) - Method in interface org.apache.crunch.Pipeline: A convenience method for reading a text file.
readTextFile(String) - Method in class org.apache.crunch.util.CrunchTool
records(Class<T>) - Static method in class org.apache.crunch.types.avro.Avros
records(Class<T>) - Method in class org.apache.crunch.types.avro.AvroTypeFamily
records(Class<T>) - Method in interface org.apache.crunch.types.PTypeFamily
records(Class<T>) - Static method in class org.apache.crunch.types.writable.Writables
records(Class<T>) - Method in class org.apache.crunch.types.writable.WritableTypeFamily
reduce(PGroupedTable<K1, V1>, Class<? extends Reducer<K1, V1, K2, V2>>, Class<K2>, Class<V2>) - Static method in class org.apache.crunch.lib.Mapred
reduce(PGroupedTable<K1, V1>, Class<? extends Reducer<K1, V1, K2, V2>>, Class<K2>, Class<V2>) - Static method in class org.apache.crunch.lib.Mapreduce
ReduceGroupingFunction - Class in org.apache.crunch.impl.spark.fn
ReduceGroupingFunction(GroupingOptions, PGroupedTableType, SparkRuntimeContext) - Constructor for class org.apache.crunch.impl.spark.fn.ReduceGroupingFunction
ReduceInputFunction<K,V> - Class in org.apache.crunch.impl.spark.fn
ReduceInputFunction(SerDe<K>, SerDe<V>) - Constructor for class org.apache.crunch.impl.spark.fn.ReduceInputFunction
reduceValues(SBinaryOperator<V>) - Method in interface org.apache.crunch.lambda.LGroupedTable: Reduce the values for each key using an associative binary operator.
REFLECT - Static variable in class org.apache.crunch.types.avro.AvroMode: Default mode to use for reading and writing Reflect types.
REFLECT_DATA_FACTORY - Static variable in class org.apache.crunch.types.avro.Avros: Deprecated as of 0.9.0; use AvroMode.REFLECT.override(ReaderWriterFactory) instead.
REFLECT_DATA_FACTORY_CLASS - Static variable in class org.apache.crunch.types.avro.Avros: The name of the configuration parameter that tracks which reflection factory to use.
ReflectDataFactory - Class in org.apache.crunch.types.avro: A Factory class for constructing Avro reflection-related objects.
ReflectDataFactory() - Constructor for class org.apache.crunch.types.avro.ReflectDataFactory
reflects(Class<T>) - Static method in class org.apache.crunch.types.avro.Avros
reflects(Class<T>, Schema) - Static method in class org.apache.crunch.types.avro.Avros
reflects(Class<T>) - Static method in class org.apache.crunch.types.orc.Orcs: Create a PType which uses reflection to serialize/deserialize java POJOs to/from ORC.
register(Class<T>, AvroType<T>) - Static method in class org.apache.crunch.types.avro.Avros
register(Class<T>, WritableType<T, ? extends Writable>) - Static method in class org.apache.crunch.types.writable.Writables
registerComparable(Class<? extends WritableComparable>) - Static method in class org.apache.crunch.types.writable.Writables: Registers a WritableComparable class so that it can be used for comparing the fields inside of tuple types (e.g., pairs, trips, tupleN, etc.) for use in sorts and secondary sorts.
registerComparable(Class<? extends WritableComparable>, int) - Static method in class org.apache.crunch.types.writable.Writables: Registers a WritableComparable class with a given integer code to use for serializing and deserializing instances of this class that are defined inside of tuple types (e.g., pairs, trips, tupleN, etc.). Unregistered Writables are always serialized to bytes and cannot be used in comparisons (e.g., sorts and secondary sorts) according to their underlying types.
REJECT_ALL() - Static method in class org.apache.crunch.fn.FilterFns: Reject everything.
remove() - Method in class org.apache.crunch.util.DoFnIterator
replicas(int) - Method in class org.apache.crunch.CachingOptions.Builder
replicas() - Method in class org.apache.crunch.CachingOptions: Returns the number of replicas of the data that should be maintained in the cache.
requireSortedKeys() - Method in class org.apache.crunch.GroupingOptions.Builder
requireSortedKeys() - Method in class org.apache.crunch.GroupingOptions
reservoirSample(PCollection<T>, int) - Static method in class org.apache.crunch.lib.Sample: Select a fixed number of elements from the given PCollection with each element equally likely to be included in the sample.
reservoirSample(PCollection<T>, int, Long) - Static method in class org.apache.crunch.lib.Sample: A version of the reservoir sampling algorithm that uses a given seed, primarily for testing purposes.
reset() - Method in interface org.apache.crunch.Aggregator: Clears the internal state of this Aggregator and prepares it for the values associated with the next key.
reset() - Method in class org.apache.crunch.lambda.LAggregator
Result(long, Iterable<Pair<Double, V>>) - Constructor for class org.apache.crunch.lib.Quantiles.Result
results() - Method in interface org.apache.crunch.Aggregator: Returns the current aggregated state of this instance.
results() - Method in class org.apache.crunch.lambda.LAggregator
ReverseAvroComparator<T> - Class in org.apache.crunch.lib.sort
ReverseAvroComparator() - Constructor for class org.apache.crunch.lib.sort.ReverseAvroComparator
ReverseWritableComparator<T> - Class in org.apache.crunch.lib.sort
ReverseWritableComparator() - Constructor for class org.apache.crunch.lib.sort.ReverseWritableComparator
rightJoin(PTable<K, U>, PTable<K, V>) - Static method in class org.apache.crunch.lib.Join: Performs a right outer join on the specified PTables.
RightOuterJoinFn<K,U,V> - Class in org.apache.crunch.lib.join: Used to perform the last step of a right outer join.
RightOuterJoinFn(PType<K>, PType<U>) - Constructor for class org.apache.crunch.lib.join.RightOuterJoinFn
run(String[]) - Method in class org.apache.crunch.examples.AverageBytesByIP
run(String[]) - Method in class org.apache.crunch.examples.SecondarySortExample
run(String[]) - Method in class org.apache.crunch.examples.SortExample
run(String[]) - Method in class org.apache.crunch.examples.TotalBytesByIP
run(String[]) - Method in class org.apache.crunch.examples.TotalWordCount
run(String[]) - Method in class org.apache.crunch.examples.WordAggregationHBase
run(String[]) - Method in class org.apache.crunch.examples.WordCount
run() - Method in class org.apache.crunch.impl.mem.MemPipeline
run() - Method in class org.apache.crunch.impl.mr.MRPipeline
run() - Method in class org.apache.crunch.impl.spark.SparkPipeline
run() - Method in interface org.apache.crunch.Pipeline: Constructs and executes a series of MapReduce jobs in order to write data to the output targets.
run() - Method in class org.apache.crunch.util.CrunchTool
runAsync() - Method in class org.apache.crunch.impl.mem.MemPipeline
runAsync() - Method in class org.apache.crunch.impl.mr.MRPipeline
runAsync() - Method in class org.apache.crunch.impl.spark.SparkPipeline
runAsync() - Method in interface org.apache.crunch.Pipeline: Constructs and starts a series of MapReduce jobs in order to write data to the output targets, but returns a ListenableFuture to allow clients to control job execution.
runAsync() - Method in class org.apache.crunch.util.CrunchTool
runSingleThreaded() - Method in class org.apache.crunch.PipelineCallable: Override this method to indicate to the planner that this instance should not be run at the same time as any other PipelineCallable instances.

S

salary - Variable in class org.apache.crunch.test.Employee: Deprecated.
Sample - Class in org.apache.crunch.lib: Methods for performing random sampling in a distributed fashion, either by accepting each record in a PCollection with an independent probability in order to sample some fraction of the overall data set, or by using reservoir sampling in order to pull a uniform or weighted sample of fixed size from a PCollection of an unknown size.
Sample() - Constructor for class org.apache.crunch.lib.Sample
sample(PCollection<S>, double) - Static method in class org.apache.crunch.lib.Sample: Output records from the given PCollection with the given probability.
sample(PCollection<S>, Long, double) - Static method in class org.apache.crunch.lib.Sample: Output records from the given PCollection using a given seed.
sample(PTable<K, V>, double) - Static method in class org.apache.crunch.lib.Sample: A PTable<K, V> analogue of the sample function.
sample(PTable<K, V>, Long, double) - Static method in class org.apache.crunch.lib.Sample: A PTable<K, V> analogue of the sample function, with the seed argument exposed for testing purposes.
SAMPLE_UNIQUE_ELEMENTS(int) - Static method in class org.apache.crunch.fn.Aggregators: Collect a sample of unique elements from the input, where 'unique' is defined by the equals method for the input objects.
SBiConsumer<K,V> - Interface in org.apache.crunch.lambda.fn: Serializable version of the Java BiConsumer functional interface.
SBiFunction<K,V,T> - Interface in org.apache.crunch.lambda.fn: Serializable version of the Java BiFunction functional interface.
SBinaryOperator<T> - Interface in org.apache.crunch.lambda.fn: Serializable version of the Java BinaryOperator functional interface.
scaleFactor() - Method in class org.apache.crunch.DoFn: Returns an estimate of how applying this function to a PCollection will cause it to change in size.
scaleFactor() - Method in class org.apache.crunch.FilterFn
scaleFactor() - Method in class org.apache.crunch.fn.CompositeMapFn
scaleFactor() - Method in class org.apache.crunch.fn.ExtractKeyFn
scaleFactor() - Method in class org.apache.crunch.fn.PairMapFn
scaleFactor() - Method in class org.apache.crunch.MapFn
SCHEMA$ - Static variable in class org.apache.crunch.test.Employee
SCHEMA$ - Static variable in class org.apache.crunch.test.Person
SConsumer<T> - Interface in org.apache.crunch.lambda.fn: Serializable version of the Java Consumer functional interface.
SDoubleFlatMapFunction<T> - Class in org.apache.crunch.fn: A Crunch-compatible abstract base class for Spark's DoubleFlatMapFunction.
SDoubleFlatMapFunction() - Constructor for class org.apache.crunch.fn.SDoubleFlatMapFunction
SDoubleFunction<T> - Class in org.apache.crunch.fn: A Crunch-compatible abstract base class for Spark's DoubleFunction.
SDoubleFunction() - Constructor for class org.apache.crunch.fn.SDoubleFunction
second() - Method in class org.apache.crunch.Pair
second() - Method in class org.apache.crunch.Tuple3
second() - Method in class org.apache.crunch.Tuple4
SecondarySort - Class in org.apache.crunch.lib: Utilities for performing a secondary sort on a PTable<K, Pair<V1, V2>> collection.
SecondarySort() - Constructor for class org.apache.crunch.lib.SecondarySort
SecondarySortExample - Class in org.apache.crunch.examples
SecondarySortExample() - Constructor for class org.apache.crunch.examples.SecondarySortExample
sequenceFile(String, Class<T>) - Static method in class org.apache.crunch.io.At: Creates a SourceTarget<T> instance from the SequenceFile(s) at the given path name from the value field of each key-value pair in the SequenceFile(s).
sequenceFile(Path, Class<T>) - Static method in class org.apache.crunch.io.At: Creates a SourceTarget<T> instance from the SequenceFile(s) at the given Path from the value field of each key-value pair in the SequenceFile(s).
sequenceFile(String, PType<T>) - Static method in class org.apache.crunch.io.At: Creates a SourceTarget<T> instance from the SequenceFile(s) at the given path name from the value field of each key-value pair in the SequenceFile(s).
sequenceFile(Path, PType<T>) - Static method in class org.apache.crunch.io.At: Creates a SourceTarget<T> instance from the SequenceFile(s) at the given Path from the value field of each key-value pair in the SequenceFile(s).
sequenceFile(String, Class<K>, Class<V>) - Static method in class org.apache.crunch.io.At: Creates a TableSourceTarget<K, V> instance from the SequenceFile(s) at the given path name from the key-value pairs in the SequenceFile(s).
sequenceFile(Path, Class<K>, Class<V>) - Static method in class org.apache.crunch.io.At: Creates a TableSourceTarget<K, V> instance from the SequenceFile(s) at the given Path from the key-value pairs in the SequenceFile(s).
sequenceFile(String, PType<K>, PType<V>) - Static method in class org.apache.crunch.io.At: Creates a TableSourceTarget<K, V> instance from the SequenceFile(s) at the given path name from the key-value pairs in the SequenceFile(s).
sequenceFile(Path, PType<K>, PType<V>) - Static method in class org.apache.crunch.io.At: Creates a TableSourceTarget<K, V> instance from the SequenceFile(s) at the given Path from the key-value pairs in the SequenceFile(s).
sequenceFile(String, Class<T>) - Static method in class org.apache.crunch.io.From: Creates a Source<T> instance from the SequenceFile(s) at the given path name from the value field of each key-value pair in the SequenceFile(s).
sequenceFile(Path, Class<T>) - Static method in class org.apache.crunch.io.From: Creates a Source<T> instance from the SequenceFile(s) at the given Path from the value field of each key-value pair in the SequenceFile(s).
sequenceFile(List<Path>, Class<T>) - Static method in class org.apache.crunch.io.From: Creates a Source<T> instance from the SequenceFile(s) at the given Paths from the value field of each key-value pair in the SequenceFile(s).
sequenceFile(String, PType<T>) - Static method in class org.apache.crunch.io.From: Creates a Source<T> instance from the SequenceFile(s) at the given path name from the value field of each key-value pair in the SequenceFile(s).
sequenceFile(Path, PType<T>) - Static method in class org.apache.crunch.io.From: Creates a Source<T> instance from the SequenceFile(s) at the given Path from the value field of each key-value pair in the SequenceFile(s).
sequenceFile(List<Path>, PType<T>) - Static method in class org.apache.crunch.io.From: Creates a Source<T> instance from the SequenceFile(s) at the given Paths from the value field of each key-value pair in the SequenceFile(s).
sequenceFile(String, Class<K>, Class<V>) - Static method in class org.apache.crunch.io.From: Creates a TableSource<K, V> instance for the SequenceFile(s) at the given path name.
sequenceFile(Path, Class<K>, Class<V>) - Static method in class org.apache.crunch.io.From: Creates a TableSource<K, V> instance for the SequenceFile(s) at the given Path.
sequenceFile(List<Path>, Class<K>, Class<V>) - Static method in class org.apache.crunch.io.From: Creates a TableSource<K, V> instance for the SequenceFile(s) at the given Paths.
sequenceFile(String, PType<K>, PType<V>) - Static method in class org.apache.crunch.io.From: Creates a TableSource<K, V> instance for the SequenceFile(s) at the given path name.
sequenceFile(Path, PType<K>, PType<V>) - Static method in class org.apache.crunch.io.From: Creates a TableSource<K, V> instance for the SequenceFile(s) at the given Path.
sequenceFile(List<Path>, PType<K>, PType<V>) - Static method in class org.apache.crunch.io.From: Creates a TableSource<K, V> instance for the SequenceFile(s) at the given Paths.
sequenceFile(String) - Static method in class org.apache.crunch.io.To: Creates a Target at the given path name that writes data to SequenceFiles.
sequenceFile(Path) - Static method in class org.apache.crunch.io.To: Creates a Target at the given Path that writes data to SequenceFiles.
sequentialDo(String, PipelineCallable<Output>) - Method in class org.apache.crunch.impl.dist.collect.PCollectionImpl
sequentialDo(PipelineCallable<Output>) - Method in class org.apache.crunch.impl.dist.DistributedPipeline
sequentialDo(PipelineCallable<Output>) - Method in class org.apache.crunch.impl.mem.MemPipeline
sequentialDo(String, PipelineCallable<Output>) - Method in interface org.apache.crunch.PCollection: Adds the materialized data in this PCollection as a dependency to the given PipelineCallable and registers it with the Pipeline associated with this instance.
sequentialDo(PipelineCallable<Output>) - Method in interface org.apache.crunch.Pipeline: Executes the given PipelineCallable on the client after the Targets that the PipelineCallable depends on (if any) have been created by other pipeline processing steps.
SequentialFileNamingScheme - Class in org.apache.crunch.io: Default FileNamingScheme that uses an incrementing sequence number in order to generate unique file names.
SerDe<T> - Interface in org.apache.crunch.impl.spark.serde
SerDeFactory - Class in org.apache.crunch.impl.spark.serde
SerDeFactory() - Constructor for class org.apache.crunch.impl.spark.serde.SerDeFactory
SerializableSupplier<T> - Interface in org.apache.crunch.util: An extension of Guava's Supplier interface that indicates that an instance will also implement Serializable, which makes this object suitable for use with Crunch's DoFns when we need to construct an instance of a non-serializable type for use in processing.
serialize() - Method in class org.apache.crunch.io.FormatBundle
set(String, String) - Method in class org.apache.crunch.io.FormatBundle
Set - Class in org.apache.crunch.lib: Utilities for performing set operations (difference, intersection, etc) on PCollection instances.
Set() - Constructor for class org.apache.crunch.lib.Set
set(int, Writable) - Method in class org.apache.crunch.types.writable.TupleWritable
setAge(int) - Method in class org.apache.crunch.test.Person.Builder: Sets the value of the 'age' field.
setAge(Integer) - Method in class org.apache.crunch.test.Person: Sets the value of the 'age' field.
setAsOfTime(long) - Method in class org.apache.crunch.kafka.offset.hdfs.Offsets.Builder: Sets the as of time for the collection of offsets.
setBreakpoint() - Method in class org.apache.crunch.impl.dist.collect.BaseUnionCollection
setBreakpoint() - Method in class org.apache.crunch.impl.dist.collect.BaseUnionTable
setBreakpoint() - Method in class org.apache.crunch.impl.dist.collect.PCollectionImpl
setCombineFn(CombineFn) - Method in class org.apache.crunch.impl.spark.SparkRuntime
setConf(Broadcast<byte[]>) - Method in class org.apache.crunch.impl.spark.SparkRuntimeContext
setConf(Configuration) - Method in class org.apache.crunch.io.FormatBundle
setConf(Configuration) - Method in class org.apache.crunch.kafka.inputformat.KafkaInputFormat
setConf(Configuration) - Method in class org.apache.crunch.lib.join.JoinUtils.AvroPairGroupingComparator
setConf(Configuration) - Method in class org.apache.crunch.lib.sort.ReverseAvroComparator
setConf(Configuration) - Method in class org.apache.crunch.lib.sort.ReverseWritableComparator
setConf(Configuration) - Method in class org.apache.crunch.lib.sort.TotalOrderPartitioner
setConf(Configuration) - Method in class org.apache.crunch.lib.sort.TupleWritableComparator
setConf(Configuration) - Method in class org.apache.crunch.types.writable.TupleWritable.Comparator
setConf(Configuration) - Method in class org.apache.crunch.types.writable.TupleWritable
setConf(Configuration) - Method in class org.apache.crunch.util.CrunchTool
setConfiguration(Configuration) - Method in class org.apache.crunch.DoFn: Called during the setup of an initialized PType that relies on this instance.
setConfiguration(Configuration) - Method in class org.apache.crunch.fn.CompositeMapFn
setConfiguration(Configuration) - Method in class org.apache.crunch.fn.ExtractKeyFn
setConfiguration(Configuration) - Method in class org.apache.crunch.fn.PairMapFn
setConfiguration(Configuration) - Method in class org.apache.crunch.impl.dist.DistributedPipeline
setConfiguration(Configuration) - Method in class org.apache.crunch.impl.mem.MemPipeline
setConfiguration(Configuration) - Method in interface org.apache.crunch.Pipeline: Set the Configuration to use with this pipeline.
setContext(TaskInputOutputContext<?, ?, ?, ?>) - Method in class org.apache.crunch.DoFn: Called during setup to pass the TaskInputOutputContext to this DoFn instance.
setContext(TaskInputOutputContext<?, ?, ?, ?>) - Method in class org.apache.crunch.fn.CompositeMapFn
setContext(TaskInputOutputContext<?, ?, ?, ?>) - Method in class org.apache.crunch.fn.ExtractKeyFn
setContext(TaskInputOutputContext<?, ?, ?, ?>) - Method in class org.apache.crunch.fn.PairMapFn
setContext(TaskInputOutputContext<?, ?, ?, ?>) - Method in class org.apache.crunch.types.PGroupedTableType.PairIterableMapFn
setDepartment(CharSequence) - Method in class org.apache.crunch.test.Employee.Builder: Sets the value of the 'department' field.
setDepartment(CharSequence) - Method in class org.apache.crunch.test.Employee: Sets the value of the 'department' field.
setMessage(String) - Method in class org.apache.crunch.PipelineCallable: Sets a message associated with this callable's execution, especially in case of errors.
setName(CharSequence) - Method in class org.apache.crunch.test.Employee.Builder: Sets the value of the 'name' field.
setName(CharSequence) - Method in class org.apache.crunch.test.Employee: Sets the value of the 'name' field.
setName(CharSequence) - Method in class org.apache.crunch.test.Person.Builder: Sets the value of the 'name' field.
setName(CharSequence) - Method in class org.apache.crunch.test.Person: Sets the value of the 'name' field.
setOffset(long) - Method in class org.apache.crunch.kafka.offset.hdfs.Offsets.PartitionOffset.Builder: Set the offset for the partition offset being built.
setOffsets(List<Offsets.PartitionOffset>) - Method in class org.apache.crunch.kafka.offset.hdfs.Offsets.Builder: Sets the collection of offsets.
setPartition(int) - Method in class org.apache.crunch.kafka.offset.hdfs.Offsets.PartitionOffset.Builder: Set the partition for the partition offset being built.
setPartitionFile(Configuration, Path) - Static method in class org.apache.crunch.lib.sort.TotalOrderPartitioner
setSalary(int) - Method in class org.apache.crunch.test.Employee.Builder: Sets the value of the 'salary' field.
setSalary(Integer) - Method in class org.apache.crunch.test.Employee: Sets the value of the 'salary' field.
setSiblingnames(List<CharSequence>) - Method in class org.apache.crunch.test.Person.Builder: Sets the value of the 'siblingnames' field.
setSiblingnames(List<CharSequence>) - Method in class org.apache.crunch.test.Person: Sets the value of the 'siblingnames' field.
setSpecificClassLoader(ClassLoader) - Static method in class org.apache.crunch.types.avro.AvroMode: Set the ClassLoader that will be used for loading Avro org.apache.avro.specific.SpecificRecord and reflection implementation classes.
setTopic(String) - Method in class org.apache.crunch.kafka.offset.hdfs.Offsets.PartitionOffset.Builder: Set the topic for the partition offset being built.
setValue(long) - Method in class org.apache.hadoop.mapred.SparkCounter
SFlatMapFunction<T,R> - Class in org.apache.crunch.fn: A Crunch-compatible abstract base class for Spark's FlatMapFunction.
SFlatMapFunction() - Constructor for class org.apache.crunch.fn.SFlatMapFunction
SFlatMapFunction2<K,V,R> - Class in org.apache.crunch.fn: A Crunch-compatible abstract base class for Spark's FlatMapFunction2.
SFlatMapFunction2() - Constructor for class org.apache.crunch.fn.SFlatMapFunction2
SFunction<T,R> - Class in org.apache.crunch.fn: A Crunch-compatible abstract base class for Spark's Function.
SFunction() - Constructor for class org.apache.crunch.fn.SFunction
SFunction<S,T> - Interface in org.apache.crunch.lambda.fn: Serializable version of the Java Function functional interface.
SFunction2<K,V,R> - Class in org.apache.crunch.fn: A Crunch-compatible abstract base class for Spark's Function2.
SFunction2() - Constructor for class org.apache.crunch.fn.SFunction2
SFunctions - Class in org.apache.crunch.fn: Utility methods for wrapping existing Spark Java API Functions for Crunch compatibility.
Shard - Class in org.apache.crunch.lib: Utilities for controlling how the data in a PCollection is balanced across reducers and output files.
Shard() - Constructor for class org.apache.crunch.lib.Shard
shard(PCollection<T>, int) - Static method in class org.apache.crunch.lib.Shard: Creates a PCollection<T> that has the same contents as its input argument but will be written to a fixed number of output files.
ShardedJoinStrategy<K,U,V> - Class in org.apache.crunch.lib.join: JoinStrategy that splits the key space up into shards.
ShardedJoinStrategy(int) - Constructor for class org.apache.crunch.lib.join.ShardedJoinStrategy: Instantiate with a constant number of shards to use for all keys.
ShardedJoinStrategy(int, int) - Constructor for class org.apache.crunch.lib.join.ShardedJoinStrategy: Instantiate with a constant number of shards to use for all keys.
ShardedJoinStrategy(ShardedJoinStrategy.ShardingStrategy<K>) - Constructor for class org.apache.crunch.lib.join.ShardedJoinStrategy: Instantiate with a custom sharding strategy.
ShardedJoinStrategy(ShardedJoinStrategy.ShardingStrategy<K>, int) - Constructor for class org.apache.crunch.lib.join.ShardedJoinStrategy: Instantiate with a custom sharding strategy and a specified number of reducers.
ShardedJoinStrategy.ShardingStrategy<K> - Interface in org.apache.crunch.lib.join: Determines over how many shards a key will be split in a sharded join.
siblingnames - Variable in class org.apache.crunch.test.Person: Deprecated.
SimpleAggregator() - Constructor for class org.apache.crunch.fn.Aggregators.SimpleAggregator
SingleKeyFn(int) - Constructor for class org.apache.crunch.lib.sort.SortFns.SingleKeyFn
SingleUseIterable<T> - Class in org.apache.crunch.impl: Wrapper around a Reducer's input Iterable.
SingleUseIterable(Iterable<T>) - Constructor for class org.apache.crunch.impl.SingleUseIterable: Instantiate around an Iterable that may only be used once.
size() - Method in class org.apache.crunch.Pair
size() - Method in interface org.apache.crunch.Tuple: Returns the number of elements in this Tuple.
size() - Method in class org.apache.crunch.Tuple3
size() - Method in class org.apache.crunch.Tuple4
size() - Method in class org.apache.crunch.TupleN
size() - Method in class org.apache.crunch.types.writable.TupleWritable: The number of children in this Tuple.
skip(String) - Method in class org.apache.crunch.contrib.text.TokenizerFactory.Builder: Sets the regular expression that determines which input characters should be ignored by the Scanner that is returned by the constructed TokenizerFactory.
smearHash(int) - Static method in class org.apache.crunch.util.HashUtil: Applies a supplemental hashing function to an integer, increasing variability in lower-order bits.
snappy(T) - Static method in class org.apache.crunch.io.Compress: Configure the given output target to be compressed using Snappy.
Sort - Class in org.apache.crunch.lib: Utilities for sorting PCollection instances.
Sort() - Constructor for class org.apache.crunch.lib.Sort
sort(PCollection<T>) - Static method in class org.apache.crunch.lib.Sort: Sorts the PCollection using the natural ordering of its elements in ascending order.
sort(PCollection<T>, Sort.Order) - Static method in class org.apache.crunch.lib.Sort: Sorts the PCollection using the natural order of its elements with the given Order.
sort(PCollection<T>, int, Sort.Order) - Static method in class org.apache.crunch.lib.Sort: Sorts the PCollection using the natural ordering of its elements in the order specified using the given number of reducers.
sort(PTable<K, V>) - Static method in class org.apache.crunch.lib.Sort: Sorts the PTable using the natural ordering of its keys in ascending order.
sort(PTable<K, V>, Sort.Order) - Static method in class org.apache.crunch.lib.Sort: Sorts the PTable using the natural ordering of its keys with the given Order.
sort(PTable<K, V>, int, Sort.Order) - Static method in class org.apache.crunch.lib.Sort: Sorts the PTable using the natural ordering of its keys in the order specified with a client-specified number of reducers.
Sort.ColumnOrder - Class in org.apache.crunch.lib: To sort by column 2 ascending then column 1 descending, you would use: sortPairs(coll, by(2, ASCENDING), by(1, DESCENDING)). Column numbering is 1-based.
Sort.Order - Enum in org.apache.crunch.lib: For signaling the order in which a sort should be done.
sortAndApply(PTable<K, Pair<V1, V2>>, DoFn<Pair<K, Iterable<Pair<V1, V2>>>, T>, PType<T>) - Static method in class org.apache.crunch.lib.SecondarySort: Perform a secondary sort on the given PTable instance and then apply a DoFn to the resulting sorted data to yield an output PCollection<T>.
sortAndApply(PTable<K, Pair<V1, V2>>, DoFn<Pair<K, Iterable<Pair<V1, V2>>>, T>, PType<T>, int) - Static method in class org.apache.crunch.lib.SecondarySort: Perform a secondary sort on the given PTable instance and then apply a DoFn to the resulting sorted data to yield an output PCollection<T>, using the given number of reducers.
sortAndApply(PTable<K, Pair<V1, V2>>, DoFn<Pair<K, Iterable<Pair<V1, V2>>>, Pair<U, V>>, PTableType<U, V>) - Static method in class org.apache.crunch.lib.SecondarySort: Perform a secondary sort on the given PTable instance and then apply a DoFn to the resulting sorted data to yield an output PTable<U, V>.
sortAndApply(PTable<K, Pair<V1, V2>>, DoFn<Pair<K, Iterable<Pair<V1, V2>>>, Pair<U, V>>, PTableType<U, V>, int) - Static method in class org.apache.crunch.lib.SecondarySort: Perform a secondary sort on the given PTable instance and then apply a DoFn to the resulting sorted data to yield an output PTable<U, V>, using the given number of reducers.
sortComparatorClass(Class<? extends RawComparator>) - Method in class org.apache.crunch.GroupingOptions.Builder
SortExample - Class in org.apache.crunch.examples: Simple Crunch tool for running sorting examples from the command line.
SortExample() - Constructor for class org.apache.crunch.examples.SortExample
SortFns - Class in org.apache.crunch.lib.sort: A set of DoFns that are used by Crunch's Sort library.
SortFns() - Constructor for class org.apache.crunch.lib.sort.SortFns
SortFns.AvroGenericFn<V extends Tuple> - Class in org.apache.crunch.lib.sort: Pulls a composite set of keys from an Avro GenericRecord instance.
SortFns.KeyExtraction<V extends Tuple> - Class in org.apache.crunch.lib.sort: Utility class for encapsulating key extraction logic and serialization information about key extraction.
SortFns.SingleKeyFn<V extends Tuple,K> - Class in org.apache.crunch.lib.sort: Extracts a single indexed key from a Tuple instance.
SortFns.TupleKeyFn<V extends Tuple,K extends Tuple> - Class in org.apache.crunch.lib.sort: Extracts a composite key from a Tuple instance.
sortPairs(PCollection<Pair<U, V>>, Sort.ColumnOrder...) - Static method in class org.apache.crunch.lib.Sort: Sorts the PCollection of Pairs using the specified column ordering.
sortQuads(PCollection<Tuple4<V1, V2, V3, V4>>, Sort.ColumnOrder...) - Static method in class org.apache.crunch.lib.Sort: Sorts the PCollection of Tuple4s using the specified column ordering.
sortTriples(PCollection<Tuple3<V1, V2, V3>>, Sort.ColumnOrder...) - Static method in class org.apache.crunch.lib.Sort: Sorts the PCollection of Tuple3s using the specified column ordering.
sortTuples(PCollection<T>, Sort.ColumnOrder...) - Static method in class org.apache.crunch.lib.Sort: Sorts the PCollection of tuples using the specified column ordering.
sortTuples(PCollection<T>, int, Sort.ColumnOrder...) - Static method in class org.apache.crunch.lib.Sort: Sorts the PCollection of TupleNs using the specified column ordering and a client-specified number of reducers.
Source<T> - Interface in org.apache.crunch: A Source represents an input data set that is an input to one or more MapReduce jobs.
sources(Source<?>...) - Method in class org.apache.crunch.ParallelDoOptions.Builder
sources(Collection<Source<?>>) - Method in class org.apache.crunch.ParallelDoOptions.Builder
sourceTarget(SourceTarget<?>) - Method in class org.apache.crunch.GroupingOptions.Builder: Deprecated.
SourceTarget<T> - Interface in org.apache.crunch: An interface for classes that implement both the Source and the Target interfaces.
SourceTargetHelper - Class in org.apache.crunch.io: Functions for configuring the inputs/outputs of MapReduce jobs.
SourceTargetHelper() - Constructor for class org.apache.crunch.io.SourceTargetHelper
sourceTargets(SourceTarget<?>...) - Method in class org.apache.crunch.GroupingOptions.Builder
sourceTargets(Collection<SourceTarget<?>>) - Method in class org.apache.crunch.GroupingOptions.Builder
sourceTargets(SourceTarget<?>...) - Method in class org.apache.crunch.ParallelDoOptions.Builder
sourceTargets(Collection<SourceTarget<?>>) - Method in class org.apache.crunch.ParallelDoOptions.Builder
SPairFlatMapFunction<T,K,V> - Class in org.apache.crunch.fn: A Crunch-compatible abstract base class for Spark's PairFlatMapFunction.
SPairFlatMapFunction() - Constructor for class org.apache.crunch.fn.SPairFlatMapFunction
SPairFunction<T,K,V> - Class in org.apache.crunch.fn: A Crunch-compatible abstract base class for Spark's PairFunction.
SPairFunction() - Constructor for class org.apache.crunch.fn.SPairFunction
SparkCollectFactory - Class in org.apache.crunch.impl.spark.collect
SparkCollectFactory() - Constructor for class org.apache.crunch.impl.spark.collect.SparkCollectFactory
SparkCollection - Interface in org.apache.crunch.impl.spark
SparkComparator - Class in org.apache.crunch.impl.spark
SparkComparator(GroupingOptions, PGroupedTableType, SparkRuntimeContext) - Constructor for class org.apache.crunch.impl.spark.SparkComparator
SparkCounter - Class in org.apache.hadoop.mapred
SparkCounter(String, String, Accumulator<Map<String, Map<String, Long>>>) - Constructor for class org.apache.hadoop.mapred.SparkCounter
SparkCounter(String, String, long) - Constructor for class org.apache.hadoop.mapred.SparkCounter
SparkPartitioner - Class in org.apache.crunch.impl.spark
SparkPartitioner(int) - Constructor for class org.apache.crunch.impl.spark.SparkPartitioner
SparkPipeline - Class in org.apache.crunch.impl.spark
SparkPipeline(String, String) - Constructor for class org.apache.crunch.impl.spark.SparkPipeline
SparkPipeline(String, String, Class<?>) - Constructor for class org.apache.crunch.impl.spark.SparkPipeline
SparkPipeline(String, String, Class<?>, Configuration) - Constructor for class org.apache.crunch.impl.spark.SparkPipeline
SparkPipeline(JavaSparkContext, String) - Constructor for class org.apache.crunch.impl.spark.SparkPipeline
SparkPipeline(JavaSparkContext, String, Class<?>, Configuration) - Constructor for class org.apache.crunch.impl.spark.SparkPipeline
SparkRuntime - Class in org.apache.crunch.impl.spark
SparkRuntime(SparkPipeline, JavaSparkContext, Configuration, Map<PCollectionImpl<?>, Set<Target>>, Map<PCollectionImpl<?>, MaterializableIterable>, Map<PCollection<?>, StorageLevel>, Map<PipelineCallable<?>, Set<Target>>) - Constructor for class org.apache.crunch.impl.spark.SparkRuntime
SparkRuntimeContext - Class in org.apache.crunch.impl.spark
SparkRuntimeContext(String, Accumulator<Map<String, Map<String, Long>>>, Broadcast<byte[]>) - Constructor for class org.apache.crunch.impl.spark.SparkRuntimeContext
SPECIFIC - Static variable in class org.apache.crunch.types.avro.AvroMode: Default mode to use for reading and writing Specific types.
specifics(Class<T>) - Static method in class org.apache.crunch.types.avro.Avros
split(PCollection<Pair<T, U>>) - Static method in class org.apache.crunch.lib.Channels: Splits a PCollection of any Pair of objects into a Pair of PCollections, to allow for the output of a DoFn to be handled using separate channels.
split(PCollection<Pair<T, U>>, PType<T>, PType<U>) - Static method in class org.apache.crunch.lib.Channels: Splits a PCollection of any Pair of objects into a Pair of PCollections, to allow for the output of a DoFn to be handled using separate channels.
SPredicate<T> - Interface in org.apache.crunch.lambda.fn: Serializable version of the Java Predicate functional interface.
SSupplier<T> - Interface in org.apache.crunch.lambda.fn: Serializable version of the Java Supplier functional interface.
StageResult(String, Counters) - Constructor for class org.apache.crunch.PipelineResult.StageResult
StageResult(String, Counters, long, long) - Constructor for class org.apache.crunch.PipelineResult.StageResult
StageResult(String, String, Counters, long, long, long, long) - Constructor for class org.apache.crunch.PipelineResult.StageResult
status - Variable in class org.apache.crunch.PipelineResult
STRING_CONCAT(String, boolean) - Static method in class org.apache.crunch.fn.Aggregators: Concatenate strings, with a separator between strings.
STRING_CONCAT(String, boolean, long, long) - Static method in class org.apache.crunch.fn.Aggregators: Concatenate strings, with a separator between strings.
STRING_TO_UTF8 - Static variable in class org.apache.crunch.types.avro.Avros
strings() - Static method in class org.apache.crunch.types.avro.Avros
strings() - Method in class org.apache.crunch.types.avro.AvroTypeFamily
strings() - Method in interface org.apache.crunch.types.PTypeFamily
strings() - Static method in class org.apache.crunch.types.writable.Writables
strings() - Method in class org.apache.crunch.types.writable.WritableTypeFamily
succeeded() - Method in class org.apache.crunch.PipelineResult
SUM_BIGDECIMALS() - Static method in class org.apache.crunch.fn.Aggregators: Sum up all BigDecimal values.
SUM_BIGINTS() - Static method in class org.apache.crunch.fn.Aggregators: Sum up all BigInteger values.
SUM_DOUBLES() - Static method in class org.apache.crunch.fn.Aggregators: Sum up all double values.
SUM_FLOATS() - Static method in class org.apache.crunch.fn.Aggregators: Sum up all float values.
SUM_INTS() - Static method in class org.apache.crunch.fn.Aggregators: Sum up all int values.
SUM_LONGS() - Static method in class org.apache.crunch.fn.Aggregators: Sum up all long values.
SwapFn<V1,V2> - Class in org.apache.crunch.fn: Swap the elements of a Pair type.
SwapFn() - Constructor for class org.apache.crunch.fn.SwapFn
swapKeyValue(PTable<K, V>) - Static method in class org.apache.crunch.lib.PTables: Swap the key and value part of a table.

T

tableOf(S, T, Object...) - Static method in class org.apache.crunch.impl.mem.MemPipeline
tableOf(Iterable<Pair<S, T>>) - Static method in class org.apache.crunch.impl.mem.MemPipeline
tableOf(PType<K>, PType<V>) - Static method in class org.apache.crunch.types.avro.Avros: A table type with an Avro type as key and as value.
tableOf(PType<K>, PType<V>) - Method in class org.apache.crunch.types.avro.AvroTypeFamily
tableOf(PType<K>, PType<V>) - Method in interface org.apache.crunch.types.PTypeFamily
tableOf(PType<K>, PType<V>) - Static method in class org.apache.crunch.types.writable.Writables
tableOf(PType<K>, PType<V>) - Method in class org.apache.crunch.types.writable.WritableTypeFamily
TableSource<K,V> - Interface in org.apache.crunch: The interface for Source implementations that return a PTable.
TableSourceTarget<K,V> - Interface in org.apache.crunch: An interface for classes that implement both the TableSource and the Target interfaces.
tableType(PTableType<K, V>) - Static method in class org.apache.crunch.fn.SwapFn
tagExistingKafkaConnectionProperties(Properties) - Static method in class org.apache.crunch.kafka.inputformat.KafkaInputFormat: Generates a Properties object containing the properties in connectionProperties, but with every property prefixed with "org.apache.crunch.kafka.connection.properties".
Target - Interface in org.apache.crunch: A Target represents the output destination of a Crunch PCollection in the context of a Crunch job.
Target.WriteMode - Enum in org.apache.crunch: An enum to represent different options the client may specify for handling the case where the output path, table, etc.
targets(Target...) - Method in class org.apache.crunch.ParallelDoOptions.Builder
targets(Collection<Target>) - Method in class org.apache.crunch.ParallelDoOptions.Builder
tempDir - Variable in class org.apache.crunch.test.CrunchTestSupport
TemporaryPath - Class in org.apache.crunch.test: Creates a temporary directory for a test case and destroys it afterwards.
TemporaryPath(String...) - Constructor for class org.apache.crunch.test.TemporaryPath: Construct TemporaryPath.
TestCounters - Class in org.apache.crunch.test: A utility class used during unit testing to update and read counters.
TestCounters() - Constructor for class org.apache.crunch.test.TestCounters
textFile(String) - Static method in class org.apache.crunch.io.At: Creates a SourceTarget<String> instance for the text file(s) at the given path name.
textFile(Path) - Static method in class org.apache.crunch.io.At: Creates a SourceTarget<String> instance for the text file(s) at the given Path.
textFile(String, PType<T>) - Static method in class org.apache.crunch.io.At: Creates a SourceTarget<T> instance for the text file(s) at the given path name using the provided PType<T> to convert the input text.
textFile(Path, PType<T>) - Static method in class org.apache.crunch.io.At: Creates a SourceTarget<T> instance for the text file(s) at the given Path using the provided PType<T> to convert the input text.
textFile(String) - Static method in class org.apache.crunch.io.From: Creates a Source<String> instance for the text file(s) at the given path name.
textFile(Path) - Static method in class org.apache.crunch.io.From: Creates a Source<String> instance for the text file(s) at the given Path.
textFile(List<Path>) - Static method in class org.apache.crunch.io.From: Creates a Source<String> instance for the text file(s) at the given Paths.
textFile(String, PType<T>) - Static method in class org.apache.crunch.io.From: Creates a Source<T> instance for the text file(s) at the given path name using the provided PType<T> to convert the input text.
textFile(Path, PType<T>) - Static method in class org.apache.crunch.io.From: Creates a Source<T> instance for the text file(s) at the given Path using the provided PType<T> to convert the input text.
textFile(List<Path>, PType<T>) - Static method in class org.apache.crunch.io.From: Creates a Source<T> instance for the text file(s) at the given Paths using the provided PType<T> to convert the input text.
textFile(String) - Static method in class org.apache.crunch.io.To: Creates a Target at the given path name that writes data to text files.
textFile(Path) - Static method in class org.apache.crunch.io.To: Creates a Target at the given Path that writes data to text files.
third() - Method in class org.apache.crunch.Tuple3
third() - Method in class org.apache.crunch.Tuple4
thrifts(Class<T>, PTypeFamily) - Static method in class org.apache.crunch.types.PTypes: Constructs a PType for a Thrift record.
To - Class in org.apache.crunch.io: Static factory methods for creating common Target types.
To() - Constructor for class org.apache.crunch.io.To
ToByteArrayFunction - Class in org.apache.crunch.impl.spark.collect
ToByteArrayFunction() - Constructor for class org.apache.crunch.impl.spark.collect.ToByteArrayFunction
toBytes(T) - Method in class org.apache.crunch.impl.spark.serde.AvroSerDe
toBytes(T) - Method in interface org.apache.crunch.impl.spark.serde.SerDe
toBytes(Writable) - Method in class org.apache.crunch.impl.spark.serde.WritableSerDe
toCombineFn(Aggregator<V>) - Static method in class org.apache.crunch.fn.Aggregators: Deprecated. Use the safer Aggregators.toCombineFn(Aggregator, PType) instead.
toCombineFn(Aggregator<V>, PType<V>) - Static method in class org.apache.crunch.fn.Aggregators: Wrap a CombineFn adapter around the given aggregator.
Tokenizer - Class in org.apache.crunch.contrib.text: Manages a Scanner instance and provides support for returning only a subset of the fields returned by the underlying Scanner.
Tokenizer(Scanner, Set<Integer>, boolean) - Constructor for class org.apache.crunch.contrib.text.Tokenizer: Create a new Tokenizer instance.
TokenizerFactory - Class in org.apache.crunch.contrib.text: Factory class that constructs Tokenizer instances for input strings that use a fixed set of delimiters, skip patterns, locales, and sets of indices to keep or drop.
TokenizerFactory.Builder - Class in org.apache.crunch.contrib.text: A class for constructing new TokenizerFactory instances using the Builder pattern.
top(int) - Method in class org.apache.crunch.impl.dist.collect.PTableBase
top(PTable<K, V>, int, boolean) - Static method in class org.apache.crunch.lib.Aggregate: Selects the top N pairs from the given table, with sorting being performed on the values.
top(int) - Method in interface org.apache.crunch.PTable: Returns a PTable made up of the pairs in this PTable with the largest value field.
TopKCombineFn(int, boolean, PType<Pair<K, V>>) - Constructor for class org.apache.crunch.lib.Aggregate.TopKCombineFn
TopKFn(int, boolean, PType<Pair<K, V>>) - Constructor for class org.apache.crunch.lib.Aggregate.TopKFn
TopList - Class in org.apache.crunch.lib: Tools for creating top lists of items in PTables and PCollections
TopList() - Constructor for class org.apache.crunch.lib.TopList
topNYbyX(PTable<X, Y>, int) - Static method in class org.apache.crunch.lib.TopList: Create a top-list of elements in the provided PTable, categorised by the key of the input table and using the count of the value part of the input table.
toString() - Method in class org.apache.crunch.impl.dist.collect.PCollectionImpl
toString() - Method in class org.apache.crunch.kafka.inputformat.KafkaInputSplit
toString() - Method in class org.apache.crunch.kafka.KafkaSource
toString() - Method in class org.apache.crunch.lib.Sort.ColumnOrder
toString() - Method in class org.apache.crunch.Pair
toString() - Method in class org.apache.crunch.Tuple3
toString() - Method in class org.apache.crunch.Tuple4
toString() - Method in class org.apache.crunch.TupleN
toString() - Method in class org.apache.crunch.types.writable.TupleWritable: Convert Tuple to String as in the following.
TotalBytesByIP - Class in org.apache.crunch.examples
TotalBytesByIP() - Constructor for class org.apache.crunch.examples.TotalBytesByIP
TotalOrderPartitioner<K,V> - Class in org.apache.crunch.lib.sort: A partition-aware Partitioner instance that can work with either Avro or Writable-formatted keys.
TotalOrderPartitioner() - Constructor for class org.apache.crunch.lib.sort.TotalOrderPartitioner
TotalOrderPartitioner.BinarySearchNode<K> - Class in org.apache.crunch.lib.sort
TotalOrderPartitioner.Node<T> - Interface in org.apache.crunch.lib.sort: Interface to the partitioner to locate a key in the partition keyset.
TotalWordCount - Class in org.apache.crunch.examples
TotalWordCount() - Constructor for class org.apache.crunch.examples.TotalWordCount
tripAggregator(Aggregator<V1>, Aggregator<V2>, Aggregator<V3>) - Static method in class org.apache.crunch.fn.Aggregators: Apply separate aggregators to each component of a Tuple3.
TripIterable(Iterable<A>, Iterable<B>, Iterable<C>) - Constructor for class org.apache.crunch.util.Tuples.TripIterable
triples(PType<V1>, PType<V2>, PType<V3>) - Static method in class org.apache.crunch.types.avro.Avros
triples(PType<V1>, PType<V2>, PType<V3>) - Method in class org.apache.crunch.types.avro.AvroTypeFamily
triples(PType<V1>, PType<V2>, PType<V3>) - Method in interface org.apache.crunch.types.PTypeFamily
triples(PType<V1>, PType<V2>, PType<V3>) - Static method in class org.apache.crunch.types.writable.Writables
triples(PType<V1>, PType<V2>, PType<V3>) - Method in class org.apache.crunch.types.writable.WritableTypeFamily
Tuple - Interface in org.apache.crunch: A fixed-size collection of Objects, used in Crunch for representing joins between PCollections.
Tuple2MapFunction<K,V> - Class in org.apache.crunch.impl.spark.fn
Tuple2MapFunction(MapFn<Pair<K, V>, Pair<K, V>>, SparkRuntimeContext) - Constructor for class org.apache.crunch.impl.spark.fn.Tuple2MapFunction
tuple2PairFunc() - Static method in class org.apache.crunch.impl.spark.GuavaUtils
Tuple3<V1,V2,V3> - Class in org.apache.crunch: A convenience class for three-element Tuples.
Tuple3(V1, V2, V3) - Constructor for class org.apache.crunch.Tuple3
TUPLE3 - Static variable in class org.apache.crunch.types.TupleFactory
Tuple3.Collect<V1,V2,V3> - Class in org.apache.crunch
Tuple4<V1,V2,V3,V4> - Class in org.apache.crunch: A convenience class for four-element Tuples.
Tuple4(V1, V2, V3, V4) - Constructor for class org.apache.crunch.Tuple4
TUPLE4 - Static variable in class org.apache.crunch.types.TupleFactory
Tuple4.Collect<V1,V2,V3,V4> - Class in org.apache.crunch
tupleAggregator(Aggregator<?>...) - Static method in class org.apache.crunch.fn.Aggregators: Apply separate aggregators to each component of a Tuple.
TupleDeepCopier<T extends Tuple> - Class in org.apache.crunch.types: Performs deep copies (based on underlying PType deep copying) of Tuple-based objects.
TupleDeepCopier(Class<T>, PType...) - Constructor for class org.apache.crunch.types.TupleDeepCopier
TupleFactory<T extends Tuple> - Class in org.apache.crunch.types
TupleFactory() - Constructor for class org.apache.crunch.types.TupleFactory
TupleKeyFn(int[], TupleFactory) - Constructor for class org.apache.crunch.lib.sort.SortFns.TupleKeyFn
TupleN - Class in org.apache.crunch: A Tuple instance for an arbitrary number of values.
TupleN(Object...) - Constructor for class org.apache.crunch.TupleN
TUPLEN - Static variable in class org.apache.crunch.types.TupleFactory
TupleNIterable(Iterable<?>...) - Constructor for class org.apache.crunch.util.Tuples.TupleNIterable
TupleObjectInspector<T extends Tuple> - Class in org.apache.crunch.types.orc: An object inspector to define the structure of Crunch Tuples
TupleObjectInspector(TupleFactory<T>, PType...) - Constructor for class org.apache.crunch.types.orc.TupleObjectInspector
tuples(PType...) - Static method in class org.apache.crunch.types.avro.Avros
tuples(Class<T>, PType...) - Static method in class org.apache.crunch.types.avro.Avros
tuples(PType<?>...) - Method in class org.apache.crunch.types.avro.AvroTypeFamily
tuples(Class<T>, PType<?>...) - Method in class org.apache.crunch.types.avro.AvroTypeFamily
tuples(PType...) - Static method in class org.apache.crunch.types.orc.Orcs: Create a tuple-based PType.
tuples(PType<?>...) - Method in interface org.apache.crunch.types.PTypeFamily
tuples(Class<T>, PType<?>...) - Method in interface org.apache.crunch.types.PTypeFamily
tuples(PType...) - Static method in class org.apache.crunch.types.writable.Writables
tuples(Class<T>, PType...) - Static method in class org.apache.crunch.types.writable.Writables
tuples(PType<?>...) - Method in class org.apache.crunch.types.writable.WritableTypeFamily
tuples(Class<T>, PType<?>...) - Method in class org.apache.crunch.types.writable.WritableTypeFamily
Tuples - Class in org.apache.crunch.util: Utilities for working with subclasses of the Tuple interface.
Tuples() - Constructor for class org.apache.crunch.util.Tuples
Tuples.PairIterable<S,T> - Class in org.apache.crunch.util
Tuples.QuadIterable<A,B,C,D> - Class in org.apache.crunch.util
Tuples.TripIterable<A,B,C> - Class in org.apache.crunch.util
Tuples.TupleNIterable - Class in org.apache.crunch.util
TupleWritable - Class in org.apache.crunch.types.writable: A serialization format for Tuple.
TupleWritable() - Constructor for class org.apache.crunch.types.writable.TupleWritable: Create an empty tuple with no allocated storage for writables.
TupleWritable(Writable[]) - Constructor for class org.apache.crunch.types.writable.TupleWritable
TupleWritable(Writable[], int[]) - Constructor for class org.apache.crunch.types.writable.TupleWritable: Initialize tuple with storage; unknown whether any of them contain "written" values.
TupleWritable.Comparator - Class in org.apache.crunch.types.writable
TupleWritableComparator() - Constructor for class org.apache.crunch.lib.join.JoinUtils.TupleWritableComparator
TupleWritableComparator - Class in org.apache.crunch.lib.sort
TupleWritableComparator() - Constructor for class org.apache.crunch.lib.sort.TupleWritableComparator
TupleWritablePartitioner() - Constructor for class org.apache.crunch.lib.join.JoinUtils.TupleWritablePartitioner
typedCollectionOf(PType<T>, T...) - Static method in class org.apache.crunch.impl.mem.MemPipeline
typedCollectionOf(PType<T>, Iterable<T>) - Static method in class org.apache.crunch.impl.mem.MemPipeline
typedTableOf(PTableType<S, T>, S, T, Object...) - Static method in class org.apache.crunch.impl.mem.MemPipeline
typedTableOf(PTableType<S, T>, Iterable<Pair<S, T>>) - Static method in class org.apache.crunch.impl.mem.MemPipeline

U

underlying() - Method in interface org.apache.crunch.lambda.LCollection: Get the underlying PCollection for this LCollection
underlying() - Method in interface org.apache.crunch.lambda.LGroupedTable: Get the underlying PGroupedTable for this LGroupedTable
underlying() - Method in interface org.apache.crunch.lambda.LTable: Get the underlying PTable for this LTable
ungroup() - Method in class org.apache.crunch.impl.dist.collect.BaseGroupedTable
ungroup() - Method in interface org.apache.crunch.lambda.LGroupedTable: Ungroup this LGroupedTable back into an LTable.
ungroup() - Method in interface org.apache.crunch.PGroupedTable: Convert this grouping back into a multimap.
union(PCollection<S>) - Method in class org.apache.crunch.impl.dist.collect.PCollectionImpl
union(PCollection<S>...) - Method in class org.apache.crunch.impl.dist.collect.PCollectionImpl
union(PTable<K, V>) - Method in class org.apache.crunch.impl.dist.collect.PTableBase
union(PTable<K, V>...) - Method in class org.apache.crunch.impl.dist.collect.PTableBase
union(List<PCollection<S>>) - Method in class org.apache.crunch.impl.dist.DistributedPipeline
union(List<PCollection<S>>) - Method in class org.apache.crunch.impl.mem.MemPipeline
union(LCollection<S>) - Method in interface org.apache.crunch.lambda.LCollection: Union this LCollection with another LCollection of the same type
union(PCollection<S>) - Method in interface org.apache.crunch.lambda.LCollection: Union this LCollection with a PCollection of the same type
union(LTable<K, V>) - Method in interface org.apache.crunch.lambda.LTable
union(PTable<K, V>) - Method in interface org.apache.crunch.lambda.LTable
union(PCollection<S>) - Method in interface org.apache.crunch.PCollection: Returns a PCollection instance that acts as the union of this PCollection and the given PCollection.
union(PCollection<S>...) - Method in interface org.apache.crunch.PCollection: Returns a PCollection instance that acts as the union of this PCollection and the input PCollections.
union(List<PCollection<S>>) - Method in interface org.apache.crunch.Pipeline
union(PTable<K, V>) - Method in interface org.apache.crunch.PTable: Returns a PTable instance that acts as the union of this PTable and the given PTable.
union(PTable<K, V>...) - Method in interface org.apache.crunch.PTable: Returns a PTable instance that acts as the union of this PTable and the input PTables.
Union - Class in org.apache.crunch: Allows us to represent the combination of multiple data sources that may contain different types of data as a single type with an index to indicate which of the original sources the current record was from.
Union(int, Object) - Constructor for class org.apache.crunch.Union
UnionCollection<S> - Class in org.apache.crunch.impl.spark.collect
UnionDeepCopier - Class in org.apache.crunch.types
UnionDeepCopier(PType...) - Constructor for class org.apache.crunch.types.UnionDeepCopier
unionOf(PType<?>...) - Static method in class org.apache.crunch.types.avro.Avros
unionOf(PType<?>...) - Method in class org.apache.crunch.types.avro.AvroTypeFamily
unionOf(PType<?>...) - Method in interface org.apache.crunch.types.PTypeFamily
unionOf(PType<?>...) - Static method in class org.apache.crunch.types.writable.Writables
unionOf(PType<?>...) - Method in class org.apache.crunch.types.writable.WritableTypeFamily
UnionReadableData<T> - Class in org.apache.crunch.util
UnionReadableData(List<ReadableData<T>>) - Constructor for class org.apache.crunch.util.UnionReadableData
UnionTable<K,V> - Class in org.apache.crunch.impl.spark.collect
unionTables(List<PTable<K, V>>) - Method in class org.apache.crunch.impl.dist.DistributedPipeline
unionTables(List<PTable<K, V>>) - Method in class org.apache.crunch.impl.mem.MemPipeline
unionTables(List<PTable<K, V>>) - Method in interface org.apache.crunch.Pipeline
UnionWritable - Class in org.apache.crunch.types.writable
UnionWritable() - Constructor for class org.apache.crunch.types.writable.UnionWritable
UnionWritable(int, BytesWritable) - Constructor for class org.apache.crunch.types.writable.UnionWritable
UNIQUE_ELEMENTS() - Static method in class org.apache.crunch.fn.Aggregators: Collect the unique elements of the input, as defined by the equals method for the input objects.
update(T) - Method in interface org.apache.crunch.Aggregator: Incorporate the given value into the aggregate state maintained by this instance.
update(V) - Method in class org.apache.crunch.lambda.LAggregator
useDisk(boolean) - Method in class org.apache.crunch.CachingOptions.Builder
useDisk() - Method in class org.apache.crunch.CachingOptions: Whether the framework may cache data on disk.
useMemory(boolean) - Method in class org.apache.crunch.CachingOptions.Builder
useMemory() - Method in class org.apache.crunch.CachingOptions: Whether the framework may cache data in memory without writing it to disk.
UTF8_TO_STRING - Static variable in class org.apache.crunch.types.avro.Avros
uuid(PTypeFamily) - Static method in class org.apache.crunch.types.PTypes: A PType for Java's UUID type.

V

value - Variable in class org.apache.crunch.impl.spark.ByteArray
valueClass - Variable in class org.apache.crunch.io.CrunchOutputs.OutputConfig
valueOf(String) - Static method in enum org.apache.crunch.impl.mr.MRJob.State: Returns the enum constant of this type with the specified name.
valueOf(String) - Static method in enum org.apache.crunch.lib.join.JoinType: Returns the enum constant of this type with the specified name.
valueOf(String) - Static method in enum org.apache.crunch.lib.Sort.Order: Returns the enum constant of this type with the specified name.
valueOf(String) - Static method in enum org.apache.crunch.PipelineCallable.Status: Returns the enum constant of this type with the specified name.
valueOf(String) - Static method in enum org.apache.crunch.PipelineExecution.Status: Returns the enum constant of this type with the specified name.
valueOf(String) - Static method in enum org.apache.crunch.Target.WriteMode: Returns the enum constant of this type with the specified name.
valueOf(String) - Static method in enum org.apache.crunch.types.avro.AvroMode.ModeType: Returns the enum constant of this type with the specified name.
valueOf(String) - Static method in enum org.apache.crunch.types.avro.AvroType.AvroRecordType: Returns the enum constant of this type with the specified name.
values() - Method in class org.apache.crunch.impl.dist.collect.PTableBase
values() - Static method in enum org.apache.crunch.impl.mr.MRJob.State: Returns an array containing the constants of this enum type, in the order they are declared.
values() - Method in interface org.apache.crunch.lambda.LTable: Get an LCollection containing just the values from this table
values() - Static method in enum org.apache.crunch.lib.join.JoinType: Returns an array containing the constants of this enum type, in the order they are declared.
values(PTable<K, V>) - Static method in class org.apache.crunch.lib.PTables: Extract the values from the given PTable<K, V> as a PCollection<V>.
values() - Static method in enum org.apache.crunch.lib.Sort.Order: Returns an array containing the constants of this enum type, in the order they are declared.
values() - Static method in enum org.apache.crunch.PipelineCallable.Status: Returns an array containing the constants of this enum type, in the order they are declared.
values() - Static method in enum org.apache.crunch.PipelineExecution.Status: Returns an array containing the constants of this enum type, in the order they are declared.
values() - Method in interface org.apache.crunch.PTable: Returns a PCollection made up of the values in this PTable.
values() - Static method in enum org.apache.crunch.Target.WriteMode: Returns an array containing the constants of this enum type, in the order they are declared.
values() - Static method in enum org.apache.crunch.types.avro.AvroMode.ModeType: Returns an array containing the constants of this enum type, in the order they are declared.
values() - Static method in enum org.apache.crunch.types.avro.AvroType.AvroRecordType: Returns an array containing the constants of this enum type, in the order they are declared.
valueType() - Method in interface org.apache.crunch.lambda.LGroupedTable: Get a PType which can be used to serialize the value part of this grouped table
valueType() - Method in interface org.apache.crunch.lambda.LTable: Get a PType which can be used to serialize the value part of this table
visitDoCollection(BaseDoCollection<?>) - Method in interface org.apache.crunch.impl.dist.collect.PCollectionImpl.Visitor
visitDoTable(BaseDoTable<?, ?>) - Method in interface org.apache.crunch.impl.dist.collect.PCollectionImpl.Visitor
visitGroupedTable(BaseGroupedTable<?, ?>) - Method in interface org.apache.crunch.impl.dist.collect.PCollectionImpl.Visitor
visitInputCollection(BaseInputCollection<?>) - Method in interface org.apache.crunch.impl.dist.collect.PCollectionImpl.Visitor
visitUnionCollection(BaseUnionCollection<?>) - Method in interface org.apache.crunch.impl.dist.collect.PCollectionImpl.Visitor

W

waitFor(long, TimeUnit) - Method in class org.apache.crunch.impl.spark.SparkRuntime
waitFor(long, TimeUnit) - Method in interface org.apache.crunch.PipelineExecution: Blocks until the pipeline completes or the specified waiting time elapses.
waitUntilDone() - Method in class org.apache.crunch.impl.spark.SparkRuntime
waitUntilDone() - Method in interface org.apache.crunch.PipelineExecution: Blocks until pipeline completes.
wasLogged() - Method in exception org.apache.crunch.CrunchRuntimeException: Returns true if this exception was written to the debug logs.
weightedReservoirSample(PCollection<Pair<T, N>>, int) - Static method in class org.apache.crunch.lib.Sample: Selects a weighted sample of the elements of the given PCollection, where the second term in the input Pair is a numerical weight.
weightedReservoirSample(PCollection<Pair<T, N>>, int, Long) - Static method in class org.apache.crunch.lib.Sample: The weighted reservoir sampling function with the seed term exposed for testing purposes.
withFactory(ReaderWriterFactory) - Method in class org.apache.crunch.types.avro.AvroMode: Creates a new AvroMode instance which will utilize the factory instance for creating Avro readers and writers.
withFactoryFromConfiguration(Configuration) - Method in class org.apache.crunch.types.avro.AvroMode
WordAggregationHBase - Class in org.apache.crunch.examples: You need to have an HBase instance running.
WordAggregationHBase() - Constructor for class org.apache.crunch.examples.WordAggregationHBase
WordCount - Class in org.apache.crunch.examples
WordCount() - Constructor for class org.apache.crunch.examples.WordCount
wrap(Function<T, R>) - Static method in class org.apache.crunch.fn.SFunctions
wrap(Function2<K, V, R>) - Static method in class org.apache.crunch.fn.SFunctions
wrap(PairFunction<T, K, V>) - Static method in class org.apache.crunch.fn.SFunctions
wrap(FlatMapFunction<T, R>) - Static method in class org.apache.crunch.fn.SFunctions
wrap(FlatMapFunction2<K, V, R>) - Static method in class org.apache.crunch.fn.SFunctions
wrap(DoubleFunction<T>) - Static method in class org.apache.crunch.fn.SFunctions
wrap(DoubleFlatMapFunction<T>) - Static method in class org.apache.crunch.fn.SFunctions
wrap(PCollection<S>) - Static method in class org.apache.crunch.lambda.Lambda
wrap(PTable<K, V>) - Static method in class org.apache.crunch.lambda.Lambda
wrap(PGroupedTable<K, V>) - Static method in class org.apache.crunch.lambda.Lambda
wrap(PCollection<S>) - Method in interface org.apache.crunch.lambda.LCollectionFactory: Wrap a PCollection into an LCollection
wrap(PTable<K, V>) - Method in interface org.apache.crunch.lambda.LCollectionFactory: Wrap a PTable into an LTable
wrap(PGroupedTable<K, V>) - Method in interface org.apache.crunch.lambda.LCollectionFactory: Wrap a PGroupedTable into an LGroupedTable
WritableDeepCopier<T extends org.apache.hadoop.io.Writable> - Class in org.apache.crunch.types.writable: Performs deep copies of Writable values.
WritableDeepCopier(Class<T>) - Constructor for class org.apache.crunch.types.writable.WritableDeepCopier
WRITABLES - Static variable in class org.apache.crunch.impl.spark.ByteArrayHelper
writables(Class<T>) - Static method in class org.apache.crunch.types.avro.Avros
Writables - Class in org.apache.crunch.types.writable: Defines static methods that are analogous to the methods defined in WritableTypeFamily for convenient static importing.
writables(Class<W>) - Static method in class org.apache.crunch.types.writable.Writables
writables(Class<W>) - Method in class org.apache.crunch.types.writable.WritableTypeFamily
WritableSerDe - Class in org.apache.crunch.impl.spark.serde
WritableSerDe(Class<? extends Writable>) - Constructor for class org.apache.crunch.impl.spark.serde.WritableSerDe
WritableType<T,W extends org.apache.hadoop.io.Writable> - Class in org.apache.crunch.types.writable
WritableType(Class<T>, Class<W>, MapFn<W, T>, MapFn<T, W>, PType...) - Constructor for class org.apache.crunch.types.writable.WritableType
WritableTypeFamily - Class in org.apache.crunch.types.writable: The Writable-based implementation of the PTypeFamily interface.
write(DataOutput) - Method in class org.apache.crunch.contrib.io.jdbc.IdentifiableName
write(PreparedStatement) - Method in class org.apache.crunch.contrib.io.jdbc.IdentifiableName
write(Target) - Method in class org.apache.crunch.impl.dist.collect.PCollectionImpl
write(Target, Target.WriteMode) - Method in class org.apache.crunch.impl.dist.collect.PCollectionImpl
write(Target) - Method in class org.apache.crunch.impl.dist.collect.PTableBase
write(Target, Target.WriteMode) - Method in class org.apache.crunch.impl.dist.collect.PTableBase
write(PCollection<?>, Target) - Method in class org.apache.crunch.impl.dist.DistributedPipeline
write(PCollection<?>, Target, Target.WriteMode) - Method in class org.apache.crunch.impl.dist.DistributedPipeline
write(PCollection<?>, Target) - Method in class org.apache.crunch.impl.mem.MemPipeline
write(PCollection<?>, Target, Target.WriteMode) - Method in class org.apache.crunch.impl.mem.MemPipeline
write(String, K, V) - Method in class org.apache.crunch.io.CrunchOutputs
write(DataOutput) - Method in class org.apache.crunch.io.FormatBundle
write(DataOutput) - Method in class org.apache.crunch.kafka.inputformat.KafkaInputSplit
write(Map<TopicPartition, Long>) - Method in class org.apache.crunch.kafka.offset.AbstractOffsetWriter
write(long, Map<TopicPartition, Long>) - Method in class org.apache.crunch.kafka.offset.hdfs.HDFSOffsetWriter
write(Map<TopicPartition, Long>) - Method in interface org.apache.crunch.kafka.offset.OffsetWriter: Persists the offsets to a configured location, using the current time as the as-of time.
write(long, Map<TopicPartition, Long>) - Method in interface org.apache.crunch.kafka.offset.OffsetWriter: Persists the offsets to a configured location with metadata of asOfTime indicating the time in milliseconds when the offsets were meaningful.
write(Target) - Method in interface org.apache.crunch.lambda.LCollection: Write this collection to the specified Target
write(Target, Target.WriteMode) - Method in interface org.apache.crunch.lambda.LCollection: Write this collection to the specified Target with the given Target.WriteMode
write(Target) - Method in interface org.apache.crunch.lambda.LTable: Write this table to the Target supplied.
write(Target, Target.WriteMode) - Method in interface org.apache.crunch.lambda.LTable: Write this table to the Target supplied.
write(Target) - Method in interface org.apache.crunch.PCollection: Write the contents of this PCollection to the given Target, using the storage format specified by the target.
write(Target, Target.WriteMode) - Method in interface org.apache.crunch.PCollection: Write the contents of this PCollection to the given Target, using the given Target.WriteMode to handle existing targets.
write(PCollection<?>, Target) - Method in interface org.apache.crunch.Pipeline: Write the given collection to the given target on the next pipeline run.
write(PCollection<?>, Target, Target.WriteMode) - Method in interface org.apache.crunch.Pipeline: Write the contents of the PCollection to the given Target, using the storage format specified by the target and the given WriteMode for cases where the referenced Target already exists.
write(Target) - Method in interface org.apache.crunch.PTable: Writes this PTable to the given Target.
write(Target, Target.WriteMode) - Method in interface org.apache.crunch.PTable: Writes this PTable to the given Target, using the given Target.WriteMode to handle existing targets.
write(DataOutput) - Method in class org.apache.crunch.types.writable.TupleWritable: Writes each Writable to out.
write(DataOutput) - Method in class org.apache.crunch.types.writable.UnionWritable
write(PCollection<?>, Target) - Method in class org.apache.crunch.util.CrunchTool
write(Configuration, Path, Object) - Static method in class org.apache.crunch.util.DistCache
writeConnectionPropertiesToBundle(Properties, FormatBundle) - Static method in class org.apache.crunch.kafka.inputformat.KafkaInputFormat: Writes the Kafka connection properties to the bundle.
writeOffsetsToBundle(Map<TopicPartition, Pair<Long, Long>>, FormatBundle) - Static method in class org.apache.crunch.kafka.inputformat.KafkaInputFormat: Writes the start and end offsets for the provided topic partitions to the bundle.
writeOffsetsToConfiguration(Map<TopicPartition, Pair<Long, Long>>, Configuration) - Static method in class org.apache.crunch.kafka.inputformat.KafkaInputFormat: Writes the start and end offsets for the provided topic partitions to the config.
writeTextFile(PCollection<T>, String) - Method in class org.apache.crunch.impl.dist.DistributedPipeline
writeTextFile(PCollection<T>, String) - Method in class org.apache.crunch.impl.mem.MemPipeline
writeTextFile(PCollection<T>, String) - Method in interface org.apache.crunch.Pipeline: A convenience method for writing a text file.
writeTextFile(PCollection<?>, String) - Method in class org.apache.crunch.util.CrunchTool

X

xboolean() - Static method in class org.apache.crunch.contrib.text.Extractors: Returns an Extractor for booleans.
xboolean(Boolean) - Static method in class org.apache.crunch.contrib.text.Extractors
xcollect(TokenizerFactory, Extractor<T>) - Static method in class org.apache.crunch.contrib.text.Extractors
xcustom(Class<T>, TokenizerFactory, Extractor...) - Static method in class org.apache.crunch.contrib.text.Extractors: Returns an Extractor for a subclass of Tuple with a constructor that has the given extractor types that uses the given TokenizerFactory for parsing the sub-fields.
xdouble() - Static method in class org.apache.crunch.contrib.text.Extractors: Returns an Extractor for doubles.
xdouble(Double) - Static method in class org.apache.crunch.contrib.text.Extractors
xfloat() - Static method in class org.apache.crunch.contrib.text.Extractors: Returns an Extractor for floats.
xfloat(Float) - Static method in class org.apache.crunch.contrib.text.Extractors
xint() - Static method in class org.apache.crunch.contrib.text.Extractors: Returns an Extractor for integers.
xint(Integer) - Static method in class org.apache.crunch.contrib.text.Extractors: Returns an Extractor for integers.
xlong() - Static method in class org.apache.crunch.contrib.text.Extractors: Returns an Extractor for longs.
xlong(Long) - Static method in class org.apache.crunch.contrib.text.Extractors: Returns an Extractor for longs.
xpair(TokenizerFactory, Extractor<K>, Extractor<V>) - Static method in class org.apache.crunch.contrib.text.Extractors: Returns an Extractor for pairs of the given types that uses the given TokenizerFactory for parsing the sub-fields.
xquad(TokenizerFactory, Extractor<A>, Extractor<B>, Extractor<C>, Extractor<D>) - Static method in class org.apache.crunch.contrib.text.Extractors: Returns an Extractor for quads of the given types that uses the given TokenizerFactory for parsing the sub-fields.
xstring() - Static method in class org.apache.crunch.contrib.text.Extractors: Returns an Extractor for strings.
xstring(String) - Static method in class org.apache.crunch.contrib.text.Extractors
xtriple(TokenizerFactory, Extractor<A>, Extractor<B>, Extractor<C>) - Static method in class org.apache.crunch.contrib.text.Extractors: Returns an Extractor for triples of the given types that uses the given TokenizerFactory for parsing the sub-fields.
xtupleN(TokenizerFactory, Extractor...) - Static method in class org.apache.crunch.contrib.text.Extractors: Returns an Extractor for an arbitrary number of types that uses the given TokenizerFactory for parsing the sub-fields.

Z

zero(Map<String, Map<String, Long>>) - Method in class org.apache.crunch.impl.spark.CounterAccumulatorParam
