This project has retired. For details please refer to its Attic page.
Index (Apache Crunch 0.9.0 API)

A

AbstractCompositeExtractor<T> - Class in org.apache.crunch.contrib.text
Base class for Extractor instances that delegates the parsing of fields to other Extractor instances, primarily used for constructing composite records that implement the Tuple interface.
AbstractCompositeExtractor(TokenizerFactory, List<Extractor<?>>) - Constructor for class org.apache.crunch.contrib.text.AbstractCompositeExtractor
 
AbstractSimpleExtractor<T> - Class in org.apache.crunch.contrib.text
Base class for the common case Extractor instances that construct a single object from a block of text stored in a String, with support for error handling and reporting.
AbstractSimpleExtractor(T) - Constructor for class org.apache.crunch.contrib.text.AbstractSimpleExtractor
 
AbstractSimpleExtractor(T, TokenizerFactory) - Constructor for class org.apache.crunch.contrib.text.AbstractSimpleExtractor
 
accept(T) - Method in class org.apache.crunch.FilterFn
If true, emit the given record.
accept(PCollectionImpl.Visitor) - Method in class org.apache.crunch.impl.dist.collect.PCollectionImpl
 
accept(PCollectionImpl.Visitor) - Method in class org.apache.crunch.impl.mr.collect.PGroupedTableImpl
 
accept(OutputHandler, PType<?>) - Method in class org.apache.crunch.io.avro.AvroFileTarget
 
accept(OutputHandler, PType<?>) - Method in class org.apache.crunch.io.avro.AvroPathPerKeyTarget
 
accept(OutputHandler, PType<?>) - Method in class org.apache.crunch.io.avro.trevni.TrevniKeyTarget
 
accept(OutputHandler, PType<?>) - Method in class org.apache.crunch.io.hbase.HBaseTarget
 
accept(OutputHandler, PType<?>) - Method in class org.apache.crunch.io.impl.FileTargetImpl
 
accept(OutputHandler, PType<?>) - Method in class org.apache.crunch.io.parquet.AvroParquetFileTarget
 
accept(OutputHandler, PType<?>) - Method in interface org.apache.crunch.Target
Checks to see if this Target instance is compatible with the given PType.
ACCEPT_ALL() - Static method in class org.apache.crunch.fn.FilterFns
Accept everything.
acceptInternal(PCollectionImpl.Visitor) - Method in class org.apache.crunch.impl.dist.collect.BaseDoCollection
 
acceptInternal(PCollectionImpl.Visitor) - Method in class org.apache.crunch.impl.dist.collect.BaseDoTable
 
acceptInternal(PCollectionImpl.Visitor) - Method in class org.apache.crunch.impl.dist.collect.BaseGroupedTable
 
acceptInternal(PCollectionImpl.Visitor) - Method in class org.apache.crunch.impl.dist.collect.BaseInputCollection
 
acceptInternal(PCollectionImpl.Visitor) - Method in class org.apache.crunch.impl.dist.collect.BaseInputTable
 
acceptInternal(PCollectionImpl.Visitor) - Method in class org.apache.crunch.impl.dist.collect.BaseUnionCollection
 
acceptInternal(PCollectionImpl.Visitor) - Method in class org.apache.crunch.impl.dist.collect.BaseUnionTable
 
acceptInternal(PCollectionImpl.Visitor) - Method in class org.apache.crunch.impl.dist.collect.PCollectionImpl
 
addAccumulator(Map<String, Long>, Map<String, Long>) - Method in class org.apache.crunch.impl.spark.CounterAccumulatorParam
 
addCacheFile(Path, Configuration) - Static method in class org.apache.crunch.util.DistCache
 
addChild(DoNode) - Method in class org.apache.crunch.impl.mr.plan.DoNode
 
addInPlace(Map<String, Long>, Map<String, Long>) - Method in class org.apache.crunch.impl.spark.CounterAccumulatorParam
 
addInputPath(Job, Path, FormatBundle, int) - Static method in class org.apache.crunch.io.CrunchInputs
 
addJarDirToDistributedCache(Configuration, File) - Static method in class org.apache.crunch.util.DistCache
Adds all jars under the specified directory to the distributed cache of jobs using the provided configuration.
addJarDirToDistributedCache(Configuration, String) - Static method in class org.apache.crunch.util.DistCache
Adds all jars under the directory at the specified path to the distributed cache of jobs using the provided configuration.
addJarToDistributedCache(Configuration, File) - Static method in class org.apache.crunch.util.DistCache
Adds the specified jar to the distributed cache of jobs using the provided configuration.
addJarToDistributedCache(Configuration, String) - Static method in class org.apache.crunch.util.DistCache
Adds the jar at the specified path to the distributed cache of jobs using the provided configuration.
addJob(CrunchControlledJob) - Method in class org.apache.crunch.impl.mr.exec.MRExecutor
 
addJobPrototype(JobPrototype) - Method in class org.apache.crunch.impl.mr.plan.DotfileWriter
Add the contents of a JobPrototype to the graph describing a pipeline.
addNamedOutput(Job, String, Class<? extends OutputFormat>, Class, Class) - Static method in class org.apache.crunch.io.CrunchOutputs
 
addNamedOutput(Job, String, FormatBundle<? extends OutputFormat>, Class, Class) - Static method in class org.apache.crunch.io.CrunchOutputs
 
Aggregate - Class in org.apache.crunch.lib
Methods for performing various types of aggregations over PCollection instances.
Aggregate() - Constructor for class org.apache.crunch.lib.Aggregate
 
Aggregate.PairValueComparator<K,V> - Class in org.apache.crunch.lib
 
Aggregate.PairValueComparator(boolean) - Constructor for class org.apache.crunch.lib.Aggregate.PairValueComparator
 
Aggregate.TopKCombineFn<K,V> - Class in org.apache.crunch.lib
 
Aggregate.TopKCombineFn(int, boolean) - Constructor for class org.apache.crunch.lib.Aggregate.TopKCombineFn
 
Aggregate.TopKFn<K,V> - Class in org.apache.crunch.lib
 
Aggregate.TopKFn(int, boolean) - Constructor for class org.apache.crunch.lib.Aggregate.TopKFn
 
Aggregator<T> - Interface in org.apache.crunch
Aggregate a sequence of values into a possibly smaller sequence of the same type.
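To make the Aggregator description concrete, here is a minimal sketch in plain Java. It does not use the Crunch classes: `SketchAggregator` and `SumLongs` are hypothetical stand-ins, and the reset/update/results shape is a simplification chosen for illustration. It reduces a sequence of values to a shorter sequence of the same type (here, a single sum).

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical, simplified stand-in for an aggregator; not the Crunch interface.
interface SketchAggregator<T> {
    void reset();            // clear state before a new group of values
    void update(T value);    // fold one value into the running state
    Iterable<T> results();   // emit the (possibly smaller) aggregated sequence
}

// Sums all values it sees and emits a single result.
class SumLongs implements SketchAggregator<Long> {
    private long sum;

    @Override
    public void reset() {
        sum = 0;
    }

    @Override
    public void update(Long value) {
        sum += value;
    }

    @Override
    public Iterable<Long> results() {
        return Collections.singletonList(sum);
    }

    // Runs the aggregator over a list and collects whatever it emits.
    static List<Long> aggregate(List<Long> values) {
        SumLongs agg = new SumLongs();
        agg.reset();
        for (Long v : values) {
            agg.update(v);
        }
        List<Long> out = new ArrayList<>();
        for (Long r : agg.results()) {
            out.add(r);
        }
        return out;
    }
}
```

Three input values come out as one value of the same type, which is the "possibly smaller sequence" the description refers to.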
Aggregators - Class in org.apache.crunch.fn
A collection of pre-defined Aggregators.
Aggregators.SimpleAggregator<T> - Class in org.apache.crunch.fn
Base class for aggregators that do not require any initialization.
Aggregators.SimpleAggregator() - Constructor for class org.apache.crunch.fn.Aggregators.SimpleAggregator
 
and(FilterFn<S>, FilterFn<S>) - Static method in class org.apache.crunch.fn.FilterFns
Accept an entry if all of the given filters accept it, using short-circuit evaluation.
and(FilterFn<S>...) - Static method in class org.apache.crunch.fn.FilterFns
Accept an entry if all of the given filters accept it, using short-circuit evaluation.
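The short-circuit behaviour described for and(...) can be sketched in plain Java as follows. This illustration uses `java.util.function.Predicate` rather than the Crunch FilterFn class, and the `ShortCircuitAnd`, `logged`, and `demo` names are hypothetical. Once one filter rejects a value, the remaining filters are not consulted.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

// Stand-alone illustration; does not use the Crunch FilterFn or FilterFns classes.
class ShortCircuitAnd {
    // Records which filters were consulted, to make the short-circuiting visible.
    static final List<String> calls = new ArrayList<>();

    // Wraps a predicate so that each evaluation is logged under the given name.
    static <T> Predicate<T> logged(String name, Predicate<T> p) {
        return value -> {
            calls.add(name);
            return p.test(value);
        };
    }

    // Accepts a value only if every filter accepts it; stops at the first rejection.
    @SafeVarargs
    static <T> Predicate<T> and(Predicate<T>... filters) {
        return value -> {
            for (Predicate<T> f : filters) {
                if (!f.test(value)) {
                    return false; // later filters are never evaluated
                }
            }
            return true;
        };
    }

    // Evaluates "positive AND even" on the value, leaving the call log in `calls`.
    static boolean demo(int value) {
        calls.clear();
        Predicate<Integer> positive = logged("positive", v -> v > 0);
        Predicate<Integer> even = logged("even", v -> v % 2 == 0);
        return and(positive, even).test(value);
    }
}
```

Calling `ShortCircuitAnd.demo(-4)` consults only the first filter, while `ShortCircuitAnd.demo(4)` consults both.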
apply(Statement, Description) - Method in class org.apache.crunch.test.TemporaryPath
 
applyPTypeTransforms() - Method in class org.apache.crunch.io.hbase.HBaseValueConverter
 
applyPTypeTransforms() - Method in interface org.apache.crunch.types.Converter
If true, convert the inputs or outputs from this Converter instance before (for outputs) or after (for inputs) using the associated PType#getInputMapFn and PType#getOutputMapFn calls.
as(PType<T>) - Method in class org.apache.crunch.types.avro.AvroTypeFamily
 
as(PType<T>) - Method in interface org.apache.crunch.types.PTypeFamily
Returns the equivalent of the given ptype for this family, if it exists.
as(PType<T>) - Method in class org.apache.crunch.types.writable.WritableTypeFamily
 
asCollection - Variable in class org.apache.crunch.impl.dist.collect.BaseInputTable
 
asCollection() - Method in class org.apache.crunch.impl.dist.collect.PCollectionImpl
asCollection() - Method in class org.apache.crunch.impl.mem.collect.MemCollection
asCollection() - Method in interface org.apache.crunch.PCollection
 
asMap() - Method in class org.apache.crunch.impl.dist.collect.PTableBase
Returns a PObject encapsulating a Map made up of the keys and values in this PTable.
asMap() - Method in class org.apache.crunch.impl.mem.collect.MemTable
 
asMap() - Method in interface org.apache.crunch.PTable
Returns a PObject encapsulating a Map made up of the keys and values in this PTable.
asPTable(PCollection<Pair<K, V>>) - Static method in class org.apache.crunch.lib.PTables
Convert the given PCollection<Pair<K, V>> to a PTable<K, V>.
asReadable(boolean) - Method in class org.apache.crunch.impl.dist.collect.PCollectionImpl
 
asReadable(boolean) - Method in class org.apache.crunch.impl.mem.collect.MemCollection
 
asReadable() - Method in class org.apache.crunch.io.avro.AvroFileSource
 
asReadable() - Method in class org.apache.crunch.io.avro.trevni.TrevniKeySource
 
asReadable() - Method in class org.apache.crunch.io.hbase.HBaseSourceTarget
 
asReadable() - Method in class org.apache.crunch.io.hbase.HFileSource
 
asReadable() - Method in class org.apache.crunch.io.impl.ReadableSourcePathTargetImpl
 
asReadable() - Method in class org.apache.crunch.io.impl.ReadableSourceTargetImpl
 
asReadable() - Method in class org.apache.crunch.io.parquet.AvroParquetFileSource
 
asReadable() - Method in interface org.apache.crunch.io.ReadableSource
 
asReadable() - Method in class org.apache.crunch.io.seq.SeqFileSource
 
asReadable() - Method in class org.apache.crunch.io.seq.SeqFileTableSource
 
asReadable() - Method in class org.apache.crunch.io.text.NLineFileSource
 
asReadable() - Method in class org.apache.crunch.io.text.TextFileSource
 
asReadable() - Method in class org.apache.crunch.io.text.TextFileTableSource
 
asReadable(boolean) - Method in interface org.apache.crunch.PCollection
 
asSourceTarget(PType<T>) - Method in class org.apache.crunch.io.avro.AvroFileTarget
 
asSourceTarget(PType<T>) - Method in class org.apache.crunch.io.avro.trevni.TrevniKeyTarget
 
asSourceTarget(PType<T>) - Method in class org.apache.crunch.io.hbase.HBaseTarget
 
asSourceTarget(PType<T>) - Method in class org.apache.crunch.io.impl.FileTargetImpl
 
asSourceTarget(PType<T>) - Method in class org.apache.crunch.io.parquet.AvroParquetFileTarget
 
asSourceTarget(PType<T>) - Method in class org.apache.crunch.io.seq.SeqFileTarget
 
asSourceTarget(PType<T>) - Method in class org.apache.crunch.io.text.TextFileTarget
 
asSourceTarget(PType<T>) - Method in interface org.apache.crunch.Target
Attempt to create the SourceTarget type that corresponds to this Target for the given PType, if possible.
At - Class in org.apache.crunch.io
Static factory methods for creating common SourceTarget types, which may be treated as both a Source and a Target.
At() - Constructor for class org.apache.crunch.io.At
 
at - Static variable in class org.apache.crunch.util.CrunchTool
 
AtHBase - Class in org.apache.crunch.io.hbase
Static factory methods for creating HBase SourceTarget types.
AtHBase() - Constructor for class org.apache.crunch.io.hbase.AtHBase
 
AutoClosingIterator<T> - Class in org.apache.crunch.io.impl
Closes the wrapped Closeable when AutoClosingIterator.hasNext() returns false.
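The close-on-exhaustion idea can be sketched in plain Java as below. This is not the Crunch source: `ClosingIteratorSketch` and its `demo` method are hypothetical, and the choice to guard against a double close is an assumption of this sketch.

```java
import java.io.Closeable;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Arrays;
import java.util.Iterator;

// Stand-alone sketch of the idea; not the Crunch AutoClosingIterator source.
class ClosingIteratorSketch<T> implements Iterator<T> {
    private final Closeable resource;
    private final Iterator<T> delegate;
    private boolean closed = false;

    ClosingIteratorSketch(Closeable resource, Iterator<T> delegate) {
        this.resource = resource;
        this.delegate = delegate;
    }

    @Override
    public boolean hasNext() {
        if (delegate.hasNext()) {
            return true;
        }
        closeOnce(); // exhausted: release the underlying resource
        return false;
    }

    @Override
    public T next() {
        return delegate.next();
    }

    // Closes the resource at most once, rethrowing I/O failures unchecked.
    private void closeOnce() {
        if (closed) {
            return;
        }
        closed = true;
        try {
            resource.close();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Drains three elements and reports {sum of elements, number of close() calls}.
    static int[] demo() {
        final int[] closeCount = {0};
        Closeable resource = () -> closeCount[0]++;
        Iterator<Integer> it =
            new ClosingIteratorSketch<>(resource, Arrays.asList(1, 2, 3).iterator());
        int sum = 0;
        while (it.hasNext()) {
            sum += it.next();
        }
        it.hasNext(); // a repeated call must not close the resource twice
        return new int[] {sum, closeCount[0]};
    }
}
```

A caller that iterates to the end never has to close the resource explicitly. A caller that stops early still needs to close it by other means.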
AutoClosingIterator(Closeable, Iterator<T>) - Constructor for class org.apache.crunch.io.impl.AutoClosingIterator
 
AverageBytesByIP - Class in org.apache.crunch.examples
 
AverageBytesByIP() - Constructor for class org.apache.crunch.examples.AverageBytesByIP
 
AVRO_MODE_PROPERTY - Static variable in enum org.apache.crunch.types.avro.AvroMode
 
avroFile(String, Class<T>) - Static method in class org.apache.crunch.io.At
Creates a SourceTarget<T> instance from the Avro file(s) at the given path name.
avroFile(Path, Class<T>) - Static method in class org.apache.crunch.io.At
Creates a SourceTarget<T> instance from the Avro file(s) at the given Path.
avroFile(String, AvroType<T>) - Static method in class org.apache.crunch.io.At
Creates a SourceTarget<T> instance from the Avro file(s) at the given path name.
avroFile(Path, AvroType<T>) - Static method in class org.apache.crunch.io.At
Creates a SourceTarget<T> instance from the Avro file(s) at the given Path.
avroFile(String, Class<T>) - Static method in class org.apache.crunch.io.From
Creates a Source<T> instance from the Avro file(s) at the given path name.
avroFile(Path, Class<T>) - Static method in class org.apache.crunch.io.From
Creates a Source<T> instance from the Avro file(s) at the given Path.
avroFile(String, AvroType<T>) - Static method in class org.apache.crunch.io.From
Creates a Source<T> instance from the Avro file(s) at the given path name.
avroFile(Path, AvroType<T>) - Static method in class org.apache.crunch.io.From
Creates a Source<T> instance from the Avro file(s) at the given Path.
avroFile(String) - Static method in class org.apache.crunch.io.To
Creates a Target at the given path name that writes data to Avro files.
avroFile(Path) - Static method in class org.apache.crunch.io.To
Creates a Target at the given Path that writes data to Avro files.
AvroFileReaderFactory<T> - Class in org.apache.crunch.io.avro
 
AvroFileReaderFactory(AvroType<T>) - Constructor for class org.apache.crunch.io.avro.AvroFileReaderFactory
 
AvroFileReaderFactory(DatumReader<T>, AvroType<T>) - Constructor for class org.apache.crunch.io.avro.AvroFileReaderFactory
 
AvroFileReaderFactory(Schema) - Constructor for class org.apache.crunch.io.avro.AvroFileReaderFactory
 
AvroFileSource<T> - Class in org.apache.crunch.io.avro
 
AvroFileSource(Path, AvroType<T>) - Constructor for class org.apache.crunch.io.avro.AvroFileSource
 
AvroFileSource(Path, AvroType<T>, DatumReader<T>) - Constructor for class org.apache.crunch.io.avro.AvroFileSource
 
AvroFileSource(List<Path>, AvroType<T>) - Constructor for class org.apache.crunch.io.avro.AvroFileSource
 
AvroFileSource(List<Path>, AvroType<T>, DatumReader<T>) - Constructor for class org.apache.crunch.io.avro.AvroFileSource
 
AvroFileSourceTarget<T> - Class in org.apache.crunch.io.avro
 
AvroFileSourceTarget(Path, AvroType<T>) - Constructor for class org.apache.crunch.io.avro.AvroFileSourceTarget
 
AvroFileSourceTarget(Path, AvroType<T>, DatumReader<T>) - Constructor for class org.apache.crunch.io.avro.AvroFileSourceTarget
 
AvroFileSourceTarget(Path, AvroType<T>, FileNamingScheme) - Constructor for class org.apache.crunch.io.avro.AvroFileSourceTarget
 
AvroFileSourceTarget(Path, AvroType<T>, DatumReader<T>, FileNamingScheme) - Constructor for class org.apache.crunch.io.avro.AvroFileSourceTarget
 
AvroFileTarget - Class in org.apache.crunch.io.avro
 
AvroFileTarget(String) - Constructor for class org.apache.crunch.io.avro.AvroFileTarget
 
AvroFileTarget(Path) - Constructor for class org.apache.crunch.io.avro.AvroFileTarget
 
AvroFileTarget(Path, FileNamingScheme) - Constructor for class org.apache.crunch.io.avro.AvroFileTarget
 
AvroInputFormat<T> - Class in org.apache.crunch.types.avro
An InputFormat for Avro data files.
AvroInputFormat() - Constructor for class org.apache.crunch.types.avro.AvroInputFormat
 
AvroMode - Enum in org.apache.crunch.types.avro
 
AvroOutputFormat<T> - Class in org.apache.crunch.types.avro
An OutputFormat for Avro data files.
AvroOutputFormat() - Constructor for class org.apache.crunch.types.avro.AvroOutputFormat
 
AvroParquetFileSource<T extends org.apache.avro.generic.IndexedRecord> - Class in org.apache.crunch.io.parquet
 
AvroParquetFileSource(Path, AvroType<T>) - Constructor for class org.apache.crunch.io.parquet.AvroParquetFileSource
 
AvroParquetFileSource(Path, AvroType<T>, Schema) - Constructor for class org.apache.crunch.io.parquet.AvroParquetFileSource
 
AvroParquetFileSource(List<Path>, AvroType<T>) - Constructor for class org.apache.crunch.io.parquet.AvroParquetFileSource
 
AvroParquetFileSource(List<Path>, AvroType<T>, Schema) - Constructor for class org.apache.crunch.io.parquet.AvroParquetFileSource
 
AvroParquetFileSource(List<Path>, AvroType<T>, Class<? extends UnboundRecordFilter>) - Constructor for class org.apache.crunch.io.parquet.AvroParquetFileSource
 
AvroParquetFileSource(List<Path>, AvroType<T>, Schema, Class<? extends UnboundRecordFilter>) - Constructor for class org.apache.crunch.io.parquet.AvroParquetFileSource
 
AvroParquetFileSource.Builder<T extends org.apache.avro.generic.IndexedRecord> - Class in org.apache.crunch.io.parquet
Helper class for constructing an AvroParquetFileSource that only reads a subset of the fields defined in an Avro schema.
AvroParquetFileSourceTarget<T extends org.apache.avro.generic.IndexedRecord> - Class in org.apache.crunch.io.parquet
 
AvroParquetFileSourceTarget(Path, AvroType<T>) - Constructor for class org.apache.crunch.io.parquet.AvroParquetFileSourceTarget
 
AvroParquetFileSourceTarget(Path, AvroType<T>, FileNamingScheme) - Constructor for class org.apache.crunch.io.parquet.AvroParquetFileSourceTarget
 
AvroParquetFileTarget - Class in org.apache.crunch.io.parquet
 
AvroParquetFileTarget(String) - Constructor for class org.apache.crunch.io.parquet.AvroParquetFileTarget
 
AvroParquetFileTarget(Path) - Constructor for class org.apache.crunch.io.parquet.AvroParquetFileTarget
 
AvroParquetFileTarget(Path, FileNamingScheme) - Constructor for class org.apache.crunch.io.parquet.AvroParquetFileTarget
 
AvroParquetReadableData<T> - Class in org.apache.crunch.io.parquet
 
AvroParquetReadableData(List<Path>, AvroType<T>) - Constructor for class org.apache.crunch.io.parquet.AvroParquetReadableData
 
AvroPathPerKeyOutputFormat<T> - Class in org.apache.crunch.types.avro
A FileOutputFormat that takes in a Utf8 and an Avro record and writes the Avro records to a sub-directory of the output path whose name is equal to the string-form of the Utf8.
AvroPathPerKeyOutputFormat() - Constructor for class org.apache.crunch.types.avro.AvroPathPerKeyOutputFormat
 
AvroPathPerKeyTarget - Class in org.apache.crunch.io.avro
A Target that wraps AvroPathPerKeyOutputFormat to allow one file per key to be written as the output of a PTable<String, T>.
AvroPathPerKeyTarget(String) - Constructor for class org.apache.crunch.io.avro.AvroPathPerKeyTarget
 
AvroPathPerKeyTarget(Path) - Constructor for class org.apache.crunch.io.avro.AvroPathPerKeyTarget
 
AvroPathPerKeyTarget(Path, FileNamingScheme) - Constructor for class org.apache.crunch.io.avro.AvroPathPerKeyTarget
 
AvroReadableData<T> - Class in org.apache.crunch.io.avro
 
AvroReadableData(List<Path>, AvroType<T>) - Constructor for class org.apache.crunch.io.avro.AvroReadableData
 
Avros - Class in org.apache.crunch.types.avro
Defines static methods that are analogous to the methods defined in AvroTypeFamily for convenient static importing.
AvroSerDe<T> - Class in org.apache.crunch.impl.spark.serde
 
AvroSerDe(AvroType<T>) - Constructor for class org.apache.crunch.impl.spark.serde.AvroSerDe
 
AvroTextOutputFormat<K,V> - Class in org.apache.crunch.types.avro
 
AvroTextOutputFormat() - Constructor for class org.apache.crunch.types.avro.AvroTextOutputFormat
 
AvroType<T> - Class in org.apache.crunch.types.avro
The implementation of the PType interface for Avro-based serialization.
AvroType(Class<T>, Schema, DeepCopier<T>, PType...) - Constructor for class org.apache.crunch.types.avro.AvroType
 
AvroType(Class<T>, Schema, MapFn, MapFn, DeepCopier<T>, PType...) - Constructor for class org.apache.crunch.types.avro.AvroType
 
AvroTypeFamily - Class in org.apache.crunch.types.avro
 
AvroUtf8InputFormat - Class in org.apache.crunch.types.avro
An InputFormat for text files.
AvroUtf8InputFormat() - Constructor for class org.apache.crunch.types.avro.AvroUtf8InputFormat
 

B

BaseDoCollection<S> - Class in org.apache.crunch.impl.dist.collect
 
BaseDoCollection(String, PCollectionImpl<T>, DoFn<T, S>, PType<S>, ParallelDoOptions) - Constructor for class org.apache.crunch.impl.dist.collect.BaseDoCollection
 
BaseDoTable<K,V> - Class in org.apache.crunch.impl.dist.collect
 
BaseDoTable(String, PCollectionImpl<S>, DoFn<S, Pair<K, V>>, PTableType<K, V>, ParallelDoOptions) - Constructor for class org.apache.crunch.impl.dist.collect.BaseDoTable
 
BaseDoTable(String, PCollectionImpl<S>, CombineFn<K, V>, DoFn<S, Pair<K, V>>, PTableType<K, V>) - Constructor for class org.apache.crunch.impl.dist.collect.BaseDoTable
 
BaseDoTable(String, PCollectionImpl<S>, CombineFn<K, V>, DoFn<S, Pair<K, V>>, PTableType<K, V>, ParallelDoOptions) - Constructor for class org.apache.crunch.impl.dist.collect.BaseDoTable
 
BaseGroupedTable<K,V> - Class in org.apache.crunch.impl.dist.collect
 
BaseGroupedTable(PTableBase<K, V>) - Constructor for class org.apache.crunch.impl.dist.collect.BaseGroupedTable
 
BaseGroupedTable(PTableBase<K, V>, GroupingOptions) - Constructor for class org.apache.crunch.impl.dist.collect.BaseGroupedTable
 
BaseInputCollection<S> - Class in org.apache.crunch.impl.dist.collect
 
BaseInputCollection(Source<S>, DistributedPipeline) - Constructor for class org.apache.crunch.impl.dist.collect.BaseInputCollection
 
BaseInputTable<K,V> - Class in org.apache.crunch.impl.dist.collect
 
BaseInputTable(TableSource<K, V>, DistributedPipeline) - Constructor for class org.apache.crunch.impl.dist.collect.BaseInputTable
 
BaseUnionCollection<S> - Class in org.apache.crunch.impl.dist.collect
 
BaseUnionCollection(List<? extends PCollectionImpl<S>>) - Constructor for class org.apache.crunch.impl.dist.collect.BaseUnionCollection
 
BaseUnionTable<K,V> - Class in org.apache.crunch.impl.dist.collect
 
BaseUnionTable(List<PTableBase<K, V>>) - Constructor for class org.apache.crunch.impl.dist.collect.BaseUnionTable
 
bigInt(PTypeFamily) - Static method in class org.apache.crunch.types.PTypes
 
BIGINT_TO_BYTE - Static variable in class org.apache.crunch.types.PTypes
 
BloomFilterFactory - Class in org.apache.crunch.contrib.bloomfilter
Factory class for creating BloomFilters.
BloomFilterFactory() - Constructor for class org.apache.crunch.contrib.bloomfilter.BloomFilterFactory
 
BloomFilterFn<S> - Class in org.apache.crunch.contrib.bloomfilter
The class is responsible for generating keys that are used in a BloomFilter.
BloomFilterFn() - Constructor for class org.apache.crunch.contrib.bloomfilter.BloomFilterFn
 
BloomFilterJoinStrategy<K,U,V> - Class in org.apache.crunch.lib.join
Join strategy that uses a Bloom filter that is trained on the keys of the left-side table to filter the key/value pairs of the right-side table before sending through the shuffle and reduce phase.
BloomFilterJoinStrategy(int) - Constructor for class org.apache.crunch.lib.join.BloomFilterJoinStrategy
Instantiate with the expected number of unique keys in the left table.
BloomFilterJoinStrategy(int, float) - Constructor for class org.apache.crunch.lib.join.BloomFilterJoinStrategy
Instantiate with the expected number of unique keys in the left table, and the acceptable false positive rate for the Bloom filter.
BloomFilterJoinStrategy(int, float, JoinStrategy<K, U, V>) - Constructor for class org.apache.crunch.lib.join.BloomFilterJoinStrategy
Instantiate with the expected number of unique keys in the left table, the acceptable false positive rate for the Bloom filter, and an underlying join strategy to delegate to.
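The pre-filtering step behind a Bloom filter join can be sketched in plain Java as follows. This is an illustration only, not Crunch's implementation: the class name, the bit-array size of 1024, the three hash probes, and the hashing scheme are all assumptions. The filter is trained on the left-side keys, and right-side keys that cannot match are dropped before any join work is done.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

// Stand-alone illustration of Bloom-filter pre-filtering; not Crunch's implementation.
class BloomPrefilterSketch {
    private final BitSet bits;
    private final int size;
    private final int numHashes;

    BloomPrefilterSketch(int size, int numHashes) {
        this.bits = new BitSet(size);
        this.size = size;
        this.numHashes = numHashes;
    }

    // Derives the i-th bit position from the key's hash code (double hashing).
    private int position(String key, int i) {
        int h1 = key.hashCode();
        int h2 = (h1 >>> 16) | 1; // odd step so successive probes differ
        return Math.floorMod(h1 + i * h2, size);
    }

    void add(String key) {
        for (int i = 0; i < numHashes; i++) {
            bits.set(position(key, i));
        }
    }

    // May return true for a key never added (false positive),
    // but never returns false for a key that was added.
    boolean mightContain(String key) {
        for (int i = 0; i < numHashes; i++) {
            if (!bits.get(position(key, i))) {
                return false;
            }
        }
        return true;
    }

    // Trains the filter on left-side keys, then keeps only right-side keys that might match.
    static List<String> prefilter(List<String> leftKeys, List<String> rightKeys) {
        BloomPrefilterSketch filter = new BloomPrefilterSketch(1024, 3);
        for (String k : leftKeys) {
            filter.add(k);
        }
        List<String> kept = new ArrayList<>();
        for (String k : rightKeys) {
            if (filter.mightContain(k)) {
                kept.add(k);
            }
        }
        return kept;
    }
}
```

Because false positives are possible, the surviving right-side records still have to go through a real join. The filter only reduces how much data reaches it.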
booleans() - Static method in class org.apache.crunch.types.avro.Avros
 
booleans() - Method in class org.apache.crunch.types.avro.AvroTypeFamily
 
booleans() - Method in interface org.apache.crunch.types.PTypeFamily
 
booleans() - Static method in class org.apache.crunch.types.writable.Writables
 
booleans() - Method in class org.apache.crunch.types.writable.WritableTypeFamily
 
bottom(int) - Method in class org.apache.crunch.impl.dist.collect.PTableBase
 
bottom(int) - Method in class org.apache.crunch.impl.mem.collect.MemTable
 
bottom(int) - Method in interface org.apache.crunch.PTable
Returns a PTable made up of the pairs in this PTable with the smallest value field.
build() - Method in class org.apache.crunch.CachingOptions.Builder
 
build() - Method in class org.apache.crunch.contrib.text.TokenizerFactory.Builder
Returns a new TokenizerFactory with settings determined by this Builder instance.
build() - Method in class org.apache.crunch.GroupingOptions.Builder
 
build(Path) - Method in class org.apache.crunch.io.parquet.AvroParquetFileSource.Builder
 
build(List<Path>) - Method in class org.apache.crunch.io.parquet.AvroParquetFileSource.Builder
 
build() - Method in class org.apache.crunch.ParallelDoOptions.Builder
 
buildDotfile() - Method in class org.apache.crunch.impl.mr.plan.DotfileWriter
Build up the full dot file containing the description of a MapReduce pipeline.
builder() - Static method in class org.apache.crunch.CachingOptions
Creates a new CachingOptions.Builder instance to use for specifying the caching options for a particular PCollection<T>.
builder() - Static method in class org.apache.crunch.contrib.text.TokenizerFactory
Factory method for creating a TokenizerFactory.Builder instance.
builder() - Static method in class org.apache.crunch.GroupingOptions
 
builder(Class<T>) - Static method in class org.apache.crunch.io.parquet.AvroParquetFileSource
 
builder(Schema) - Static method in class org.apache.crunch.io.parquet.AvroParquetFileSource
 
builder() - Static method in class org.apache.crunch.ParallelDoOptions
 
by(MapFn<S, K>, PType<K>) - Method in class org.apache.crunch.impl.dist.collect.PCollectionImpl
 
by(String, MapFn<S, K>, PType<K>) - Method in class org.apache.crunch.impl.dist.collect.PCollectionImpl
 
by(MapFn<S, K>, PType<K>) - Method in class org.apache.crunch.impl.mem.collect.MemCollection
 
by(String, MapFn<S, K>, PType<K>) - Method in class org.apache.crunch.impl.mem.collect.MemCollection
 
by(int, Sort.Order) - Static method in class org.apache.crunch.lib.Sort.ColumnOrder
 
by(MapFn<S, K>, PType<K>) - Method in interface org.apache.crunch.PCollection
Apply the given map function to each element of this instance in order to create a PTable.
by(String, MapFn<S, K>, PType<K>) - Method in interface org.apache.crunch.PCollection
Apply the given map function to each element of this instance in order to create a PTable.
BYTE_TO_BIGINT - Static variable in class org.apache.crunch.types.PTypes
 
ByteArray - Class in org.apache.crunch.impl.spark
 
ByteArray(byte[]) - Constructor for class org.apache.crunch.impl.spark.ByteArray
 
bytes() - Static method in class org.apache.crunch.types.avro.Avros
 
bytes() - Method in class org.apache.crunch.types.avro.AvroTypeFamily
 
bytes() - Method in interface org.apache.crunch.types.PTypeFamily
 
bytes() - Static method in class org.apache.crunch.types.writable.Writables
 
bytes() - Method in class org.apache.crunch.types.writable.WritableTypeFamily
 
BYTES_IN - Static variable in class org.apache.crunch.types.avro.Avros
 
BYTES_PER_REDUCE_TASK - Static variable in class org.apache.crunch.util.PartitionUtils
 
bytesToKeyValue(BytesWritable) - Static method in class org.apache.crunch.io.hbase.HBaseTypes
 
bytesToKeyValue(byte[], int, int) - Static method in class org.apache.crunch.io.hbase.HBaseTypes
 

C

cache() - Method in class org.apache.crunch.impl.dist.collect.PCollectionImpl
 
cache(CachingOptions) - Method in class org.apache.crunch.impl.dist.collect.PCollectionImpl
 
cache() - Method in class org.apache.crunch.impl.dist.collect.PTableBase
 
cache(CachingOptions) - Method in class org.apache.crunch.impl.dist.collect.PTableBase
 
cache() - Method in class org.apache.crunch.impl.mem.collect.MemCollection
 
cache(CachingOptions) - Method in class org.apache.crunch.impl.mem.collect.MemCollection
 
cache() - Method in class org.apache.crunch.impl.mem.collect.MemTable
 
cache(CachingOptions) - Method in class org.apache.crunch.impl.mem.collect.MemTable
 
cache(PCollection<T>, CachingOptions) - Method in class org.apache.crunch.impl.mem.MemPipeline
 
cache(PCollection<T>, CachingOptions) - Method in class org.apache.crunch.impl.mr.MRPipeline
 
cache(PCollection<T>, CachingOptions) - Method in class org.apache.crunch.impl.spark.SparkPipeline
 
cache() - Method in interface org.apache.crunch.PCollection
Marks this data as cached using the default CachingOptions.
cache(CachingOptions) - Method in interface org.apache.crunch.PCollection
Marks this data as cached using the given CachingOptions.
cache(PCollection<T>, CachingOptions) - Method in interface org.apache.crunch.Pipeline
Caches the given PCollection so that it will be processed at most once during pipeline execution.
cache() - Method in interface org.apache.crunch.PTable
 
cache(CachingOptions) - Method in interface org.apache.crunch.PTable
 
CachingOptions - Class in org.apache.crunch
Options for controlling how a PCollection<T> is cached for subsequent processing.
CachingOptions.Builder - Class in org.apache.crunch
A Builder class to use for setting the CachingOptions for a PCollection.
CachingOptions.Builder() - Constructor for class org.apache.crunch.CachingOptions.Builder
 
call(Tuple2<IntByteArray, List<byte[]>>) - Method in class org.apache.crunch.impl.spark.collect.ToByteArrayFunction
 
call(Iterator<Tuple2<K, V>>) - Method in class org.apache.crunch.impl.spark.fn.CombineMapsideFunction
 
call(Iterator<S>) - Method in class org.apache.crunch.impl.spark.fn.FlatMapDoFn
 
call(Iterator<Tuple2<K, V>>) - Method in class org.apache.crunch.impl.spark.fn.FlatMapPairDoFn
 
call(Tuple2<K, V>) - Method in class org.apache.crunch.impl.spark.fn.InputConverterFunction
 
call(Object) - Method in class org.apache.crunch.impl.spark.fn.MapFunction
 
call(Pair<K, V>) - Method in class org.apache.crunch.impl.spark.fn.MapOutputFunction
 
call(S) - Method in class org.apache.crunch.impl.spark.fn.OutputConverterFunction
 
call(Iterator<T>) - Method in class org.apache.crunch.impl.spark.fn.PairFlatMapDoFn
 
call(Iterator<Tuple2<K, V>>) - Method in class org.apache.crunch.impl.spark.fn.PairFlatMapPairDoFn
 
call(Tuple2<K, V>) - Method in class org.apache.crunch.impl.spark.fn.PairMapFunction
 
call(Pair<K, List<V>>) - Method in class org.apache.crunch.impl.spark.fn.PairMapIterableFunction
 
call(Pair<K, V>) - Method in class org.apache.crunch.impl.spark.fn.PartitionedMapOutputFunction
 
call(Iterator<Tuple2<ByteArray, List<byte[]>>>) - Method in class org.apache.crunch.impl.spark.fn.ReduceGroupingFunction
 
call(Tuple2<ByteArray, List<byte[]>>) - Method in class org.apache.crunch.impl.spark.fn.ReduceInputFunction
 
CAN_COMBINE_SPECIFIC_AND_REFLECT_SCHEMAS - Static variable in class org.apache.crunch.types.avro.Avros
Older versions of Avro (i.e., before 1.7.0) do not support schemas that are composed of a mix of specific and reflection-based schemas.
CappedExponentialCounter - Class in org.apache.crunch.impl.mr.exec
Generates an exponentially growing series of numbers, capped at a maximum value.
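A capped exponential series can be sketched in plain Java as below. The doubling factor, the `next()` method name, and the `CappedDoublingCounter` class itself are assumptions made for illustration. Only the (long, long) constructor shape is taken from this index, and the actual class may behave differently.

```java
// Stand-alone sketch; the doubling factor and method names are assumptions,
// not taken from the Crunch source.
class CappedDoublingCounter {
    private long current;
    private final long limit;

    CappedDoublingCounter(long start, long limit) {
        this.current = Math.min(start, limit);
        this.limit = limit;
    }

    // Returns the current value, then doubles it for the next call,
    // never exceeding the limit (and avoiding overflow when doubling).
    long next() {
        long result = current;
        current = (current > limit / 2) ? limit : current * 2;
        return result;
    }

    // Collects the first n values of the series.
    static long[] firstValues(long start, long limit, int n) {
        CappedDoublingCounter counter = new CappedDoublingCounter(start, limit);
        long[] values = new long[n];
        for (int i = 0; i < n; i++) {
            values[i] = counter.next();
        }
        return values;
    }
}
```

With a start of 500 and a limit of 4000, the first five values are 500, 1000, 2000, 4000, 4000. This is the typical shape of a capped back-off interval.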
CappedExponentialCounter(long, long) - Constructor for class org.apache.crunch.impl.mr.exec.CappedExponentialCounter
 
Cartesian - Class in org.apache.crunch.lib
Utilities for Cartesian products of two PTable or PCollection instances.
Cartesian() - Constructor for class org.apache.crunch.lib.Cartesian
 
Channels - Class in org.apache.crunch.lib
Utilities for splitting Pair instances emitted by DoFn into separate PCollection instances.
Channels() - Constructor for class org.apache.crunch.lib.Channels
 
checkCombiningSpecificAndReflectionSchemas() - Static method in class org.apache.crunch.types.avro.Avros
 
cleanup(Emitter<Pair<String, BloomFilter>>) - Method in class org.apache.crunch.contrib.bloomfilter.BloomFilterFn
 
cleanup(Emitter<T>) - Method in class org.apache.crunch.DoFn
Called during the cleanup of the MapReduce job this DoFn is associated with.
cleanup(Emitter<T>) - Method in class org.apache.crunch.FilterFn
 
cleanup() - Method in class org.apache.crunch.FilterFn
Called during the cleanup of the MapReduce job this FilterFn is associated with.
cleanup(Emitter<T>) - Method in class org.apache.crunch.fn.CompositeMapFn
 
cleanup(Emitter<Pair<S, T>>) - Method in class org.apache.crunch.fn.PairMapFn
 
cleanup(boolean) - Method in class org.apache.crunch.impl.dist.DistributedPipeline
 
cleanup(boolean) - Method in class org.apache.crunch.impl.mem.MemPipeline
 
cleanup(Mapper<Object, Object, Object, Object>.Context) - Method in class org.apache.crunch.impl.mr.run.CrunchMapper
 
cleanup(Reducer<Object, Object, Object, Object>.Context) - Method in class org.apache.crunch.impl.mr.run.CrunchReducer
 
cleanup() - Method in class org.apache.crunch.impl.mr.run.RTNode
 
cleanup(Emitter<Pair<Integer, Pair<K, V>>>) - Method in class org.apache.crunch.lib.Aggregate.TopKFn
 
cleanup(Emitter<Pair<K, Pair<U, V>>>) - Method in class org.apache.crunch.lib.join.FullOuterJoinFn
Called during the cleanup of the MapReduce job this DoFn is associated with.
cleanup(Emitter<Pair<K, Pair<U, V>>>) - Method in class org.apache.crunch.lib.join.LeftOuterJoinFn
Called during the cleanup of the MapReduce job this DoFn is associated with.
cleanup(boolean) - Method in interface org.apache.crunch.Pipeline
Cleans up any artifacts created as a result of running the pipeline.
clearCounters() - Static method in class org.apache.crunch.impl.mem.MemPipeline
 
clearCounters() - Static method in class org.apache.crunch.test.TestCounters
 
clearWritten(int) - Method in class org.apache.crunch.types.writable.TupleWritable
Record that the tuple does not contain an element at the position provided.
clearWritten() - Method in class org.apache.crunch.types.writable.TupleWritable
Clear any record of which writables have been written to, without releasing storage.
close(TaskAttemptContext) - Method in class org.apache.crunch.io.avro.trevni.TrevniRecordWriter
 
close() - Method in class org.apache.crunch.io.CrunchOutputs
 
close() - Method in class org.apache.crunch.io.impl.AutoClosingIterator
 
cogroup(PTable<K, U>) - Method in class org.apache.crunch.impl.dist.collect.PTableBase
 
cogroup(PTable<K, U>) - Method in class org.apache.crunch.impl.mem.collect.MemTable
 
Cogroup - Class in org.apache.crunch.lib
 
Cogroup() - Constructor for class org.apache.crunch.lib.Cogroup
 
cogroup(PTable<K, U>, PTable<K, V>) - Static method in class org.apache.crunch.lib.Cogroup
Co-groups the two PTable arguments.
cogroup(int, PTable<K, U>, PTable<K, V>) - Static method in class org.apache.crunch.lib.Cogroup
Co-groups the two PTable arguments with a user-specified degree of parallelism (a.k.a. the number of reducers).
cogroup(PTable<K, V1>, PTable<K, V2>, PTable<K, V3>) - Static method in class org.apache.crunch.lib.Cogroup
Co-groups the three PTable arguments.
cogroup(int, PTable<K, V1>, PTable<K, V2>, PTable<K, V3>) - Static method in class org.apache.crunch.lib.Cogroup
Co-groups the three PTable arguments with a user-specified degree of parallelism (a.k.a. the number of reducers).
cogroup(PTable<K, V1>, PTable<K, V2>, PTable<K, V3>, PTable<K, V4>) - Static method in class org.apache.crunch.lib.Cogroup
Co-groups the four PTable arguments.
cogroup(int, PTable<K, V1>, PTable<K, V2>, PTable<K, V3>, PTable<K, V4>) - Static method in class org.apache.crunch.lib.Cogroup
Co-groups the four PTable arguments with a user-specified degree of parallelism (a.k.a. the number of reducers).
cogroup(PTable<K, ?>, PTable<K, ?>...) - Static method in class org.apache.crunch.lib.Cogroup
Co-groups an arbitrary number of PTable arguments.
cogroup(int, PTable<K, ?>, PTable<K, ?>...) - Static method in class org.apache.crunch.lib.Cogroup
Co-groups an arbitrary number of PTable arguments with a user-specified degree of parallelism (a.k.a. the number of reducers). The largest table should come last in the ordering.
cogroup(PTable<K, U>) - Method in interface org.apache.crunch.PTable
Co-group operation with the given table on common keys.
CollectionDeepCopier<T> - Class in org.apache.crunch.types
Performs deep copies (based on underlying PType deep copying) of Collections.
CollectionDeepCopier(PType<T>) - Constructor for class org.apache.crunch.types.CollectionDeepCopier
 
collectionOf(T...) - Static method in class org.apache.crunch.impl.mem.MemPipeline
 
collectionOf(Iterable<T>) - Static method in class org.apache.crunch.impl.mem.MemPipeline
 
CollectionPObject<S> - Class in org.apache.crunch.materialize.pobject
A concrete implementation of PObjectImpl whose value is a Java Collection containing the elements of the underlying PCollection for this PObject.
CollectionPObject(PCollection<S>) - Constructor for class org.apache.crunch.materialize.pobject.CollectionPObject
Constructs a new instance of this PObject implementation.
collections(PType<T>) - Static method in class org.apache.crunch.types.avro.Avros
 
collections(PType<T>) - Method in class org.apache.crunch.types.avro.AvroTypeFamily
 
collections(PType<T>) - Method in interface org.apache.crunch.types.PTypeFamily
 
collections(PType<T>) - Static method in class org.apache.crunch.types.writable.Writables
 
collections(PType<T>) - Method in class org.apache.crunch.types.writable.WritableTypeFamily
 
collectValues() - Method in class org.apache.crunch.impl.dist.collect.PTableBase
 
collectValues() - Method in class org.apache.crunch.impl.mem.collect.MemTable
 
collectValues(PTable<K, V>) - Static method in class org.apache.crunch.lib.Aggregate
 
collectValues() - Method in interface org.apache.crunch.PTable
Aggregate all of the values with the same key into a single key-value pair in the returned PTable.
column() - Method in class org.apache.crunch.lib.Sort.ColumnOrder
 
COMBINE_FILE_BLOCK_SIZE - Static variable in class org.apache.crunch.impl.mr.run.RuntimeParameters
 
CombineFn<S,T> - Class in org.apache.crunch
A special DoFn implementation that converts an Iterable of values into a single value.
CombineFn() - Constructor for class org.apache.crunch.CombineFn
 
combineFn - Variable in class org.apache.crunch.impl.dist.collect.BaseDoTable
 
combineIntoRow(PCollection<KeyValue>) - Static method in class org.apache.crunch.io.hbase.HFileUtils
 
combineIntoRow(PCollection<KeyValue>, Scan) - Static method in class org.apache.crunch.io.hbase.HFileUtils
Converts a bunch of KeyValues into Result.
CombineMapsideFunction<K,V> - Class in org.apache.crunch.impl.spark.fn
 
CombineMapsideFunction(CombineFn<K, V>, SparkRuntimeContext) - Constructor for class org.apache.crunch.impl.spark.fn.CombineMapsideFunction
 
combineValues(CombineFn<K, V>, CombineFn<K, V>) - Method in class org.apache.crunch.impl.dist.collect.BaseGroupedTable
 
combineValues(CombineFn<K, V>) - Method in class org.apache.crunch.impl.dist.collect.BaseGroupedTable
 
combineValues(Aggregator<V>) - Method in class org.apache.crunch.impl.dist.collect.BaseGroupedTable
 
combineValues(Aggregator<V>, Aggregator<V>) - Method in class org.apache.crunch.impl.dist.collect.BaseGroupedTable
 
combineValues(CombineFn<K, V>) - Method in interface org.apache.crunch.PGroupedTable
Combines the values of this grouping using the given CombineFn.
combineValues(CombineFn<K, V>, CombineFn<K, V>) - Method in interface org.apache.crunch.PGroupedTable
Combines and reduces the values of this grouping using the given CombineFn instances.
combineValues(Aggregator<V>) - Method in interface org.apache.crunch.PGroupedTable
Combine the values in each group using the given Aggregator.
combineValues(Aggregator<V>, Aggregator<V>) - Method in interface org.apache.crunch.PGroupedTable
Combines and reduces the values in each group using the given Aggregator instances.
comm(PCollection<T>, PCollection<T>) - Static method in class org.apache.crunch.lib.Set
Find the elements that are common to two sets, like the Unix comm utility.
compare(ByteArray, ByteArray) - Method in class org.apache.crunch.impl.spark.SparkComparator
 
compare(byte[], int, int, byte[], int, int) - Method in class org.apache.crunch.io.hbase.HFileUtils.KeyValueComparator
 
compare(BytesWritable, BytesWritable) - Method in class org.apache.crunch.io.hbase.HFileUtils.KeyValueComparator
 
compare(Pair<K, V>, Pair<K, V>) - Method in class org.apache.crunch.lib.Aggregate.PairValueComparator
 
compare(AvroWrapper<T>, AvroWrapper<T>) - Method in class org.apache.crunch.lib.join.JoinUtils.AvroPairGroupingComparator
 
compare(byte[], int, int, byte[], int, int) - Method in class org.apache.crunch.lib.join.JoinUtils.AvroPairGroupingComparator
 
compare(TupleWritable, TupleWritable) - Method in class org.apache.crunch.lib.join.JoinUtils.TupleWritableComparator
 
compare(byte[], int, int, byte[], int, int) - Method in class org.apache.crunch.lib.join.JoinUtils.TupleWritableComparator
 
compare(AvroKey<T>, AvroKey<T>) - Method in class org.apache.crunch.lib.sort.ReverseAvroComparator
 
compare(byte[], int, int, byte[], int, int) - Method in class org.apache.crunch.lib.sort.ReverseAvroComparator
 
compare(byte[], int, int, byte[], int, int) - Method in class org.apache.crunch.lib.sort.ReverseWritableComparator
 
compare(T, T) - Method in class org.apache.crunch.lib.sort.ReverseWritableComparator
 
compare(WritableComparable, WritableComparable) - Method in class org.apache.crunch.lib.sort.TupleWritableComparator
 
compareTo(ByteArray) - Method in class org.apache.crunch.impl.spark.ByteArray
 
compareTo(Pair<K, V>) - Method in class org.apache.crunch.Pair
 
compareTo(TupleWritable) - Method in class org.apache.crunch.types.writable.TupleWritable
 
CompositeMapFn<R,S,T> - Class in org.apache.crunch.fn
 
CompositeMapFn(MapFn<R, S>, MapFn<S, T>) - Constructor for class org.apache.crunch.fn.CompositeMapFn
 
CompositePathIterable<T> - Class in org.apache.crunch.io
 
conf(String, String) - Method in class org.apache.crunch.GroupingOptions.Builder
 
conf(String, String) - Method in class org.apache.crunch.io.hbase.HBaseSourceTarget
 
conf(String, String) - Method in class org.apache.crunch.ParallelDoOptions.Builder
Specifies key-value pairs that should be added to the Configuration object associated with the Job that includes these options.
conf(String, String) - Method in interface org.apache.crunch.SourceTarget
Adds the given key-value pair to the Configuration instance(s) that are used to read and write this SourceTarget<T>.
configure(Configuration) - Method in class org.apache.crunch.DoFn
Configure this DoFn.
configure(Configuration) - Method in class org.apache.crunch.fn.CompositeMapFn
 
configure(Configuration) - Method in class org.apache.crunch.fn.ExtractKeyFn
 
configure(Configuration) - Method in class org.apache.crunch.fn.PairMapFn
 
configure(Job) - Method in class org.apache.crunch.GroupingOptions
 
configure(Target, PType<?>) - Method in class org.apache.crunch.impl.mr.plan.MSCROutputHandler
 
configure(Configuration) - Method in class org.apache.crunch.io.FormatBundle
 
configure(Configuration) - Method in class org.apache.crunch.io.hbase.HBaseData
 
configure(Configuration) - Method in class org.apache.crunch.io.impl.ReadableDataImpl
 
configure(Target, PType<?>) - Method in interface org.apache.crunch.io.OutputHandler
 
configure(Configuration) - Method in class org.apache.crunch.ParallelDoOptions
Applies the key-value pairs that were associated with this instance to the given Configuration object.
configure(Configuration) - Method in interface org.apache.crunch.ReadableData
Allows this instance to specify any additional configuration settings that may be needed by the job that it is launched in.
configure(Configuration) - Method in enum org.apache.crunch.types.avro.AvroMode
 
configure(FormatBundle) - Method in enum org.apache.crunch.types.avro.AvroMode
 
configure(Configuration) - Method in class org.apache.crunch.types.avro.AvroUtf8InputFormat
 
configure(Configuration) - Method in class org.apache.crunch.types.PGroupedTableType.PairIterableMapFn
 
configure(Configuration) - Method in class org.apache.crunch.util.DelegatingReadableData
 
configure(Configuration) - Method in class org.apache.crunch.util.UnionReadableData
 
configureForMapReduce(Job, PType<?>, Path, String) - Method in class org.apache.crunch.io.avro.AvroFileTarget
 
configureForMapReduce(Job, PType<?>, Path, String) - Method in class org.apache.crunch.io.avro.AvroPathPerKeyTarget
 
configureForMapReduce(Job, PType<?>, Path, String) - Method in class org.apache.crunch.io.avro.trevni.TrevniKeyTarget
 
configureForMapReduce(Job, PType<?>, Path, String) - Method in class org.apache.crunch.io.hbase.HBaseTarget
 
configureForMapReduce(Job, PType<?>, Path, String) - Method in class org.apache.crunch.io.hbase.HFileTarget
 
configureForMapReduce(Job, PType<?>, Path, String) - Method in class org.apache.crunch.io.impl.FileTargetImpl
 
configureForMapReduce(Job, Class, Class, Class, Path, String) - Method in class org.apache.crunch.io.impl.FileTargetImpl
Deprecated. 
configureForMapReduce(Job, Class, Class, FormatBundle, Path, String) - Method in class org.apache.crunch.io.impl.FileTargetImpl
 
configureForMapReduce(Job, PType<?>, Path, String) - Method in class org.apache.crunch.io.impl.SourcePathTargetImpl
 
configureForMapReduce(Job, PType<?>, Path, String) - Method in interface org.apache.crunch.io.MapReduceTarget
 
configureForMapReduce(Job, PType<?>, Path, String) - Method in class org.apache.crunch.io.parquet.AvroParquetFileTarget
 
configureForMapReduce(Job, PType<?>, Path, String) - Method in class org.apache.crunch.io.text.TextFileTarget
 
configureNode(DoNode, Target) - Method in class org.apache.crunch.impl.mr.plan.MSCROutputHandler
 
configureOrdering(Configuration, WritableType[], Sort.ColumnOrder[]) - Static method in class org.apache.crunch.lib.sort.TupleWritableComparator
 
configureReflectDataFactory(Configuration) - Static method in class org.apache.crunch.types.avro.Avros
Deprecated. As of 0.9.0; use AvroMode.REFLECT.configure(Configuration) instead.
configureShuffle(Job) - Method in class org.apache.crunch.impl.mr.collect.PGroupedTableImpl
 
configureShuffle(Job, GroupingOptions) - Method in class org.apache.crunch.types.PGroupedTableType
 
configureSource(Job, int) - Method in class org.apache.crunch.io.hbase.HBaseSourceTarget
 
configureSource(Job, int) - Method in class org.apache.crunch.io.hbase.HFileSource
 
configureSource(Job, int) - Method in class org.apache.crunch.io.impl.FileSourceImpl
 
configureSource(Job, int) - Method in interface org.apache.crunch.Source
Configure the given job to use this source as an input.
containers(Class<T>) - Static method in class org.apache.crunch.types.avro.Avros
 
containers(Class<T>) - Method in class org.apache.crunch.types.avro.AvroTypeFamily
 
convert(PType<T>, PTypeFamily) - Static method in class org.apache.crunch.types.PTypeUtils
 
Converter<K,V,S,T> - Interface in org.apache.crunch.types
Converts the input key/value from a MapReduce task into the input to a DoFn, or takes the output of a DoFn and writes it to the output key/values.
convertInput(Object, V) - Method in class org.apache.crunch.io.hbase.HBaseValueConverter
 
convertInput(K, V) - Method in interface org.apache.crunch.types.Converter
 
convertIterableInput(Object, Iterable<V>) - Method in class org.apache.crunch.io.hbase.HBaseValueConverter
 
convertIterableInput(K, Iterable<V>) - Method in interface org.apache.crunch.types.Converter
 
convertStringToScan(String) - Static method in class org.apache.crunch.io.hbase.HBaseSourceTarget
 
copyResourceFile(String) - Method in class org.apache.crunch.test.TemporaryPath
Copy a classpath resource to a File.
copyResourceFileName(String) - Method in class org.apache.crunch.test.TemporaryPath
Copy a classpath resource returning its absolute file name.
copyResourcePath(String) - Method in class org.apache.crunch.test.TemporaryPath
Copy a classpath resource to a Path.
count() - Method in class org.apache.crunch.impl.dist.collect.PCollectionImpl
 
count() - Method in class org.apache.crunch.impl.mem.collect.MemCollection
 
count(PCollection<S>) - Static method in class org.apache.crunch.lib.Aggregate
Returns a PTable that contains the unique elements of this collection mapped to a count of their occurrences.
count(PCollection<S>, int) - Static method in class org.apache.crunch.lib.Aggregate
Returns a PTable that contains the unique elements of this collection mapped to a count of their occurrences.
count() - Method in interface org.apache.crunch.PCollection
Returns a PTable instance that contains the counts of each unique element of this PCollection.
CounterAccumulatorParam - Class in org.apache.crunch.impl.spark
 
CounterAccumulatorParam() - Constructor for class org.apache.crunch.impl.spark.CounterAccumulatorParam
 
create(String) - Method in class org.apache.crunch.contrib.text.TokenizerFactory
Return a Scanner instance that wraps the input string and uses the delimiter, skip, and locale settings for this TokenizerFactory instance.
create(FileSystem, Path, FileReaderFactory<S>) - Static method in class org.apache.crunch.io.CompositePathIterable
 
create() - Method in class org.apache.crunch.test.TemporaryPath
 
create(Class<T>, Class...) - Static method in class org.apache.crunch.types.TupleFactory
 
CREATE_DIR - Static variable in class org.apache.crunch.impl.mr.run.RuntimeParameters
 
createCombineNode() - Method in class org.apache.crunch.impl.mr.collect.DoTable
 
createDoCollection(String, PCollectionImpl<S>, DoFn<S, T>, PType<T>, ParallelDoOptions) - Method in interface org.apache.crunch.impl.dist.collect.PCollectionFactory
 
createDoCollection(String, PCollectionImpl<S>, DoFn<S, T>, PType<T>, ParallelDoOptions) - Method in class org.apache.crunch.impl.mr.collect.MRCollectionFactory
 
createDoCollection(String, PCollectionImpl<S>, DoFn<S, T>, PType<T>, ParallelDoOptions) - Method in class org.apache.crunch.impl.spark.collect.SparkCollectFactory
 
createDoNode() - Method in interface org.apache.crunch.impl.dist.collect.MRCollection
 
createDoNode() - Method in class org.apache.crunch.impl.mr.collect.DoCollection
 
createDoNode() - Method in class org.apache.crunch.impl.mr.collect.DoTable
 
createDoNode() - Method in class org.apache.crunch.impl.mr.collect.InputCollection
 
createDoNode() - Method in class org.apache.crunch.impl.mr.collect.InputTable
 
createDoNode() - Method in class org.apache.crunch.impl.mr.collect.PGroupedTableImpl
 
createDoTable(String, PCollectionImpl<S>, DoFn<S, Pair<K, V>>, PTableType<K, V>, ParallelDoOptions) - Method in interface org.apache.crunch.impl.dist.collect.PCollectionFactory
 
createDoTable(String, PCollectionImpl<S>, CombineFn<K, V>, DoFn<S, Pair<K, V>>, PTableType<K, V>) - Method in interface org.apache.crunch.impl.dist.collect.PCollectionFactory
 
createDoTable(String, PCollectionImpl<S>, DoFn<S, Pair<K, V>>, PTableType<K, V>, ParallelDoOptions) - Method in class org.apache.crunch.impl.mr.collect.MRCollectionFactory
 
createDoTable(String, PCollectionImpl<S>, CombineFn<K, V>, DoFn<S, Pair<K, V>>, PTableType<K, V>) - Method in class org.apache.crunch.impl.mr.collect.MRCollectionFactory
 
createDoTable(String, PCollectionImpl<S>, DoFn<S, Pair<K, V>>, PTableType<K, V>, ParallelDoOptions) - Method in class org.apache.crunch.impl.spark.collect.SparkCollectFactory
 
createDoTable(String, PCollectionImpl<S>, CombineFn<K, V>, DoFn<S, Pair<K, V>>, PTableType<K, V>) - Method in class org.apache.crunch.impl.spark.collect.SparkCollectFactory
 
createFilter(Path, BloomFilterFn<String>) - Static method in class org.apache.crunch.contrib.bloomfilter.BloomFilterFactory
Takes an input path and generates BloomFilters for all text files in that path.
createFilter(PCollection<T>, BloomFilterFn<T>) - Static method in class org.apache.crunch.contrib.bloomfilter.BloomFilterFactory
 
createFnNode(String, DoFn<?, ?>, PType<?>, ParallelDoOptions) - Static method in class org.apache.crunch.impl.mr.plan.DoNode
 
createGroupedTable(PTableBase<K, V>, GroupingOptions) - Method in interface org.apache.crunch.impl.dist.collect.PCollectionFactory
 
createGroupedTable(PTableBase<K, V>, GroupingOptions) - Method in class org.apache.crunch.impl.mr.collect.MRCollectionFactory
 
createGroupedTable(PTableBase<K, V>, GroupingOptions) - Method in class org.apache.crunch.impl.spark.collect.SparkCollectFactory
 
createGroupingNode(String, PGroupedTableType<K, V>) - Static method in class org.apache.crunch.impl.mr.plan.DoNode
 
createInputCollection(Source<S>, DistributedPipeline) - Method in interface org.apache.crunch.impl.dist.collect.PCollectionFactory
 
createInputCollection(Source<S>, DistributedPipeline) - Method in class org.apache.crunch.impl.mr.collect.MRCollectionFactory
 
createInputCollection(Source<S>, DistributedPipeline) - Method in class org.apache.crunch.impl.spark.collect.SparkCollectFactory
 
createInputNode(Source<S>) - Static method in class org.apache.crunch.impl.mr.plan.DoNode
 
createInputTable(TableSource<K, V>, DistributedPipeline) - Method in interface org.apache.crunch.impl.dist.collect.PCollectionFactory
 
createInputTable(TableSource<K, V>, DistributedPipeline) - Method in class org.apache.crunch.impl.mr.collect.MRCollectionFactory
 
createInputTable(TableSource<K, V>, DistributedPipeline) - Method in class org.apache.crunch.impl.spark.collect.SparkCollectFactory
 
createIntermediateOutput(PType<T>) - Method in class org.apache.crunch.impl.dist.DistributedPipeline
 
createOrderedTupleSchema(PType<S>, Sort.ColumnOrder[]) - Static method in class org.apache.crunch.lib.sort.SortFns
Constructs an Avro schema for the given PType<S> that respects the given column orderings.
createOutputNode(String, Converter, PType<?>) - Static method in class org.apache.crunch.impl.mr.plan.DoNode
 
createPut(PTable<String, String>) - Method in class org.apache.crunch.examples.WordAggregationHBase
Creates Put instances for insertion into HBase.
createRecordReader(InputSplit, TaskAttemptContext) - Method in class org.apache.crunch.impl.mr.run.CrunchCombineFileInputFormat
 
createRecordReader(InputSplit, TaskAttemptContext) - Method in class org.apache.crunch.impl.mr.run.CrunchInputFormat
 
createRecordReader(InputSplit, TaskAttemptContext) - Method in class org.apache.crunch.io.hbase.HFileInputFormat
 
createRecordReader(InputSplit, TaskAttemptContext) - Method in class org.apache.crunch.types.avro.AvroInputFormat
 
createRecordReader(InputSplit, TaskAttemptContext) - Method in class org.apache.crunch.types.avro.AvroUtf8InputFormat
 
createTempPath() - Method in class org.apache.crunch.impl.dist.DistributedPipeline
 
createUnionCollection(List<? extends PCollectionImpl<S>>) - Method in interface org.apache.crunch.impl.dist.collect.PCollectionFactory
 
createUnionCollection(List<? extends PCollectionImpl<S>>) - Method in class org.apache.crunch.impl.mr.collect.MRCollectionFactory
 
createUnionCollection(List<? extends PCollectionImpl<S>>) - Method in class org.apache.crunch.impl.spark.collect.SparkCollectFactory
 
createUnionTable(List<PTableBase<K, V>>) - Method in interface org.apache.crunch.impl.dist.collect.PCollectionFactory
 
createUnionTable(List<PTableBase<K, V>>) - Method in class org.apache.crunch.impl.mr.collect.MRCollectionFactory
 
createUnionTable(List<PTableBase<K, V>>) - Method in class org.apache.crunch.impl.spark.collect.SparkCollectFactory
 
cross(PTable<K1, U>, PTable<K2, V>) - Static method in class org.apache.crunch.lib.Cartesian
Performs a full cross join on the specified PTables (using the same strategy as Pig's CROSS operator).
cross(PTable<K1, U>, PTable<K2, V>, int) - Static method in class org.apache.crunch.lib.Cartesian
Performs a full cross join on the specified PTables (using the same strategy as Pig's CROSS operator).
cross(PCollection<U>, PCollection<V>) - Static method in class org.apache.crunch.lib.Cartesian
Performs a full cross join on the specified PCollections (using the same strategy as Pig's CROSS operator).
cross(PCollection<U>, PCollection<V>, int) - Static method in class org.apache.crunch.lib.Cartesian
Performs a full cross join on the specified PCollections (using the same strategy as Pig's CROSS operator).
CRUNCH_FILTER_NAME - Static variable in class org.apache.crunch.contrib.bloomfilter.BloomFilterFn
 
CRUNCH_FILTER_SIZE - Static variable in class org.apache.crunch.contrib.bloomfilter.BloomFilterFn
 
CRUNCH_INPUTS - Static variable in class org.apache.crunch.io.CrunchInputs
 
CRUNCH_OUTPUTS - Static variable in class org.apache.crunch.io.CrunchOutputs
 
CRUNCH_WORKING_DIRECTORY - Static variable in class org.apache.crunch.impl.mr.plan.PlanningParameters
 
CrunchCombineFileInputFormat<K,V> - Class in org.apache.crunch.impl.mr.run
 
CrunchCombineFileInputFormat(JobContext) - Constructor for class org.apache.crunch.impl.mr.run.CrunchCombineFileInputFormat
 
CrunchCombiner - Class in org.apache.crunch.impl.mr.run
 
CrunchCombiner() - Constructor for class org.apache.crunch.impl.mr.run.CrunchCombiner
 
CrunchInputFormat<K,V> - Class in org.apache.crunch.impl.mr.run
 
CrunchInputFormat() - Constructor for class org.apache.crunch.impl.mr.run.CrunchInputFormat
 
CrunchInputs - Class in org.apache.crunch.io
Helper functions for configuring multiple InputFormat instances within a single Crunch MapReduce job.
CrunchInputs() - Constructor for class org.apache.crunch.io.CrunchInputs
 
CrunchIterable<S,T> - Class in org.apache.crunch.impl.spark.fn
 
CrunchIterable(DoFn<S, T>, Iterator<S>) - Constructor for class org.apache.crunch.impl.spark.fn.CrunchIterable
 
CrunchJobHooks - Class in org.apache.crunch.impl.mr.exec
 
CrunchJobHooks.CompletionHook - Class in org.apache.crunch.impl.mr.exec
Moves output files produced by the MapReduce job to the specified directories.
CrunchJobHooks.CompletionHook(Job, Path, Map<Integer, PathTarget>, boolean) - Constructor for class org.apache.crunch.impl.mr.exec.CrunchJobHooks.CompletionHook
 
CrunchJobHooks.PrepareHook - Class in org.apache.crunch.impl.mr.exec
Creates missing input directories before job is submitted.
CrunchJobHooks.PrepareHook(Job) - Constructor for class org.apache.crunch.impl.mr.exec.CrunchJobHooks.PrepareHook
 
CrunchMapper - Class in org.apache.crunch.impl.mr.run
 
CrunchMapper() - Constructor for class org.apache.crunch.impl.mr.run.CrunchMapper
 
CrunchOutputs<K,V> - Class in org.apache.crunch.io
An analogue of CrunchInputs for handling multiple OutputFormat instances writing to multiple files within a single MapReduce job.
CrunchOutputs(TaskInputOutputContext<?, ?, K, V>) - Constructor for class org.apache.crunch.io.CrunchOutputs
Creates and initializes multiple-outputs support; it should be instantiated in the Mapper/Reducer setup method.
CrunchReducer - Class in org.apache.crunch.impl.mr.run
 
CrunchReducer() - Constructor for class org.apache.crunch.impl.mr.run.CrunchReducer
 
CrunchRuntimeException - Exception in org.apache.crunch
A RuntimeException implementation that includes some additional options for the Crunch execution engine to track reporting status.
CrunchRuntimeException(String) - Constructor for exception org.apache.crunch.CrunchRuntimeException
 
CrunchRuntimeException(Exception) - Constructor for exception org.apache.crunch.CrunchRuntimeException
 
CrunchRuntimeException(String, Exception) - Constructor for exception org.apache.crunch.CrunchRuntimeException
 
CrunchTestSupport - Class in org.apache.crunch.test
A temporary workaround for Scala tests to use when working with Rule annotations until it gets fixed in JUnit 4.11.
CrunchTestSupport() - Constructor for class org.apache.crunch.test.CrunchTestSupport
 
CrunchTool - Class in org.apache.crunch.util
An extension of the Tool interface that creates a Pipeline instance and provides methods for working with the Pipeline from inside of the Tool's run method.
CrunchTool() - Constructor for class org.apache.crunch.util.CrunchTool
 
CrunchTool(boolean) - Constructor for class org.apache.crunch.util.CrunchTool
 
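The cogroup entries listed above for org.apache.crunch.lib.Cogroup give only one-line summaries. As a rough illustration of what co-grouping two keyed inputs on a common key means, here is a minimal plain-Java sketch. It does not use the Crunch API: CogroupSketch, Grouped, and the list-of-entries inputs are hypothetical stand-ins for PTable instances, and the sketch assumes Java 9 or later (for List.of and Map.entry).

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Plain-Java illustration only; none of these names belong to Crunch.
class CogroupSketch {

  // Hypothetical holder for the values collected from each input for one key.
  static final class Grouped<U, V> {
    final List<U> left = new ArrayList<>();
    final List<V> right = new ArrayList<>();
  }

  // Groups the values of both inputs by key. A key that occurs in only one
  // input still appears in the result, with an empty list for the other side.
  static <K, U, V> Map<K, Grouped<U, V>> cogroup(
      List<Map.Entry<K, U>> left, List<Map.Entry<K, V>> right) {
    Map<K, Grouped<U, V>> out = new LinkedHashMap<>();
    for (Map.Entry<K, U> e : left) {
      out.computeIfAbsent(e.getKey(), k -> new Grouped<>()).left.add(e.getValue());
    }
    for (Map.Entry<K, V> e : right) {
      out.computeIfAbsent(e.getKey(), k -> new Grouped<>()).right.add(e.getValue());
    }
    return out;
  }

  public static void main(String[] args) {
    List<Map.Entry<String, Integer>> ages =
        List.of(Map.entry("ann", 31), Map.entry("bob", 27));
    List<Map.Entry<String, String>> cities =
        List.of(Map.entry("ann", "Oslo"), Map.entry("ann", "Rome"), Map.entry("cy", "Lima"));
    // Prints one line per key with the values gathered from each input.
    cogroup(ages, cities).forEach(
        (key, g) -> System.out.println(key + " -> " + g.left + " | " + g.right));
  }
}
```

In this sketch every key from either input appears once in the result, so "bob" is paired with an empty right-hand list and "cy" with an empty left-hand list. The real library runs the grouping as a distributed shuffle rather than in memory.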

D

DataBaseSource<T extends org.apache.hadoop.mapreduce.lib.db.DBWritable & org.apache.hadoop.io.Writable> - Class in org.apache.crunch.contrib.io.jdbc
Source for reading from a database via a JDBC connection.
DEBUG - Static variable in class org.apache.crunch.impl.mr.run.RuntimeParameters
 
DebugLogging - Class in org.apache.crunch.test
Allows direct manipulation of the Hadoop log4j settings to aid in unit testing.
DeepCopier<T> - Interface in org.apache.crunch.types
Performs deep copies of values.
DeepCopier.NoOpDeepCopier<V> - Class in org.apache.crunch.types
 
DeepCopier.NoOpDeepCopier() - Constructor for class org.apache.crunch.types.DeepCopier.NoOpDeepCopier
 
deepCopy(Collection<T>) - Method in class org.apache.crunch.types.CollectionDeepCopier
 
deepCopy(T) - Method in interface org.apache.crunch.types.DeepCopier
Create a deep copy of a value.
deepCopy(V) - Method in class org.apache.crunch.types.DeepCopier.NoOpDeepCopier
 
deepCopy(Map<String, T>) - Method in class org.apache.crunch.types.MapDeepCopier
 
deepCopy(T) - Method in class org.apache.crunch.types.TupleDeepCopier
 
deepCopy(T) - Method in class org.apache.crunch.types.writable.WritableDeepCopier
 
DEFAULT - Static variable in class org.apache.crunch.CachingOptions
An instance of CachingOptions with the default caching settings.
DEFAULT_BYTES_PER_REDUCE_TASK - Static variable in class org.apache.crunch.util.PartitionUtils
 
DEFAULT_MAX_REDUCERS - Static variable in class org.apache.crunch.util.PartitionUtils
 
DEFAULT_PATH - Static variable in class org.apache.crunch.lib.sort.TotalOrderPartitioner
 
DefaultJoinStrategy<K,U,V> - Class in org.apache.crunch.lib.join
Default join strategy that simply sends all data through the map, shuffle, and reduce phases.
DefaultJoinStrategy() - Constructor for class org.apache.crunch.lib.join.DefaultJoinStrategy
 
DefaultJoinStrategy(int) - Constructor for class org.apache.crunch.lib.join.DefaultJoinStrategy
 
DelegatingReadableData<S,T> - Class in org.apache.crunch.util
Implements the ReadableData<T> interface by delegating to a ReadableData<S> instance and passing its contents through a DoFn<S, T>.
DelegatingReadableData(ReadableData<S>, DoFn<S, T>) - Constructor for class org.apache.crunch.util.DelegatingReadableData
 
delete() - Method in class org.apache.crunch.test.TemporaryPath
 
deletes() - Static method in class org.apache.crunch.io.hbase.HBaseTypes
 
delimiter(String) - Method in class org.apache.crunch.contrib.text.TokenizerFactory.Builder
Sets the delimiter used by the TokenizerFactory instances constructed by this instance.
derived(PType<V1>, PType<V2>, PType<V3>) - Static method in class org.apache.crunch.Tuple3.Collect
 
derived(PType<V1>, PType<V2>, PType<V3>, PType<V4>) - Static method in class org.apache.crunch.Tuple4.Collect
 
derived(Class<T>, MapFn<S, T>, MapFn<T, S>, PType<S>) - Static method in class org.apache.crunch.types.avro.Avros
 
derived(Class<T>, MapFn<S, T>, MapFn<T, S>, PType<S>) - Method in class org.apache.crunch.types.avro.AvroTypeFamily
 
derived(Class<T>, MapFn<S, T>, MapFn<T, S>, PType<S>) - Method in interface org.apache.crunch.types.PTypeFamily
 
derived(Class<T>, MapFn<S, T>, MapFn<T, S>, PType<S>) - Static method in class org.apache.crunch.types.writable.Writables
 
derived(Class<T>, MapFn<S, T>, MapFn<T, S>, PType<S>) - Method in class org.apache.crunch.types.writable.WritableTypeFamily
 
deserialized(boolean) - Method in class org.apache.crunch.CachingOptions.Builder
 
deserialized() - Method in class org.apache.crunch.CachingOptions
Whether the data should remain deserialized in the cache, which trades off CPU processing time for additional storage overhead.
difference(PCollection<T>, PCollection<T>) - Static method in class org.apache.crunch.lib.Set
Compute the set difference between two sets of elements.
DISABLE_COMBINE_FILE - Static variable in class org.apache.crunch.impl.mr.run.RuntimeParameters
 
DISABLE_DEEP_COPY - Static variable in class org.apache.crunch.impl.mr.run.RuntimeParameters
 
disableDeepCopy() - Method in class org.apache.crunch.DoFn
By default, Crunch will do a defensive deep copy of the outputs of a DoFn when there are multiple downstream consumers of that item, in order to prevent the downstream functions from making concurrent modifications to data objects.
DistCache - Class in org.apache.crunch.util
Provides functions for working with Hadoop's distributed cache.
DistCache() - Constructor for class org.apache.crunch.util.DistCache
 
Distinct - Class in org.apache.crunch.lib
Functions for computing the distinct elements of a PCollection.
distinct(PCollection<S>) - Static method in class org.apache.crunch.lib.Distinct
Construct a new PCollection that contains the unique elements of a given input PCollection.
distinct(PTable<K, V>) - Static method in class org.apache.crunch.lib.Distinct
A PTable<K, V> analogue of the distinct function.
distinct(PCollection<S>, int) - Static method in class org.apache.crunch.lib.Distinct
A distinct operation that gives the client more control over how frequently elements are flushed to disk in order to allow control over performance or memory consumption.
distinct(PTable<K, V>, int) - Static method in class org.apache.crunch.lib.Distinct
A PTable<K, V> analogue of the distinct function.
DistributedPipeline - Class in org.apache.crunch.impl.dist
 
DistributedPipeline(String, Configuration, PCollectionFactory) - Constructor for class org.apache.crunch.impl.dist.DistributedPipeline
Instantiate with a custom name and configuration.
DoCollection<S> - Class in org.apache.crunch.impl.mr.collect
 
DoCollection<S> - Class in org.apache.crunch.impl.spark.collect
 
doCreate(Object[]) - Method in class org.apache.crunch.contrib.text.AbstractCompositeExtractor
Subclasses should return a new instance of the object based on the fields parsed by the Extractor instances for this composite Extractor instance.
doExtract(Tokenizer) - Method in class org.apache.crunch.contrib.text.AbstractSimpleExtractor
Subclasses must override this method to return a new instance of the class that this Extractor instance is designed to parse.
DoFn<S,T> - Class in org.apache.crunch
Base class for all data processing functions in Crunch.
DoFn() - Constructor for class org.apache.crunch.DoFn
 
DoFnIterator<S,T> - Class in org.apache.crunch.util
An Iterator<T> that combines a delegate Iterator<S> and a DoFn<S, T>, generating data by passing the contents of the iterator through the function.
DoFnIterator(Iterator<S>, DoFn<S, T>) - Constructor for class org.apache.crunch.util.DoFnIterator
 
done() - Method in class org.apache.crunch.impl.dist.DistributedPipeline
 
done() - Method in class org.apache.crunch.impl.mem.MemPipeline
 
done() - Method in class org.apache.crunch.impl.spark.SparkPipeline
 
done() - Method in interface org.apache.crunch.Pipeline
Run any remaining jobs required to generate outputs and then clean up any intermediate data files that were created in this run or previous calls to run.
done() - Method in class org.apache.crunch.util.CrunchTool
 
DoNode - Class in org.apache.crunch.impl.mr.plan
 
doOptions - Variable in class org.apache.crunch.impl.dist.collect.PCollectionImpl
 
DoTable<K,V> - Class in org.apache.crunch.impl.mr.collect
 
DoTable<K,V> - Class in org.apache.crunch.impl.spark.collect
 
DotfileWriter - Class in org.apache.crunch.impl.mr.plan
Writes Graphviz dot files to illustrate the topology of Crunch pipelines.
DotfileWriter() - Constructor for class org.apache.crunch.impl.mr.plan.DotfileWriter
 
doubles() - Static method in class org.apache.crunch.types.avro.Avros
 
doubles() - Method in class org.apache.crunch.types.avro.AvroTypeFamily
 
doubles() - Method in interface org.apache.crunch.types.PTypeFamily
 
doubles() - Static method in class org.apache.crunch.types.writable.Writables
 
doubles() - Method in class org.apache.crunch.types.writable.WritableTypeFamily
 
drop(Integer...) - Method in class org.apache.crunch.contrib.text.TokenizerFactory.Builder
Drop the specified fields found by the input scanner, counting from zero.
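
The distinct(PCollection<S>, int) entry above mentions control over how often elements are flushed. The following is a plain-Java sketch of that general idea, not Crunch's implementation: a bounded in-memory buffer removes most duplicates cheaply, and a final exact pass removes the rest. The class DistinctSketch, its method names, and the flushEvery parameter are hypothetical names chosen for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Plain-Java illustration only; none of these names are part of the Crunch API.
class DistinctSketch {

    // Stage 1: buffer unique elements and flush the buffer downstream once it
    // holds more than flushEvery items. Duplicates that are separated by a
    // flush can still appear in the output of this stage.
    static <T> List<T> preDistinct(Iterable<T> input, int flushEvery) {
        List<T> emitted = new ArrayList<>();
        Set<T> buffer = new LinkedHashSet<>();
        for (T item : input) {
            buffer.add(item);
            if (buffer.size() > flushEvery) {
                emitted.addAll(buffer);
                buffer.clear();
            }
        }
        emitted.addAll(buffer); // final flush of whatever is still buffered
        return emitted;
    }

    // Stage 2: an exact de-duplication over the stage 1 output (assumed here to
    // stand in for a grouping step in a distributed setting).
    static <T> List<T> distinct(Iterable<T> input, int flushEvery) {
        return new ArrayList<>(new LinkedHashSet<>(preDistinct(input, flushEvery)));
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList("a", "b", "a", "c", "b", "a");
        System.out.println(preDistinct(input, 1));
        System.out.println(distinct(input, 1));
    }
}
```

A small flushEvery keeps memory use low but lets more duplicates through to the second stage; a large value does the opposite.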

E

emit(T) - Method in interface org.apache.crunch.Emitter
Write the emitted value to the next stage of the pipeline.
emit(T) - Method in class org.apache.crunch.impl.mem.emit.InMemoryEmitter
 
emit(Object) - Method in class org.apache.crunch.impl.mr.emit.IntermediateEmitter
 
emit(T) - Method in class org.apache.crunch.impl.mr.emit.MultipleOutputEmitter
 
emit(T) - Method in class org.apache.crunch.impl.mr.emit.OutputEmitter
 
Emitter<T> - Interface in org.apache.crunch
Interface for writing outputs from a DoFn.
EMPTY - Static variable in class org.apache.crunch.PipelineResult
 
enable(Level) - Static method in class org.apache.crunch.test.DebugLogging
Enables logging Hadoop output to the console using the pattern '%-4r [%t] %-5p %c %x - %m%n' at the specified Level.
enable(Level, Appender) - Static method in class org.apache.crunch.test.DebugLogging
Enables logging to the given Appender at the specified Level.
enableDebug() - Method in class org.apache.crunch.impl.dist.DistributedPipeline
 
enableDebug() - Method in class org.apache.crunch.impl.mem.MemPipeline
 
enableDebug() - Method in interface org.apache.crunch.Pipeline
Turn on debug logging for jobs that are run from this pipeline.
enableDebug() - Method in class org.apache.crunch.util.CrunchTool
 
entrySet() - Method in class org.apache.crunch.materialize.MaterializableMap
 
enums(Class<T>, PTypeFamily) - Static method in class org.apache.crunch.types.PTypes
 
equals(Object) - Method in class org.apache.crunch.impl.dist.collect.BaseInputCollection
 
equals(Object) - Method in class org.apache.crunch.impl.dist.collect.BaseInputTable
 
equals(Object) - Method in class org.apache.crunch.impl.mr.plan.DoNode
 
equals(Object) - Method in class org.apache.crunch.impl.spark.ByteArray
 
equals(Object) - Method in class org.apache.crunch.impl.spark.IntByteArray
 
equals(Object) - Method in class org.apache.crunch.io.FormatBundle
 
equals(Object) - Method in class org.apache.crunch.io.hbase.HBaseSourceTarget
 
equals(Object) - Method in class org.apache.crunch.io.hbase.HBaseTarget
 
equals(Object) - Method in class org.apache.crunch.io.impl.FileSourceImpl
 
equals(Object) - Method in class org.apache.crunch.io.impl.FileTargetImpl
 
equals(Object) - Method in class org.apache.crunch.Pair
 
equals(Object) - Method in class org.apache.crunch.Tuple3
 
equals(Object) - Method in class org.apache.crunch.Tuple4
 
equals(Object) - Method in class org.apache.crunch.TupleN
 
equals(Object) - Method in class org.apache.crunch.types.avro.AvroType
 
equals(Object) - Method in class org.apache.crunch.types.writable.TupleWritable
 
equals(Object) - Method in class org.apache.crunch.types.writable.WritableType
 
errorOnLastRecord() - Method in class org.apache.crunch.contrib.text.AbstractCompositeExtractor
 
errorOnLastRecord() - Method in class org.apache.crunch.contrib.text.AbstractSimpleExtractor
 
errorOnLastRecord() - Method in interface org.apache.crunch.contrib.text.Extractor
Returns true if the last call to extract on this instance threw an exception that was handled.
execute() - Method in class org.apache.crunch.impl.mr.exec.MRExecutor
 
execute() - Method in class org.apache.crunch.impl.spark.SparkRuntime
 
EXT - Static variable in class org.apache.crunch.io.avro.trevni.TrevniRecordWriter
Trevni file extension.
extract(String) - Method in class org.apache.crunch.contrib.text.AbstractCompositeExtractor
 
extract(String) - Method in class org.apache.crunch.contrib.text.AbstractSimpleExtractor
 
extract(String) - Method in interface org.apache.crunch.contrib.text.Extractor
Extract a value with the type of this instance.
extractKey(String) - Static method in class org.apache.crunch.types.Protos
 
ExtractKeyFn<K,V> - Class in org.apache.crunch.fn
Wrapper function that uses a key-extracting MapFn to turn each value into a key-value pair, for converting a PCollection<V> into a PTable<K, V>.
ExtractKeyFn(MapFn<V, K>) - Constructor for class org.apache.crunch.fn.ExtractKeyFn
 
Extractor<T> - Interface in org.apache.crunch.contrib.text
An interface for extracting a specific data type from a text string that is being processed by a Scanner object.
Extractors - Class in org.apache.crunch.contrib.text
Factory methods for constructing common Extractor types.
Extractors() - Constructor for class org.apache.crunch.contrib.text.Extractors
 
ExtractorStats - Class in org.apache.crunch.contrib.text
Records the number and kind of errors that an Extractor encountered when parsing input data.
ExtractorStats(int) - Constructor for class org.apache.crunch.contrib.text.ExtractorStats
 
ExtractorStats(int, List<Integer>) - Constructor for class org.apache.crunch.contrib.text.ExtractorStats
 
extractPartitionNumber(String) - Static method in class org.apache.crunch.io.impl.FileTargetImpl
Extract the partition number from a raw reducer output filename.
extractText(PTable<ImmutableBytesWritable, Result>) - Method in class org.apache.crunch.examples.WordAggregationHBase
Extract information from HBase.
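
The errorOnLastRecord() and extract(String) entries above describe extraction with handled errors. As a rough plain-Java sketch of that contract (not the Crunch Extractor interface itself), the hypothetical class IntExtractorSketch below returns a default value when parsing fails and remembers whether the most recent call hit an error.

```java
// Plain-Java illustration only; IntExtractorSketch is a hypothetical class,
// not part of the Crunch API.
class IntExtractorSketch {
    private final int defaultValue;
    private boolean errorOnLast = false;
    private int errorCount = 0;

    IntExtractorSketch(int defaultValue) {
        this.defaultValue = defaultValue;
    }

    // Parse an int from the input; on a malformed number, record the error
    // and fall back to the default value instead of propagating the exception.
    int extract(String input) {
        errorOnLast = false;
        try {
            return Integer.parseInt(input.trim());
        } catch (NumberFormatException e) {
            errorOnLast = true;
            errorCount++;
            return defaultValue;
        }
    }

    // True only if the most recent extract call failed and was handled.
    boolean errorOnLastRecord() {
        return errorOnLast;
    }

    // Total number of inputs that failed to parse so far.
    int getErrorCount() {
        return errorCount;
    }

    public static void main(String[] args) {
        IntExtractorSketch extractor = new IntExtractorSketch(-1);
        System.out.println(extractor.extract("42") + " " + extractor.errorOnLastRecord());
        System.out.println(extractor.extract("abc") + " " + extractor.errorOnLastRecord());
    }
}
```

The flag is reset at the start of every call, so callers should check it immediately after each extract.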

F

factory - Variable in class org.apache.crunch.impl.dist.DistributedPipeline
 
FileNamingScheme - Interface in org.apache.crunch.io
Encapsulates rules for naming output files.
FileReaderFactory<T> - Interface in org.apache.crunch.io
 
FileSourceImpl<T> - Class in org.apache.crunch.io.impl
 
FileSourceImpl(Path, PType<T>, Class<? extends InputFormat>) - Constructor for class org.apache.crunch.io.impl.FileSourceImpl
 
FileSourceImpl(Path, PType<T>, FormatBundle<? extends InputFormat>) - Constructor for class org.apache.crunch.io.impl.FileSourceImpl
 
FileSourceImpl(List<Path>, PType<T>, Class<? extends InputFormat>) - Constructor for class org.apache.crunch.io.impl.FileSourceImpl
 
FileSourceImpl(List<Path>, PType<T>, FormatBundle<? extends InputFormat>) - Constructor for class org.apache.crunch.io.impl.FileSourceImpl
 
FileTableSourceImpl<K,V> - Class in org.apache.crunch.io.impl
 
FileTableSourceImpl(Path, PTableType<K, V>, Class<? extends FileInputFormat>) - Constructor for class org.apache.crunch.io.impl.FileTableSourceImpl
 
FileTableSourceImpl(List<Path>, PTableType<K, V>, Class<? extends FileInputFormat>) - Constructor for class org.apache.crunch.io.impl.FileTableSourceImpl
 
FileTableSourceImpl(Path, PTableType<K, V>, FormatBundle) - Constructor for class org.apache.crunch.io.impl.FileTableSourceImpl
 
FileTableSourceImpl(List<Path>, PTableType<K, V>, FormatBundle) - Constructor for class org.apache.crunch.io.impl.FileTableSourceImpl
 
FileTargetImpl - Class in org.apache.crunch.io.impl
 
FileTargetImpl(Path, Class<? extends FileOutputFormat>, FileNamingScheme) - Constructor for class org.apache.crunch.io.impl.FileTargetImpl
 
FileTargetImpl(Path, Class<? extends FileOutputFormat>, FileNamingScheme, Map<String, String>) - Constructor for class org.apache.crunch.io.impl.FileTargetImpl
 
filter(FilterFn<S>) - Method in class org.apache.crunch.impl.dist.collect.PCollectionImpl
 
filter(String, FilterFn<S>) - Method in class org.apache.crunch.impl.dist.collect.PCollectionImpl
 
filter(FilterFn<Pair<K, V>>) - Method in class org.apache.crunch.impl.dist.collect.PTableBase
 
filter(String, FilterFn<Pair<K, V>>) - Method in class org.apache.crunch.impl.dist.collect.PTableBase
 
filter(FilterFn<S>) - Method in class org.apache.crunch.impl.mem.collect.MemCollection
 
filter(String, FilterFn<S>) - Method in class org.apache.crunch.impl.mem.collect.MemCollection
 
filter(FilterFn<Pair<K, V>>) - Method in class org.apache.crunch.impl.mem.collect.MemTable
 
filter(String, FilterFn<Pair<K, V>>) - Method in class org.apache.crunch.impl.mem.collect.MemTable
 
filter(FilterFn<S>) - Method in interface org.apache.crunch.PCollection
Apply the given filter function to this instance and return the resulting PCollection.
filter(String, FilterFn<S>) - Method in interface org.apache.crunch.PCollection
Apply the given filter function to this instance and return the resulting PCollection.
filter(FilterFn<Pair<K, V>>) - Method in interface org.apache.crunch.PTable
Apply the given filter function to this instance and return the resulting PTable.
filter(String, FilterFn<Pair<K, V>>) - Method in interface org.apache.crunch.PTable
Apply the given filter function to this instance and return the resulting PTable.
filterClass(Class<? extends UnboundRecordFilter>) - Method in class org.apache.crunch.io.parquet.AvroParquetFileSource.Builder
 
FilterFn<T> - Class in org.apache.crunch
A DoFn for the common case of filtering the members of a PCollection based on a boolean condition.
FilterFn() - Constructor for class org.apache.crunch.FilterFn
 
FilterFns - Class in org.apache.crunch.fn
A collection of pre-defined FilterFn implementations.
findContainingJar(Class<?>) - Static method in class org.apache.crunch.util.DistCache
Finds the path to a jar that contains the class provided, if any.
findCounter(Enum<?>) - Method in class org.apache.crunch.PipelineResult.StageResult
Deprecated. The Counter class changed incompatibly between Hadoop 1 and 2 (from a class to an interface) so user programs should avoid this method and use PipelineResult.StageResult.getCounterValue(Enum) and/or PipelineResult.StageResult.getCounterDisplayName(Enum).
first() - Method in class org.apache.crunch.Pair
 
first() - Method in class org.apache.crunch.Tuple3
 
first() - Method in class org.apache.crunch.Tuple4
 
FIRST_N(int) - Static method in class org.apache.crunch.fn.Aggregators
Return the first n values (or fewer if there are fewer values than n).
FirstElementPObject<T> - Class in org.apache.crunch.materialize.pobject
A concrete implementation of PObjectImpl that uses the first element in the backing PCollection as the PObject value.
FirstElementPObject(PCollection<T>) - Constructor for class org.apache.crunch.materialize.pobject.FirstElementPObject
Constructs a new instance of this PObject implementation.
FlatMapDoFn<S,T> - Class in org.apache.crunch.impl.spark.fn
 
FlatMapDoFn(DoFn<S, T>, SparkRuntimeContext) - Constructor for class org.apache.crunch.impl.spark.fn.FlatMapDoFn
 
FlatMapPairDoFn<K,V,T> - Class in org.apache.crunch.impl.spark.fn
 
FlatMapPairDoFn(DoFn<Pair<K, V>, T>, SparkRuntimeContext) - Constructor for class org.apache.crunch.impl.spark.fn.FlatMapPairDoFn
 
floats() - Static method in class org.apache.crunch.types.avro.Avros
 
floats() - Method in class org.apache.crunch.types.avro.AvroTypeFamily
 
floats() - Method in interface org.apache.crunch.types.PTypeFamily
 
floats() - Static method in class org.apache.crunch.types.writable.Writables
 
floats() - Method in class org.apache.crunch.types.writable.WritableTypeFamily
 
flush() - Method in interface org.apache.crunch.Emitter
Flushes any values cached by this emitter.
flush() - Method in class org.apache.crunch.impl.mem.emit.InMemoryEmitter
 
flush() - Method in class org.apache.crunch.impl.mr.emit.IntermediateEmitter
 
flush() - Method in class org.apache.crunch.impl.mr.emit.MultipleOutputEmitter
 
flush() - Method in class org.apache.crunch.impl.mr.emit.OutputEmitter
 
flush() - Method in class org.apache.crunch.io.avro.trevni.TrevniRecordWriter
A Trevni flush will close the current file and prepare a new writer.
fn - Variable in class org.apache.crunch.impl.dist.collect.BaseDoCollection
 
fn - Variable in class org.apache.crunch.impl.dist.collect.BaseDoTable
 
forInput(Class<T>) - Static method in class org.apache.crunch.io.FormatBundle
 
FormatBundle<K> - Class in org.apache.crunch.io
A combination of an InputFormat or OutputFormat and any extra configuration information that format class needs to run.
FormatBundle() - Constructor for class org.apache.crunch.io.FormatBundle
 
formattedFile(String, Class<? extends FileInputFormat<K, V>>, Class<K>, Class<V>) - Static method in class org.apache.crunch.io.From
Creates a TableSource<K, V> for reading data from files that have custom FileInputFormat<K, V> implementations not covered by the provided TableSource and Source factory methods.
formattedFile(Path, Class<? extends FileInputFormat<K, V>>, Class<K>, Class<V>) - Static method in class org.apache.crunch.io.From
Creates a TableSource<K, V> for reading data from files that have custom FileInputFormat<K, V> implementations not covered by the provided TableSource and Source factory methods.
formattedFile(String, Class<? extends FileInputFormat<?, ?>>, PType<K>, PType<V>) - Static method in class org.apache.crunch.io.From
Creates a TableSource<K, V> for reading data from files that have custom FileInputFormat implementations not covered by the provided TableSource and Source factory methods.
formattedFile(Path, Class<? extends FileInputFormat<?, ?>>, PType<K>, PType<V>) - Static method in class org.apache.crunch.io.From
Creates a TableSource<K, V> for reading data from files that have custom FileInputFormat implementations not covered by the provided TableSource and Source factory methods.
formattedFile(String, Class<? extends FileOutputFormat<K, V>>) - Static method in class org.apache.crunch.io.To
Creates a Target at the given path name that writes data to a custom FileOutputFormat.
formattedFile(Path, Class<? extends FileOutputFormat<K, V>>) - Static method in class org.apache.crunch.io.To
Creates a Target at the given Path that writes data to a custom FileOutputFormat.
forOutput(Class<T>) - Static method in class org.apache.crunch.io.FormatBundle
 
fourth() - Method in class org.apache.crunch.Tuple4
 
From - Class in org.apache.crunch.io
Static factory methods for creating common Source types.
From() - Constructor for class org.apache.crunch.io.From
 
from - Static variable in class org.apache.crunch.util.CrunchTool
 
fromBytes(byte[]) - Method in class org.apache.crunch.impl.spark.serde.AvroSerDe
 
fromBytes(byte[]) - Method in interface org.apache.crunch.impl.spark.serde.SerDe
 
fromBytes(byte[]) - Method in class org.apache.crunch.impl.spark.serde.WritableSerDe
 
fromBytesFunction() - Method in class org.apache.crunch.impl.spark.serde.AvroSerDe
 
fromBytesFunction() - Method in interface org.apache.crunch.impl.spark.serde.SerDe
 
fromBytesFunction() - Method in class org.apache.crunch.impl.spark.serde.WritableSerDe
 
fromConfiguration(Configuration) - Static method in enum org.apache.crunch.types.avro.AvroMode
 
FromHBase - Class in org.apache.crunch.io.hbase
Static factory methods for creating HBase Source types.
FromHBase() - Constructor for class org.apache.crunch.io.hbase.FromHBase
 
fromSerialized(String, Configuration) - Static method in class org.apache.crunch.io.FormatBundle
 
fromType(AvroType<?>) - Static method in enum org.apache.crunch.types.avro.AvroMode
 
fullJoin(PTable<K, U>, PTable<K, V>) - Static method in class org.apache.crunch.lib.Join
Performs a full outer join on the specified PTables.
FullOuterJoinFn<K,U,V> - Class in org.apache.crunch.lib.join
Used to perform the last step of a full outer join.
FullOuterJoinFn(PType<K>, PType<U>) - Constructor for class org.apache.crunch.lib.join.FullOuterJoinFn
 
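The fullJoin entry above refers to a full outer join. The sketch below shows the general semantics in plain Java, under the simplifying assumption of at most one value per key on each side (a real PTable may hold several). FullJoinSketch is a hypothetical class, not part of Crunch; a missing side is represented by null.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Plain-Java illustration only; FullJoinSketch is not part of the Crunch API.
class FullJoinSketch {

    // Every key from either input appears in the result. The entry's "key"
    // slot holds the left value and its "value" slot holds the right value;
    // whichever side lacks the join key contributes null.
    static <K extends Comparable<K>, U, V> Map<K, SimpleEntry<U, V>> fullJoin(
            Map<K, U> left, Map<K, V> right) {
        Set<K> keys = new TreeSet<>(left.keySet());
        keys.addAll(right.keySet());
        Map<K, SimpleEntry<U, V>> joined = new TreeMap<>();
        for (K key : keys) {
            joined.put(key, new SimpleEntry<>(left.get(key), right.get(key)));
        }
        return joined;
    }

    public static void main(String[] args) {
        Map<String, Integer> left = new HashMap<>();
        left.put("a", 1);
        left.put("b", 2);
        Map<String, String> right = new HashMap<>();
        right.put("b", "x");
        right.put("c", "y");
        System.out.println(fullJoin(left, right));
    }
}
```

An inner join would keep only keys present on both sides; a left or right outer join would keep all keys from one side only.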

G

generateKeys(S) - Method in class org.apache.crunch.contrib.bloomfilter.BloomFilterFn
 
generics(Schema) - Static method in class org.apache.crunch.types.avro.Avros
 
generics(Schema) - Method in class org.apache.crunch.types.avro.AvroTypeFamily
 
get() - Method in class org.apache.crunch.impl.mr.exec.CappedExponentialCounter
 
get() - Method in class org.apache.crunch.impl.mr.exec.MRExecutor
 
get(long, TimeUnit) - Method in class org.apache.crunch.impl.mr.exec.MRExecutor
 
get() - Method in class org.apache.crunch.impl.spark.SparkRuntime
 
get(long, TimeUnit) - Method in class org.apache.crunch.impl.spark.SparkRuntime
 
get(int) - Method in class org.apache.crunch.Pair
 
get(int) - Method in interface org.apache.crunch.Tuple
Returns the Object at the given index.
get(int) - Method in class org.apache.crunch.Tuple3
 
get(int) - Method in class org.apache.crunch.Tuple4
 
get(int) - Method in class org.apache.crunch.TupleN
 
get(int) - Method in class org.apache.crunch.types.writable.TupleWritable
Get the ith Writable from the tuple.
getBundle() - Method in class org.apache.crunch.io.impl.FileSourceImpl
 
getByFn() - Method in class org.apache.crunch.lib.sort.SortFns.KeyExtraction
 
getChainingCollection() - Method in class org.apache.crunch.impl.dist.collect.BaseGroupedTable
 
getChainingCollection() - Method in class org.apache.crunch.impl.dist.collect.PCollectionImpl
Retrieve the PCollectionImpl to be used for chaining within PCollectionImpls further down the pipeline.
getChildren() - Method in class org.apache.crunch.impl.mr.plan.DoNode
 
getCollection() - Method in class org.apache.crunch.impl.mem.collect.MemCollection
 
getCombineFn() - Method in class org.apache.crunch.impl.spark.SparkRuntime
 
getConf() - Method in class org.apache.crunch.io.FormatBundle
 
getConf() - Method in class org.apache.crunch.lib.sort.TotalOrderPartitioner
 
getConf() - Method in class org.apache.crunch.lib.sort.TupleWritableComparator
 
getConf() - Method in class org.apache.crunch.util.CrunchTool
 
getConfiguration() - Method in class org.apache.crunch.DoFn
 
getConfiguration() - Method in class org.apache.crunch.impl.dist.DistributedPipeline
 
getConfiguration() - Method in class org.apache.crunch.impl.mem.MemPipeline
 
getConfiguration() - Method in class org.apache.crunch.impl.spark.SparkRuntime
 
getConfiguration() - Method in class org.apache.crunch.impl.spark.SparkRuntimeContext
 
getConfiguration() - Method in interface org.apache.crunch.Pipeline
Returns the Configuration instance associated with this pipeline.
getConfigurationKey() - Method in enum org.apache.crunch.impl.mr.run.NodeContext
 
getContext() - Method in class org.apache.crunch.DoFn
 
getConverter() - Method in class org.apache.crunch.io.hbase.HBaseSourceTarget
 
getConverter(PType<?>) - Method in class org.apache.crunch.io.hbase.HBaseTarget
 
getConverter() - Method in class org.apache.crunch.io.hbase.HFileSource
 
getConverter(PType<?>) - Method in class org.apache.crunch.io.hbase.HFileTarget
 
getConverter() - Method in class org.apache.crunch.io.impl.FileSourceImpl
 
getConverter(PType<?>) - Method in class org.apache.crunch.io.impl.FileTargetImpl
 
getConverter() - Method in class org.apache.crunch.io.parquet.AvroParquetFileSource
 
getConverter(PType<?>) - Method in class org.apache.crunch.io.parquet.AvroParquetFileTarget
 
getConverter() - Method in interface org.apache.crunch.Source
Returns the Converter used for mapping the inputs from this instance into PCollection or PTable values.
getConverter(PType<?>) - Method in interface org.apache.crunch.Target
Returns the Converter to use for mapping from the output PCollection into the output values expected by this instance.
getConverter() - Method in class org.apache.crunch.types.avro.AvroType
 
getConverter() - Method in class org.apache.crunch.types.PGroupedTableType
 
getConverter() - Method in interface org.apache.crunch.types.PType
 
getConverter() - Method in class org.apache.crunch.types.writable.WritableType
 
getCounter(Enum<?>) - Method in class org.apache.crunch.DoFn
Deprecated. The Counter class changed incompatibly between Hadoop 1 and 2 (from a class to an interface) so user programs should avoid this method and use one of the increment methods instead, such as DoFn.increment(Enum).
getCounter(String, String) - Method in class org.apache.crunch.DoFn
Deprecated. The Counter class changed incompatibly between Hadoop 1 and 2 (from a class to an interface) so user programs should avoid this method and use one of the increment methods instead, such as DoFn.increment(Enum).
getCounter(Enum<?>) - Static method in class org.apache.crunch.test.TestCounters
 
getCounter(String, String) - Static method in class org.apache.crunch.test.TestCounters
 
getCounterDisplayName(String, String) - Method in class org.apache.crunch.PipelineResult.StageResult
 
getCounterDisplayName(Enum<?>) - Method in class org.apache.crunch.PipelineResult.StageResult
 
getCounterNames() - Method in class org.apache.crunch.PipelineResult.StageResult
 
getCounters() - Static method in class org.apache.crunch.impl.mem.MemPipeline
 
getCounters() - Method in class org.apache.crunch.PipelineResult.StageResult
Deprecated. The Counter class changed incompatibly between Hadoop 1 and 2 (from a class to an interface) so user programs should avoid this method and use PipelineResult.StageResult.getCounterNames().
getCounterValue(String, String) - Method in class org.apache.crunch.PipelineResult.StageResult
 
getCounterValue(Enum<?>) - Method in class org.apache.crunch.PipelineResult.StageResult
 
getData() - Method in enum org.apache.crunch.types.avro.AvroMode
 
getData() - Method in interface org.apache.crunch.types.avro.ReaderWriterFactory
 
getData() - Method in class org.apache.crunch.types.avro.ReflectDataFactory
 
getDataFileWriter(Path, Configuration) - Static method in class org.apache.crunch.types.avro.AvroOutputFormat
 
getDefaultConfiguration() - Method in class org.apache.crunch.test.TemporaryPath
 
getDefaultFileSource(Path) - Method in class org.apache.crunch.types.avro.AvroType
 
getDefaultFileSource(Path) - Method in class org.apache.crunch.types.PGroupedTableType
 
getDefaultFileSource(Path) - Method in interface org.apache.crunch.types.PType
Returns a SourceTarget that is able to read/write data using the serialization format specified by this PType.
getDefaultFileSource(Path) - Method in class org.apache.crunch.types.writable.WritableType
 
getDefaultInstance() - Static method in class org.apache.crunch.contrib.text.TokenizerFactory
Returns a default TokenizerFactory that uses whitespace as a delimiter and does not skip any input fields.
getDefaultInstance(Class<M>) - Static method in class org.apache.crunch.types.Protos
Utility function for creating a default PB Message from a Class object that works with both protoc 2.3.0 and 2.4.x.
getDefaultValue() - Method in class org.apache.crunch.contrib.text.AbstractSimpleExtractor
 
getDefaultValue() - Method in interface org.apache.crunch.contrib.text.Extractor
Returns the default value for this Extractor in case of an error.
getDependentJobs() - Method in interface org.apache.crunch.impl.mr.MRJob
 
getDepth() - Method in class org.apache.crunch.impl.dist.collect.PCollectionImpl
 
getDestFile(Configuration, Path, Path, boolean) - Method in class org.apache.crunch.io.avro.trevni.TrevniKeyTarget
 
getDestFile(Configuration, Path, Path, boolean) - Method in class org.apache.crunch.io.impl.FileTargetImpl
 
getDetachedValue(PTableType<K, V>, Pair<K, V>) - Static method in class org.apache.crunch.lib.PTables
Create a detached value for a table Pair.
getDetachedValue(T) - Method in class org.apache.crunch.types.avro.AvroType
 
getDetachedValue(T) - Method in interface org.apache.crunch.types.PType
Returns a copy of a value (or the value itself) that can safely be retained.
getDetachedValue(T) - Method in class org.apache.crunch.types.writable.WritableType
 
getErrorCount() - Method in class org.apache.crunch.contrib.text.ExtractorStats
The overall number of records that had some kind of parsing error.
getFactory() - Method in class org.apache.crunch.impl.dist.DistributedPipeline
 
getFamily() - Method in class org.apache.crunch.types.avro.AvroType
 
getFamily() - Method in class org.apache.crunch.types.PGroupedTableType
 
getFamily() - Method in interface org.apache.crunch.types.PType
Returns the PTypeFamily that this PType belongs to.
getFamily() - Method in class org.apache.crunch.types.writable.WritableType
 
getFieldErrors() - Method in class org.apache.crunch.contrib.text.ExtractorStats
Returns the number of errors that occurred when parsing the individual fields of a composite record type, like a Pair or TupleN.
getFile(String) - Method in class org.apache.crunch.test.TemporaryPath
Get a File below the temporary directory.
getFileName(String) - Method in class org.apache.crunch.test.TemporaryPath
Get an absolute file name below the temporary directory.
getFileNamingScheme() - Method in class org.apache.crunch.io.impl.FileTargetImpl
 
getFileNamingScheme() - Method in class org.apache.crunch.io.impl.SourcePathTargetImpl
 
getFileNamingScheme() - Method in interface org.apache.crunch.io.PathTarget
Get the naming scheme to be used for outputs being written to an output path.
getFileReaderFactory(AvroType<T>) - Method in class org.apache.crunch.io.avro.AvroFileSource
 
getFileReaderFactory() - Method in class org.apache.crunch.io.avro.AvroReadableData
 
getFileReaderFactory() - Method in class org.apache.crunch.io.avro.trevni.TrevniReadableData
 
getFileReaderFactory() - Method in class org.apache.crunch.io.hbase.HFileReadableData
 
getFileReaderFactory() - Method in class org.apache.crunch.io.impl.ReadableDataImpl
 
getFileReaderFactory(AvroType<T>) - Method in class org.apache.crunch.io.parquet.AvroParquetFileSource
 
getFileReaderFactory() - Method in class org.apache.crunch.io.parquet.AvroParquetReadableData
 
getFileReaderFactory() - Method in class org.apache.crunch.io.seq.SeqFileReadableData
 
getFileReaderFactory() - Method in class org.apache.crunch.io.text.TextReadableData
 
getFirst() - Method in class org.apache.crunch.fn.CompositeMapFn
 
getFormatClass() - Method in class org.apache.crunch.io.FormatBundle
 
getFormatNodeMap(JobContext) - Static method in class org.apache.crunch.io.CrunchInputs
 
getGroupedDetachedValue(PGroupedTableType<K, V>, Pair<K, Iterable<V>>) - Static method in class org.apache.crunch.lib.PTables
Create a detached value for a PGroupedTable value.
getGroupedTableType() - Method in class org.apache.crunch.impl.dist.collect.BaseGroupedTable
 
getGroupedTableType() - Method in interface org.apache.crunch.PGroupedTable
Return the PGroupedTableType containing serialization information for this PGroupedTable.
getGroupedTableType() - Method in interface org.apache.crunch.types.PTableType
Returns the grouped table version of this type.
getGroupingComparator(PTypeFamily) - Static method in class org.apache.crunch.lib.join.JoinUtils
 
getGroupingComparatorClass() - Method in class org.apache.crunch.GroupingOptions
 
getGroupingConverter() - Method in class org.apache.crunch.types.PGroupedTableType
 
getGroupingNode() - Method in class org.apache.crunch.impl.mr.collect.PGroupedTableImpl
 
getInputMapFn() - Method in class org.apache.crunch.types.avro.AvroType
 
getInputMapFn() - Method in interface org.apache.crunch.types.PType
 
getInputMapFn() - Method in class org.apache.crunch.types.writable.WritableType
 
getInstance() - Static method in class org.apache.crunch.fn.IdentityFn
 
getInstance() - Static method in class org.apache.crunch.impl.mem.MemPipeline
 
getInstance() - Static method in class org.apache.crunch.io.SequentialFileNamingScheme
 
getInstance() - Static method in class org.apache.crunch.types.avro.AvroTypeFamily
 
getInstance() - Static method in class org.apache.crunch.types.writable.WritableTypeFamily
 
getJavaRDDLike(SparkRuntime) - Method in class org.apache.crunch.impl.spark.collect.DoCollection
 
getJavaRDDLike(SparkRuntime) - Method in class org.apache.crunch.impl.spark.collect.DoTable
 
getJavaRDDLike(SparkRuntime) - Method in class org.apache.crunch.impl.spark.collect.InputCollection
 
getJavaRDDLike(SparkRuntime) - Method in class org.apache.crunch.impl.spark.collect.InputTable
 
getJavaRDDLike(SparkRuntime) - Method in class org.apache.crunch.impl.spark.collect.PGroupedTableImpl
 
getJavaRDDLike(SparkRuntime) - Method in class org.apache.crunch.impl.spark.collect.UnionCollection
 
getJavaRDDLike(SparkRuntime) - Method in class org.apache.crunch.impl.spark.collect.UnionTable
 
getJavaRDDLike(SparkRuntime) - Method in interface org.apache.crunch.impl.spark.SparkCollection
 
getJob() - Method in interface org.apache.crunch.impl.mr.MRJob
 
getJobID() - Method in interface org.apache.crunch.impl.mr.MRJob
 
getJobs() - Method in class org.apache.crunch.impl.mr.exec.MRExecutor
 
getJobs() - Method in interface org.apache.crunch.impl.mr.MRPipelineExecution
 
getJobState() - Method in interface org.apache.crunch.impl.mr.MRJob
 
getJoinType() - Method in class org.apache.crunch.lib.join.FullOuterJoinFn
 
getJoinType() - Method in class org.apache.crunch.lib.join.InnerJoinFn
 
getJoinType() - Method in class org.apache.crunch.lib.join.JoinFn
 
getJoinType() - Method in class org.apache.crunch.lib.join.LeftOuterJoinFn
 
getJoinType() - Method in class org.apache.crunch.lib.join.RightOuterJoinFn
 
getKeyClass() - Method in class org.apache.crunch.io.hbase.HBaseValueConverter
 
getKeyClass() - Method in interface org.apache.crunch.types.Converter
 
getKeyType() - Method in class org.apache.crunch.impl.dist.collect.PTableBase
 
getKeyType() - Method in class org.apache.crunch.impl.mem.collect.MemTable
 
getKeyType() - Method in class org.apache.crunch.lib.sort.SortFns.KeyExtraction
 
getKeyType() - Method in interface org.apache.crunch.PTable
Returns the PType of the key.
getKeyType() - Method in interface org.apache.crunch.types.PTableType
Returns the key type for the table.
getLastModifiedAt(Configuration) - Method in class org.apache.crunch.contrib.io.jdbc.DataBaseSource
 
getLastModifiedAt() - Method in class org.apache.crunch.impl.dist.collect.BaseDoCollection
 
getLastModifiedAt() - Method in class org.apache.crunch.impl.dist.collect.BaseDoTable
 
getLastModifiedAt() - Method in class org.apache.crunch.impl.dist.collect.BaseGroupedTable
 
getLastModifiedAt() - Method in class org.apache.crunch.impl.dist.collect.BaseInputCollection
 
getLastModifiedAt() - Method in class org.apache.crunch.impl.dist.collect.BaseInputTable
 
getLastModifiedAt() - Method in class org.apache.crunch.impl.dist.collect.BaseUnionCollection
 
getLastModifiedAt() - Method in class org.apache.crunch.impl.dist.collect.BaseUnionTable
 
getLastModifiedAt() - Method in class org.apache.crunch.impl.dist.collect.PCollectionImpl
 
getLastModifiedAt(Configuration) - Method in class org.apache.crunch.io.hbase.HBaseSourceTarget
 
getLastModifiedAt(Configuration) - Method in class org.apache.crunch.io.impl.FileSourceImpl
 
getLastModifiedAt(FileSystem, Path) - Static method in class org.apache.crunch.io.SourceTargetHelper
 
getLastModifiedAt(Configuration) - Method in interface org.apache.crunch.Source
Returns the time (in milliseconds) that this Source was most recently modified (e.g., because an input file was edited or new files were added to a directory).
getMapOutputName(Configuration, Path) - Method in interface org.apache.crunch.io.FileNamingScheme
Get the output file name for a map task.
getMapOutputName(Configuration, Path) - Method in class org.apache.crunch.io.SequentialFileNamingScheme
 
getMaterializedAt() - Method in class org.apache.crunch.impl.dist.collect.PCollectionImpl
 
getMaterializeSourceTarget(PCollection<T>) - Method in class org.apache.crunch.impl.dist.DistributedPipeline
Retrieve a ReadableSourceTarget that provides access to the contents of a PCollection.
getMultiPaths() - Method in class org.apache.crunch.impl.mr.plan.MSCROutputHandler
 
getName() - Method in class org.apache.crunch.impl.dist.collect.PCollectionImpl
 
getName() - Method in class org.apache.crunch.impl.dist.DistributedPipeline
 
getName() - Method in class org.apache.crunch.impl.mem.collect.MemCollection
 
getName() - Method in class org.apache.crunch.impl.mem.MemPipeline
 
getName() - Method in class org.apache.crunch.impl.mr.plan.DoNode
 
getName() - Method in class org.apache.crunch.io.FormatBundle
 
getName() - Method in interface org.apache.crunch.PCollection
Returns a shorthand name for this PCollection.
getName() - Method in interface org.apache.crunch.Pipeline
Returns the name of this pipeline.
getNextAnonymousStageId() - Method in class org.apache.crunch.impl.dist.DistributedPipeline
 
getNodeContext() - Method in class org.apache.crunch.impl.mr.run.CrunchCombiner
 
getNodeContext() - Method in class org.apache.crunch.impl.mr.run.CrunchReducer
 
getNumReducers() - Method in class org.apache.crunch.GroupingOptions
 
getNumShards(K) - Method in interface org.apache.crunch.lib.join.ShardedJoinStrategy.ShardingStrategy
Retrieve the number of shards over which the given key should be split.
getOnlyParent() - Method in class org.apache.crunch.impl.dist.collect.PCollectionImpl
 
getOutput() - Method in class org.apache.crunch.impl.mem.emit.InMemoryEmitter
 
getOutputMapFn() - Method in class org.apache.crunch.types.avro.AvroType
 
getOutputMapFn() - Method in interface org.apache.crunch.types.PType
 
getOutputMapFn() - Method in class org.apache.crunch.types.writable.WritableType
 
getParallelDoOptions() - Method in class org.apache.crunch.impl.dist.collect.PCollectionImpl
 
getParents() - Method in class org.apache.crunch.impl.dist.collect.BaseDoCollection
 
getParents() - Method in class org.apache.crunch.impl.dist.collect.BaseDoTable
 
getParents() - Method in class org.apache.crunch.impl.dist.collect.BaseGroupedTable
 
getParents() - Method in class org.apache.crunch.impl.dist.collect.BaseInputCollection
 
getParents() - Method in class org.apache.crunch.impl.dist.collect.BaseInputTable
 
getParents() - Method in class org.apache.crunch.impl.dist.collect.BaseUnionCollection
 
getParents() - Method in class org.apache.crunch.impl.dist.collect.BaseUnionTable
 
getParents() - Method in class org.apache.crunch.impl.dist.collect.PCollectionImpl
 
getPartition(Object) - Method in class org.apache.crunch.impl.spark.SparkPartitioner
 
getPartition(Object, Object, int) - Method in class org.apache.crunch.lib.join.JoinUtils.AvroIndexedRecordPartitioner
 
getPartition(TupleWritable, Writable, int) - Method in class org.apache.crunch.lib.join.JoinUtils.TupleWritablePartitioner
 
getPartition(K, V, int) - Method in class org.apache.crunch.lib.sort.TotalOrderPartitioner
 
getPartitionerClass() - Method in class org.apache.crunch.GroupingOptions
 
getPartitionerClass(PTypeFamily) - Static method in class org.apache.crunch.lib.join.JoinUtils
 
getPartitionFile(Configuration) - Static method in class org.apache.crunch.lib.sort.TotalOrderPartitioner
 
getPath() - Method in class org.apache.crunch.io.impl.FileSourceImpl
Deprecated. 
getPath() - Method in class org.apache.crunch.io.impl.FileTargetImpl
 
getPath() - Method in class org.apache.crunch.io.impl.SourcePathTargetImpl
 
getPath() - Method in interface org.apache.crunch.io.PathTarget
 
getPath() - Method in class org.apache.crunch.io.text.TextFileTarget
 
getPath() - Method in class org.apache.crunch.materialize.MaterializableIterable
 
getPath(String) - Method in class org.apache.crunch.test.TemporaryPath
Get a Path below the temporary directory.
getPaths() - Method in class org.apache.crunch.io.impl.FileSourceImpl
 
getPathSize(Configuration, Path) - Static method in class org.apache.crunch.io.SourceTargetHelper
 
getPathSize(FileSystem, Path) - Static method in class org.apache.crunch.io.SourceTargetHelper
 
getPathToCacheFile(Path, Configuration) - Static method in class org.apache.crunch.util.DistCache
 
getPipeline() - Method in class org.apache.crunch.impl.dist.collect.PCollectionImpl
 
getPipeline() - Method in class org.apache.crunch.impl.mem.collect.MemCollection
 
getPipeline() - Method in interface org.apache.crunch.PCollection
Returns the Pipeline associated with this PCollection.
getPipeline() - Method in class org.apache.crunch.util.CrunchTool
 
getPlanDotFile() - Method in class org.apache.crunch.impl.mr.exec.MRExecutor
 
getPlanDotFile() - Method in class org.apache.crunch.impl.spark.SparkRuntime
 
getPlanDotFile() - Method in interface org.apache.crunch.PipelineExecution
Returns the .dot file that allows a client to graph the Crunch execution plan for this pipeline.
getPrimitiveType(Class<T>) - Static method in class org.apache.crunch.types.avro.Avros
 
getPrimitiveType(Class<T>) - Static method in class org.apache.crunch.types.writable.Writables
 
getProjectedSchema() - Method in class org.apache.crunch.io.parquet.AvroParquetFileSource
 
getPTableType() - Method in class org.apache.crunch.impl.dist.collect.BaseDoTable
 
getPTableType() - Method in class org.apache.crunch.impl.dist.collect.BaseInputTable
 
getPTableType() - Method in class org.apache.crunch.impl.dist.collect.BaseUnionTable
 
getPTableType() - Method in class org.apache.crunch.impl.mem.collect.MemTable
 
getPTableType() - Method in interface org.apache.crunch.PTable
Returns the PTableType of this PTable.
getPType(PTypeFamily) - Method in interface org.apache.crunch.contrib.text.Extractor
Returns the PType associated with this data type for the given PTypeFamily.
getPType() - Method in class org.apache.crunch.impl.dist.collect.BaseDoCollection
 
getPType() - Method in class org.apache.crunch.impl.dist.collect.BaseDoTable
 
getPType() - Method in class org.apache.crunch.impl.dist.collect.BaseGroupedTable
 
getPType() - Method in class org.apache.crunch.impl.dist.collect.BaseInputCollection
 
getPType() - Method in class org.apache.crunch.impl.dist.collect.BaseInputTable
 
getPType() - Method in class org.apache.crunch.impl.dist.collect.BaseUnionCollection
 
getPType() - Method in class org.apache.crunch.impl.dist.collect.BaseUnionTable
 
getPType() - Method in class org.apache.crunch.impl.mem.collect.MemCollection
 
getPType() - Method in class org.apache.crunch.impl.mr.plan.DoNode
 
getPType() - Method in interface org.apache.crunch.PCollection
Returns the PType of this PCollection.
getReadableDataInternal() - Method in class org.apache.crunch.impl.dist.collect.BaseDoCollection
 
getReadableDataInternal() - Method in class org.apache.crunch.impl.dist.collect.BaseDoTable
 
getReadableDataInternal() - Method in class org.apache.crunch.impl.dist.collect.BaseGroupedTable
 
getReadableDataInternal() - Method in class org.apache.crunch.impl.dist.collect.BaseInputCollection
 
getReadableDataInternal() - Method in class org.apache.crunch.impl.dist.collect.BaseInputTable
 
getReadableDataInternal() - Method in class org.apache.crunch.impl.dist.collect.BaseUnionCollection
 
getReadableDataInternal() - Method in class org.apache.crunch.impl.dist.collect.BaseUnionTable
 
getReadableDataInternal() - Method in class org.apache.crunch.impl.dist.collect.PCollectionImpl
 
getReadableDataInternal() - Method in class org.apache.crunch.impl.mr.collect.InputCollection
 
getReader(Schema) - Method in enum org.apache.crunch.types.avro.AvroMode
 
getReader(Schema) - Method in interface org.apache.crunch.types.avro.ReaderWriterFactory
 
getReader(Schema) - Method in class org.apache.crunch.types.avro.ReflectDataFactory
 
getRecommendedPartitions(PCollection<T>) - Static method in class org.apache.crunch.util.PartitionUtils
 
getRecommendedPartitions(PCollection<T>, Configuration) - Static method in class org.apache.crunch.util.PartitionUtils
 
getRecordWriter(TaskAttemptContext) - Method in class org.apache.crunch.io.avro.trevni.TrevniOutputFormat
 
getRecordWriter(TaskAttemptContext) - Method in class org.apache.crunch.io.hbase.HFileOutputFormatForCrunch
 
getRecordWriter(TaskAttemptContext) - Method in class org.apache.crunch.types.avro.AvroOutputFormat
 
getRecordWriter(TaskAttemptContext) - Method in class org.apache.crunch.types.avro.AvroPathPerKeyOutputFormat
 
getRecordWriter(TaskAttemptContext) - Method in class org.apache.crunch.types.avro.AvroTextOutputFormat
 
getReduceOutputName(Configuration, Path, int) - Method in interface org.apache.crunch.io.FileNamingScheme
Get the output file name for a reduce task.
getReduceOutputName(Configuration, Path, int) - Method in class org.apache.crunch.io.SequentialFileNamingScheme
 
getReflectData() - Method in class org.apache.crunch.types.avro.ReflectDataFactory
 
getReflectDataFactory(Configuration) - Static method in class org.apache.crunch.types.avro.Avros
Deprecated. as of 0.9.0; use AvroMode.fromConfiguration(conf)
getResult() - Method in class org.apache.crunch.impl.mr.exec.MRExecutor
 
getResult() - Method in class org.apache.crunch.impl.spark.SparkRuntime
 
getResult() - Method in interface org.apache.crunch.PipelineExecution
Retrieve the result of a pipeline if it has been completed, otherwise null.
getRootFile() - Method in class org.apache.crunch.test.TemporaryPath
Get the root directory which will be deleted automatically.
getRootFileName() - Method in class org.apache.crunch.test.TemporaryPath
Get the root directory as an absolute file name.
getRootPath() - Method in class org.apache.crunch.test.TemporaryPath
Get the root directory as a Path.
getRuntimeContext() - Method in class org.apache.crunch.impl.spark.SparkRuntime
 
getSchema() - Method in class org.apache.crunch.types.avro.AvroType
 
getSecond() - Method in class org.apache.crunch.fn.CompositeMapFn
 
getSerializationClass() - Method in class org.apache.crunch.types.writable.WritableType
 
getSize(Configuration) - Method in class org.apache.crunch.contrib.io.jdbc.DataBaseSource
 
getSize() - Method in class org.apache.crunch.impl.dist.collect.PCollectionImpl
 
getSize() - Method in class org.apache.crunch.impl.mem.collect.MemCollection
 
getSize(Configuration) - Method in class org.apache.crunch.io.hbase.HBaseSourceTarget
 
getSize(Configuration) - Method in class org.apache.crunch.io.hbase.HFileSource
 
getSize(Configuration) - Method in class org.apache.crunch.io.impl.FileSourceImpl
 
getSize() - Method in interface org.apache.crunch.PCollection
Returns the size of the data represented by this PCollection in bytes.
getSize(Configuration) - Method in interface org.apache.crunch.Source
Returns the number of bytes in this Source.
getSizeInternal() - Method in class org.apache.crunch.impl.dist.collect.BaseDoCollection
 
getSizeInternal() - Method in class org.apache.crunch.impl.dist.collect.BaseDoTable
 
getSizeInternal() - Method in class org.apache.crunch.impl.dist.collect.BaseGroupedTable
 
getSizeInternal() - Method in class org.apache.crunch.impl.dist.collect.BaseInputCollection
 
getSizeInternal() - Method in class org.apache.crunch.impl.dist.collect.BaseInputTable
 
getSizeInternal() - Method in class org.apache.crunch.impl.dist.collect.BaseUnionCollection
 
getSizeInternal() - Method in class org.apache.crunch.impl.dist.collect.BaseUnionTable
 
getSizeInternal() - Method in class org.apache.crunch.impl.dist.collect.PCollectionImpl
 
getSortComparatorClass() - Method in class org.apache.crunch.GroupingOptions
 
getSource() - Method in class org.apache.crunch.impl.dist.collect.BaseInputCollection
 
getSource() - Method in class org.apache.crunch.impl.dist.collect.BaseInputTable
 
getSource() - Method in class org.apache.crunch.impl.mr.plan.DoNode
 
getSource() - Method in class org.apache.crunch.materialize.MaterializableIterable
 
getSourcePattern(Path, int) - Method in class org.apache.crunch.io.avro.trevni.TrevniKeyTarget
 
getSourcePattern(Path, int) - Method in class org.apache.crunch.io.impl.FileTargetImpl
 
getSourceTargets() - Method in class org.apache.crunch.GroupingOptions
 
getSourceTargets() - Method in class org.apache.crunch.io.hbase.HBaseData
 
getSourceTargets() - Method in class org.apache.crunch.io.impl.ReadableDataImpl
 
getSourceTargets() - Method in class org.apache.crunch.ParallelDoOptions
 
getSourceTargets() - Method in interface org.apache.crunch.ReadableData
 
getSourceTargets() - Method in class org.apache.crunch.util.DelegatingReadableData
 
getSourceTargets() - Method in class org.apache.crunch.util.UnionReadableData
 
getSparkContext() - Method in class org.apache.crunch.impl.spark.SparkRuntime
 
getSplits(JobContext) - Method in class org.apache.crunch.impl.mr.run.CrunchInputFormat
 
getStageId() - Method in class org.apache.crunch.PipelineResult.StageResult
 
getStageName() - Method in class org.apache.crunch.PipelineResult.StageResult
 
getStageResults() - Method in class org.apache.crunch.PipelineResult
 
getStats() - Method in class org.apache.crunch.contrib.text.AbstractCompositeExtractor
 
getStats() - Method in class org.apache.crunch.contrib.text.AbstractSimpleExtractor
 
getStats() - Method in interface org.apache.crunch.contrib.text.Extractor
Return statistics about how many errors this Extractor instance encountered while parsing input data.
getStatus() - Method in class org.apache.crunch.DoFn
 
getStatus() - Method in class org.apache.crunch.impl.mr.exec.MRExecutor
 
getStatus() - Method in class org.apache.crunch.impl.spark.SparkRuntime
 
getStatus() - Method in interface org.apache.crunch.PipelineExecution
 
getStorageLevel(PCollection<?>) - Method in class org.apache.crunch.impl.spark.SparkRuntime
 
getSubTypes() - Method in class org.apache.crunch.types.avro.AvroType
 
getSubTypes() - Method in class org.apache.crunch.types.PGroupedTableType
 
getSubTypes() - Method in interface org.apache.crunch.types.PType
Returns the sub-types that make up this PType if it is a composite instance, such as a tuple.
getSubTypes() - Method in class org.apache.crunch.types.writable.WritableType
 
getSuccessIndicator() - Method in class org.apache.crunch.io.impl.FileTargetImpl
 
getTableType() - Method in class org.apache.crunch.io.hbase.HBaseSourceTarget
 
getTableType() - Method in class org.apache.crunch.io.impl.FileTableSourceImpl
 
getTableType() - Method in class org.apache.crunch.io.impl.TableSourcePathTargetImpl
 
getTableType() - Method in class org.apache.crunch.io.impl.TableSourceTargetImpl
 
getTableType() - Method in class org.apache.crunch.io.seq.SeqFileTableSourceTarget
 
getTableType() - Method in class org.apache.crunch.io.text.TextFileTableSourceTarget
 
getTableType() - Method in interface org.apache.crunch.TableSource
 
getTableType() - Method in class org.apache.crunch.types.PGroupedTableType
 
getTargetDependencies() - Method in class org.apache.crunch.impl.dist.collect.BaseGroupedTable
 
getTargetDependencies() - Method in class org.apache.crunch.impl.dist.collect.PCollectionImpl
 
getTaskAttemptID() - Method in class org.apache.crunch.DoFn
 
getTaskIOContext(Broadcast<Configuration>, Accumulator<Map<String, Long>>) - Static method in class org.apache.crunch.impl.spark.SparkRuntimeContext
 
getTestContext(Configuration) - Static method in class org.apache.crunch.test.CrunchTestSupport
Creates a TaskInputOutputContext that can be used in unit tests.
getTupleFactory(Class<T>) - Static method in class org.apache.crunch.types.TupleFactory
Get the TupleFactory for a given Tuple implementation.
getType() - Method in class org.apache.crunch.io.hbase.HBaseSourceTarget
 
getType() - Method in class org.apache.crunch.io.impl.FileSourceImpl
 
getType() - Method in interface org.apache.crunch.Source
Returns the PType for this source.
getTypeClass() - Method in class org.apache.crunch.types.avro.AvroType
 
getTypeClass() - Method in interface org.apache.crunch.types.PType
Returns the Java type represented by this PType.
getTypeClass() - Method in class org.apache.crunch.types.writable.WritableType
 
getTypeFamily() - Method in class org.apache.crunch.impl.dist.collect.PCollectionImpl
 
getTypeFamily() - Method in class org.apache.crunch.impl.mem.collect.MemCollection
 
getTypeFamily() - Method in interface org.apache.crunch.PCollection
Returns the PTypeFamily of this PCollection.
getValue() - Method in class org.apache.crunch.materialize.pobject.PObjectImpl
Gets the value associated with this PObject.
getValue() - Method in interface org.apache.crunch.PObject
Gets the value associated with this PObject.
getValueClass() - Method in class org.apache.crunch.io.hbase.HBaseValueConverter
 
getValueClass() - Method in interface org.apache.crunch.types.Converter
 
getValueType() - Method in class org.apache.crunch.impl.dist.collect.PTableBase
 
getValueType() - Method in class org.apache.crunch.impl.mem.collect.MemTable
 
getValueType() - Method in interface org.apache.crunch.PTable
Returns the PType of the value.
getValueType() - Method in interface org.apache.crunch.types.PTableType
Returns the value type for the table.
getWriter(Schema) - Method in enum org.apache.crunch.types.avro.AvroMode
 
getWriter(Schema) - Method in interface org.apache.crunch.types.avro.ReaderWriterFactory
 
getWriter(Schema) - Method in class org.apache.crunch.types.avro.ReflectDataFactory
 
groupByKey() - Method in class org.apache.crunch.impl.dist.collect.PTableBase
 
groupByKey(int) - Method in class org.apache.crunch.impl.dist.collect.PTableBase
 
groupByKey(GroupingOptions) - Method in class org.apache.crunch.impl.dist.collect.PTableBase
 
groupByKey() - Method in class org.apache.crunch.impl.mem.collect.MemTable
 
groupByKey(int) - Method in class org.apache.crunch.impl.mem.collect.MemTable
 
groupByKey(GroupingOptions) - Method in class org.apache.crunch.impl.mem.collect.MemTable
 
groupByKey() - Method in interface org.apache.crunch.PTable
Performs a grouping operation on the keys of this table.
groupByKey(int) - Method in interface org.apache.crunch.PTable
Performs a grouping operation on the keys of this table, using the given number of partitions.
groupByKey(GroupingOptions) - Method in interface org.apache.crunch.PTable
Performs a grouping operation on the keys of this table, using the additional GroupingOptions to control how the grouping is executed.
groupedWeightedReservoirSample(PTable<Integer, Pair<T, N>>, int[]) - Static method in class org.apache.crunch.lib.Sample
The most general-purpose of the weighted reservoir sampling patterns; it chooses a random sample of elements for each of N input groups.
groupedWeightedReservoirSample(PTable<Integer, Pair<T, N>>, int[], Long) - Static method in class org.apache.crunch.lib.Sample
Same as the other groupedWeightedReservoirSample method, but includes a seed for testing purposes.
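As a rough illustration of what per-group weighted reservoir sampling involves, the standalone sketch below keeps, for each group, the k items with the largest random keys log(u)/w, which is one common formulation of weighted sampling without replacement. It does not use the Crunch API and is not claimed to be how Sample implements these methods; the class GroupedReservoirSketch, its nested Item and Keyed types, and the sample method are hypothetical names chosen for this example.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.PriorityQueue;
import java.util.Random;

// Hypothetical, library-free sketch; not the Crunch implementation.
final class GroupedReservoirSketch {

    /** One input record: group index, value, and a positive weight. */
    static final class Item<T> {
        final int group;
        final T value;
        final double weight;
        Item(int group, T value, double weight) {
            this.group = group;
            this.value = value;
            this.weight = weight;
        }
    }

    /** A value paired with the random key used to rank it. */
    static final class Keyed<T> {
        final double key;
        final T value;
        Keyed(double key, T value) {
            this.key = key;
            this.value = value;
        }
    }

    /**
     * Samples up to sampleSizes[g] values for each group g in a single pass.
     * Every group index in the input must be a valid index into sampleSizes.
     */
    static <T> Map<Integer, List<T>> sample(List<Item<T>> input, int[] sampleSizes, long seed) {
        Random rnd = new Random(seed);
        Map<Integer, PriorityQueue<Keyed<T>>> heaps = new HashMap<>();
        for (Item<T> item : input) {
            int k = sampleSizes[item.group];
            if (k <= 0 || item.weight <= 0) {
                continue; // nothing to keep for this group, or unusable weight
            }
            // log(u)/w orders items the same way as u^(1/w); larger weights
            // push the key toward zero, so heavier items are kept more often.
            double key = Math.log(rnd.nextDouble()) / item.weight;
            PriorityQueue<Keyed<T>> heap = heaps.computeIfAbsent(item.group,
                g -> new PriorityQueue<>(Comparator.comparingDouble((Keyed<T> e) -> e.key)));
            if (heap.size() < k) {
                heap.add(new Keyed<>(key, item.value));
            } else if (heap.peek().key < key) {
                heap.poll(); // evict the smallest key currently held
                heap.add(new Keyed<>(key, item.value));
            }
        }
        Map<Integer, List<T>> out = new HashMap<>();
        for (Map.Entry<Integer, PriorityQueue<Keyed<T>>> entry : heaps.entrySet()) {
            List<T> values = new ArrayList<>();
            for (Keyed<T> keyed : entry.getValue()) {
                values.add(keyed.value);
            }
            out.put(entry.getKey(), values);
        }
        return out;
    }

    public static void main(String[] args) {
        List<Item<String>> input = new ArrayList<>();
        input.add(new Item<>(0, "a", 1.0));
        input.add(new Item<>(0, "b", 5.0));
        input.add(new Item<>(0, "c", 1.0));
        input.add(new Item<>(1, "x", 2.0));
        // Keep two items from group 0 and one from group 1.
        System.out.println(sample(input, new int[] {2, 1}, 42L));
    }
}
```

Passing an explicit seed, as the second overload above does, makes a run repeatable, which is what a test needs.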
groupingComparatorClass(Class<? extends RawComparator>) - Method in class org.apache.crunch.GroupingOptions.Builder
 
GroupingOptions - Class in org.apache.crunch
Options that can be passed to a groupByKey operation in order to exercise finer control over how the partitioning, grouping, and sorting of keys is performed.
groupingOptions - Variable in class org.apache.crunch.impl.dist.collect.BaseGroupedTable
 
GroupingOptions.Builder - Class in org.apache.crunch
Builder class for creating GroupingOptions instances.
GroupingOptions.Builder() - Constructor for class org.apache.crunch.GroupingOptions.Builder
 
GuavaUtils - Class in org.apache.crunch.impl.spark
 
GuavaUtils() - Constructor for class org.apache.crunch.impl.spark.GuavaUtils
 

H

handleExisting(Target.WriteMode, long, Configuration) - Method in class org.apache.crunch.io.hbase.HBaseTarget
 
handleExisting(Target.WriteMode, long, Configuration) - Method in class org.apache.crunch.io.impl.FileTargetImpl
 
handleExisting(Target.WriteMode, long, Configuration) - Method in interface org.apache.crunch.Target
Apply the given WriteMode to this Target instance.
handleOutputs(Configuration, Path, int) - Method in class org.apache.crunch.io.avro.AvroPathPerKeyTarget
 
handleOutputs(Configuration, Path, int) - Method in class org.apache.crunch.io.impl.FileTargetImpl
 
handleOutputs(Configuration, Path, int) - Method in class org.apache.crunch.io.impl.SourcePathTargetImpl
 
handleOutputs(Configuration, Path, int) - Method in interface org.apache.crunch.io.PathTarget
Handles moving the output data for this target from a temporary location on the filesystem to its target path at the end of a MapReduce job.
has(int) - Method in class org.apache.crunch.types.writable.TupleWritable
Returns true if the tuple has an element at the position provided.
hasCombineFn() - Method in class org.apache.crunch.impl.mr.collect.DoTable
 
hashCode() - Method in class org.apache.crunch.impl.dist.collect.BaseInputCollection
 
hashCode() - Method in class org.apache.crunch.impl.dist.collect.BaseInputTable
 
hashCode() - Method in class org.apache.crunch.impl.mr.plan.DoNode
 
hashCode() - Method in class org.apache.crunch.impl.spark.ByteArray
 
hashCode() - Method in class org.apache.crunch.impl.spark.IntByteArray
 
hashCode() - Method in class org.apache.crunch.io.FormatBundle
 
hashCode() - Method in class org.apache.crunch.io.hbase.HBaseSourceTarget
 
hashCode() - Method in class org.apache.crunch.io.hbase.HBaseTarget
 
hashCode() - Method in class org.apache.crunch.io.impl.FileSourceImpl
 
hashCode() - Method in class org.apache.crunch.io.impl.FileTargetImpl
 
hashCode() - Method in class org.apache.crunch.Pair
 
hashCode() - Method in class org.apache.crunch.Tuple3
 
hashCode() - Method in class org.apache.crunch.Tuple4
 
hashCode() - Method in class org.apache.crunch.TupleN
 
hashCode() - Method in class org.apache.crunch.types.avro.AvroType
 
hashCode() - Method in class org.apache.crunch.types.writable.TupleWritable
 
hashCode() - Method in class org.apache.crunch.types.writable.WritableType
 
hasNext() - Method in class org.apache.crunch.contrib.text.Tokenizer
Returns true if the underlying Scanner has any tokens remaining.
hasNext() - Method in class org.apache.crunch.io.impl.AutoClosingIterator
 
hasNext() - Method in class org.apache.crunch.types.PGroupedTableType.HoldLastIterator
 
hasNext() - Method in class org.apache.crunch.util.DoFnIterator
 
hasReflect() - Method in class org.apache.crunch.types.avro.AvroType
Determine if the wrapped type is a reflection-based Avro type or wraps one.
hasSpecific() - Method in class org.apache.crunch.types.avro.AvroType
Determine if the wrapped type is a specific data Avro type or wraps one.
HBaseData - Class in org.apache.crunch.io.hbase
 
HBaseData(String, String, SourceTarget<?>) - Constructor for class org.apache.crunch.io.hbase.HBaseData
 
HBaseSourceTarget - Class in org.apache.crunch.io.hbase
 
HBaseSourceTarget(String, Scan) - Constructor for class org.apache.crunch.io.hbase.HBaseSourceTarget
 
HBaseTarget - Class in org.apache.crunch.io.hbase
 
HBaseTarget(String) - Constructor for class org.apache.crunch.io.hbase.HBaseTarget
 
HBaseTypes - Class in org.apache.crunch.io.hbase
 
HBaseValueConverter<V> - Class in org.apache.crunch.io.hbase
 
HBaseValueConverter(Class<V>) - Constructor for class org.apache.crunch.io.hbase.HBaseValueConverter
 
HCOLUMN_DESCRIPTOR_KEY - Static variable in class org.apache.crunch.io.hbase.HFileOutputFormatForCrunch
 
hfile(String) - Static method in class org.apache.crunch.io.hbase.FromHBase
 
hfile(Path) - Static method in class org.apache.crunch.io.hbase.FromHBase
 
hfile(String) - Static method in class org.apache.crunch.io.hbase.ToHBase
 
hfile(Path) - Static method in class org.apache.crunch.io.hbase.ToHBase
 
HFILE_SCANNER_CACHE_BLOCKS - Static variable in class org.apache.crunch.io.hbase.HFileReaderFactory
 
HFILE_SCANNER_PREAD - Static variable in class org.apache.crunch.io.hbase.HFileReaderFactory
 
HFileInputFormat - Class in org.apache.crunch.io.hbase
Simple input format for HFiles.
HFileInputFormat() - Constructor for class org.apache.crunch.io.hbase.HFileInputFormat
 
HFileOutputFormatForCrunch - Class in org.apache.crunch.io.hbase
This is a thin wrapper around HFile.Writer.
HFileOutputFormatForCrunch() - Constructor for class org.apache.crunch.io.hbase.HFileOutputFormatForCrunch
 
HFileReadableData - Class in org.apache.crunch.io.hbase
 
HFileReadableData(List<Path>) - Constructor for class org.apache.crunch.io.hbase.HFileReadableData
 
HFileReaderFactory - Class in org.apache.crunch.io.hbase
 
HFileReaderFactory() - Constructor for class org.apache.crunch.io.hbase.HFileReaderFactory
 
HFileSource - Class in org.apache.crunch.io.hbase
 
HFileSource(Path) - Constructor for class org.apache.crunch.io.hbase.HFileSource
 
HFileSource(List<Path>) - Constructor for class org.apache.crunch.io.hbase.HFileSource
 
HFileTarget - Class in org.apache.crunch.io.hbase
 
HFileTarget(String) - Constructor for class org.apache.crunch.io.hbase.HFileTarget
 
HFileTarget(Path) - Constructor for class org.apache.crunch.io.hbase.HFileTarget
 
HFileTarget(Path, HColumnDescriptor) - Constructor for class org.apache.crunch.io.hbase.HFileTarget
 
HFileUtils - Class in org.apache.crunch.io.hbase
 
HFileUtils() - Constructor for class org.apache.crunch.io.hbase.HFileUtils
 
HFileUtils.KeyValueComparator - Class in org.apache.crunch.io.hbase
 
HFileUtils.KeyValueComparator() - Constructor for class org.apache.crunch.io.hbase.HFileUtils.KeyValueComparator
 

I

id - Variable in class org.apache.crunch.contrib.io.jdbc.IdentifiableName
 
IdentifiableName - Class in org.apache.crunch.contrib.io.jdbc
 
IdentifiableName() - Constructor for class org.apache.crunch.contrib.io.jdbc.IdentifiableName
 
IdentityFn<T> - Class in org.apache.crunch.fn
 
includeField(String) - Method in class org.apache.crunch.io.parquet.AvroParquetFileSource.Builder
 
increment(String, String) - Method in class org.apache.crunch.DoFn
 
increment(String, String, long) - Method in class org.apache.crunch.DoFn
 
increment(Enum<?>) - Method in class org.apache.crunch.DoFn
 
increment(Enum<?>, long) - Method in class org.apache.crunch.DoFn
 
initialize(Configuration) - Method in interface org.apache.crunch.Aggregator
Perform any setup of this instance that is required prior to processing inputs.
initialize() - Method in class org.apache.crunch.contrib.bloomfilter.BloomFilterFn
 
initialize() - Method in class org.apache.crunch.contrib.text.AbstractCompositeExtractor
 
initialize() - Method in class org.apache.crunch.contrib.text.AbstractSimpleExtractor
 
initialize() - Method in interface org.apache.crunch.contrib.text.Extractor
Perform any initialization required by this Extractor during the start of a map or reduce task.
initialize() - Method in class org.apache.crunch.DoFn
Initialize this DoFn.
initialize(Configuration) - Method in class org.apache.crunch.fn.Aggregators.SimpleAggregator
 
initialize() - Method in class org.apache.crunch.fn.CompositeMapFn
 
initialize() - Method in class org.apache.crunch.fn.ExtractKeyFn
 
initialize() - Method in class org.apache.crunch.fn.PairMapFn
 
initialize(CrunchTaskContext) - Method in class org.apache.crunch.impl.mr.run.RTNode
 
initialize(DoFn<?, ?>) - Method in class org.apache.crunch.impl.spark.SparkRuntimeContext
 
initialize() - Method in class org.apache.crunch.lib.Aggregate.TopKFn
 
initialize() - Method in class org.apache.crunch.lib.join.FullOuterJoinFn
Initialize this DoFn.
initialize() - Method in class org.apache.crunch.lib.join.InnerJoinFn
 
initialize() - Method in class org.apache.crunch.lib.join.JoinFn
 
initialize() - Method in class org.apache.crunch.lib.join.LeftOuterJoinFn
Initialize this DoFn.
initialize() - Method in class org.apache.crunch.lib.join.RightOuterJoinFn
Initialize this DoFn.
initialize() - Method in class org.apache.crunch.lib.sort.SortFns.AvroGenericFn
 
initialize(Configuration) - Method in class org.apache.crunch.types.avro.AvroType
 
initialize(Configuration) - Method in class org.apache.crunch.types.CollectionDeepCopier
 
initialize(Configuration) - Method in interface org.apache.crunch.types.DeepCopier
Initialize the deep copier with a job-specific configuration.
initialize(Configuration) - Method in class org.apache.crunch.types.DeepCopier.NoOpDeepCopier
 
initialize(Configuration) - Method in class org.apache.crunch.types.MapDeepCopier
 
initialize() - Method in class org.apache.crunch.types.PGroupedTableType.PairIterableMapFn
 
initialize(Configuration) - Method in interface org.apache.crunch.types.PType
Initialize this PType for use within a DoFn.
initialize(Configuration) - Method in class org.apache.crunch.types.TupleDeepCopier
 
initialize() - Method in class org.apache.crunch.types.TupleFactory
 
initialize(Configuration) - Method in class org.apache.crunch.types.writable.WritableDeepCopier
 
initialize(Configuration) - Method in class org.apache.crunch.types.writable.WritableType
 
initSchema(TaskAttemptContext) - Method in class org.apache.crunch.io.avro.trevni.TrevniRecordWriter
InMemoryEmitter<T> - Class in org.apache.crunch.impl.mem.emit
An Emitter instance that writes emitted records to a backing List.
InMemoryEmitter() - Constructor for class org.apache.crunch.impl.mem.emit.InMemoryEmitter
 
InMemoryEmitter(List<T>) - Constructor for class org.apache.crunch.impl.mem.emit.InMemoryEmitter
 
innerJoin(PTable<K, U>, PTable<K, V>) - Static method in class org.apache.crunch.lib.Join
Performs an inner join on the specified PTables.
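For readers unfamiliar with the semantics, an inner join keeps only the keys that appear in both inputs and pairs every left value with every right value for that key. The sketch below shows this on plain in-memory lists using only java.util; it does not use PTable or any other Crunch type, and InnerJoinSketch and its innerJoin method are hypothetical names for illustration only.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical, library-free sketch of inner-join semantics.
final class InnerJoinSketch {

    /** Joins two lists of key-value pairs on their keys. */
    static <K, U, V> List<Map.Entry<K, Map.Entry<U, V>>> innerJoin(
            List<Map.Entry<K, U>> left, List<Map.Entry<K, V>> right) {
        // Index the right side by key so each left record is matched in O(1).
        Map<K, List<V>> rightByKey = new HashMap<>();
        for (Map.Entry<K, V> r : right) {
            rightByKey.computeIfAbsent(r.getKey(), k -> new ArrayList<>()).add(r.getValue());
        }
        List<Map.Entry<K, Map.Entry<U, V>>> out = new ArrayList<>();
        for (Map.Entry<K, U> l : left) {
            List<V> matches = rightByKey.get(l.getKey());
            if (matches == null) {
                continue; // key missing on the right: dropped by an inner join
            }
            for (V v : matches) {
                Map.Entry<U, V> pair = new SimpleEntry<>(l.getValue(), v);
                out.add(new SimpleEntry<>(l.getKey(), pair));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<Map.Entry<String, Integer>> left = new ArrayList<>();
        left.add(new SimpleEntry<>("a", 1));
        left.add(new SimpleEntry<>("b", 2));
        List<Map.Entry<String, String>> right = new ArrayList<>();
        right.add(new SimpleEntry<>("a", "x"));
        right.add(new SimpleEntry<>("c", "z"));
        // Only key "a" is present on both sides.
        System.out.println(innerJoin(left, right));
    }
}
```

This version holds the whole right side in memory, which is fine for an illustration; a distributed join would instead group both inputs by key before pairing them.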
InnerJoinFn<K,U,V> - Class in org.apache.crunch.lib.join
Used to perform the last step of an inner join.
InnerJoinFn(PType<K>, PType<U>) - Constructor for class org.apache.crunch.lib.join.InnerJoinFn
 
inputBundle - Variable in class org.apache.crunch.io.impl.FileSourceImpl
 
InputCollection<S> - Class in org.apache.crunch.impl.mr.collect
 
InputCollection(Source<S>, MRPipeline) - Constructor for class org.apache.crunch.impl.mr.collect.InputCollection
 
InputCollection<S> - Class in org.apache.crunch.impl.spark.collect
 
inputConf(String, String) - Method in class org.apache.crunch.io.hbase.HBaseSourceTarget
 
inputConf(String, String) - Method in class org.apache.crunch.io.impl.FileSourceImpl
 
inputConf(String, String) - Method in interface org.apache.crunch.Source
Adds the given key-value pair to the Configuration instance that is used to read this Source<T>.
InputConverterFunction<K,V,S> - Class in org.apache.crunch.impl.spark.fn
 
InputConverterFunction(Converter<K, V, S, ?>) - Constructor for class org.apache.crunch.impl.spark.fn.InputConverterFunction
 
InputTable<K,V> - Class in org.apache.crunch.impl.mr.collect
 
InputTable(TableSource<K, V>, MRPipeline) - Constructor for class org.apache.crunch.impl.mr.collect.InputTable
 
InputTable<K,V> - Class in org.apache.crunch.impl.spark.collect
 
InputTable(TableSource<K, V>, DistributedPipeline) - Constructor for class org.apache.crunch.impl.spark.collect.InputTable
 
IntByteArray - Class in org.apache.crunch.impl.spark
 
IntByteArray(int, byte[]) - Constructor for class org.apache.crunch.impl.spark.IntByteArray
 
IntermediateEmitter - Class in org.apache.crunch.impl.mr.emit
An Emitter implementation that links the output of one DoFn to the input of another DoFn.
IntermediateEmitter(PType<Object>, List<RTNode>, Configuration, boolean) - Constructor for class org.apache.crunch.impl.mr.emit.IntermediateEmitter
 
interruptTask() - Method in class org.apache.crunch.impl.mr.exec.MRExecutor
 
intersection(PCollection<T>, PCollection<T>) - Static method in class org.apache.crunch.lib.Set
Compute the intersection of two sets of elements.
ints() - Static method in class org.apache.crunch.types.avro.Avros
 
ints() - Method in class org.apache.crunch.types.avro.AvroTypeFamily
 
ints() - Method in interface org.apache.crunch.types.PTypeFamily
 
ints() - Static method in class org.apache.crunch.types.writable.Writables
 
ints() - Method in class org.apache.crunch.types.writable.WritableTypeFamily
 
isBreakpoint() - Method in class org.apache.crunch.impl.dist.collect.PCollectionImpl
 
isCompatible(FileSystem, Path) - Static method in class org.apache.crunch.io.impl.FileTargetImpl
 
isCompatibleWith(GroupingOptions) - Method in class org.apache.crunch.GroupingOptions
 
isGeneric() - Method in class org.apache.crunch.types.avro.AvroType
Determine if the wrapped type is a generic data avro type.
isLeafNode() - Method in class org.apache.crunch.impl.mr.run.RTNode
 
isMapOnlyJob() - Method in class org.apache.crunch.impl.mr.plan.MSCROutputHandler
 
isOutputNode() - Method in class org.apache.crunch.impl.mr.plan.DoNode
 
isSourceTarget() - Method in class org.apache.crunch.materialize.MaterializableIterable
 
isSplitable(JobContext, Path) - Method in class org.apache.crunch.io.hbase.HFileInputFormat
 
isSplitable(FileSystem, Path) - Method in class org.apache.crunch.types.avro.AvroUtf8InputFormat
 
isValid(JavaRDDLike<?, ?>) - Method in class org.apache.crunch.impl.spark.SparkRuntime
 
iterator() - Method in class org.apache.crunch.impl.SingleUseIterable
 
iterator() - Method in class org.apache.crunch.impl.spark.fn.CrunchIterable
 
iterator() - Method in class org.apache.crunch.io.CompositePathIterable
 
iterator() - Method in class org.apache.crunch.materialize.MaterializableIterable
 
iterator() - Method in class org.apache.crunch.types.PGroupedTableType.PTypeIterable
 
iterator() - Method in class org.apache.crunch.util.Tuples.PairIterable
 
iterator() - Method in class org.apache.crunch.util.Tuples.QuadIterable
 
iterator() - Method in class org.apache.crunch.util.Tuples.TripIterable
 
iterator() - Method in class org.apache.crunch.util.Tuples.TupleNIterable
 

J

JOB_NAME_MAX_STACK_LENGTH - Static variable in class org.apache.crunch.impl.mr.plan.PlanningParameters
 
join(PTable<K, U>) - Method in class org.apache.crunch.impl.dist.collect.PTableBase
 
join(PTable<K, U>) - Method in class org.apache.crunch.impl.mem.collect.MemTable
 
Join - Class in org.apache.crunch.lib
Utilities for joining multiple PTable instances based on a common key.
Join() - Constructor for class org.apache.crunch.lib.Join
 
join(PTable<K, U>, PTable<K, V>, JoinType) - Method in class org.apache.crunch.lib.join.BloomFilterJoinStrategy
 
join(PTable<K, U>, PTable<K, V>, JoinType) - Method in class org.apache.crunch.lib.join.DefaultJoinStrategy
 
join(PTable<K, U>, PTable<K, V>, JoinFn<K, U, V>) - Method in class org.apache.crunch.lib.join.DefaultJoinStrategy
Perform a default join on the given PTable instances using a user-specified JoinFn.
join(K, int, Iterable<Pair<U, V>>, Emitter<Pair<K, Pair<U, V>>>) - Method in class org.apache.crunch.lib.join.FullOuterJoinFn
Performs the actual joining.
join(K, int, Iterable<Pair<U, V>>, Emitter<Pair<K, Pair<U, V>>>) - Method in class org.apache.crunch.lib.join.InnerJoinFn
 
join(PTable<K, U>, PTable<K, V>) - Static method in class org.apache.crunch.lib.Join
Performs an inner join on the specified PTables.
join(K, int, Iterable<Pair<U, V>>, Emitter<Pair<K, Pair<U, V>>>) - Method in class org.apache.crunch.lib.join.JoinFn
Performs the actual joining.
join(PTable<K, U>, PTable<K, V>, JoinType) - Method in interface org.apache.crunch.lib.join.JoinStrategy
Join two tables with the given join type.
join(K, int, Iterable<Pair<U, V>>, Emitter<Pair<K, Pair<U, V>>>) - Method in class org.apache.crunch.lib.join.LeftOuterJoinFn
Performs the actual joining.
join(PTable<K, U>, PTable<K, V>, JoinType) - Method in class org.apache.crunch.lib.join.MapsideJoinStrategy
 
join(K, int, Iterable<Pair<U, V>>, Emitter<Pair<K, Pair<U, V>>>) - Method in class org.apache.crunch.lib.join.RightOuterJoinFn
Performs the actual joining.
join(PTable<K, U>, PTable<K, V>, JoinType) - Method in class org.apache.crunch.lib.join.ShardedJoinStrategy
 
join(PTable<K, U>) - Method in interface org.apache.crunch.PTable
Perform an inner join on this table and the one passed in as an argument on their common keys.
JoinFn<K,U,V> - Class in org.apache.crunch.lib.join
Represents a DoFn for performing joins.
JoinFn(PType<K>, PType<U>) - Constructor for class org.apache.crunch.lib.join.JoinFn
Instantiate with the PType of the value of the left side of the join (used for creating deep copies of values).
JoinStrategy<K,U,V> - Interface in org.apache.crunch.lib.join
Defines a strategy for joining two PTables together on a common key.
JoinType - Enum in org.apache.crunch.lib.join
Specifies how a join should be performed in terms of requiring matching keys on both sides of the join.
JoinUtils - Class in org.apache.crunch.lib.join
Utilities that are useful in joining multiple data sets via MapReduce.
JoinUtils() - Constructor for class org.apache.crunch.lib.join.JoinUtils
 
JoinUtils.AvroIndexedRecordPartitioner - Class in org.apache.crunch.lib.join
 
JoinUtils.AvroIndexedRecordPartitioner() - Constructor for class org.apache.crunch.lib.join.JoinUtils.AvroIndexedRecordPartitioner
 
JoinUtils.AvroPairGroupingComparator<T> - Class in org.apache.crunch.lib.join
 
JoinUtils.AvroPairGroupingComparator() - Constructor for class org.apache.crunch.lib.join.JoinUtils.AvroPairGroupingComparator
 
JoinUtils.TupleWritableComparator - Class in org.apache.crunch.lib.join
 
JoinUtils.TupleWritableComparator() - Constructor for class org.apache.crunch.lib.join.JoinUtils.TupleWritableComparator
 
JoinUtils.TupleWritablePartitioner - Class in org.apache.crunch.lib.join
 
JoinUtils.TupleWritablePartitioner() - Constructor for class org.apache.crunch.lib.join.JoinUtils.TupleWritablePartitioner
 
jsons(Class<T>) - Static method in class org.apache.crunch.types.avro.Avros
 
jsons(Class<T>) - Static method in class org.apache.crunch.types.writable.Writables
 
jsonString(Class<T>, PTypeFamily) - Static method in class org.apache.crunch.types.PTypes
 

K

keep(Integer...) - Method in class org.apache.crunch.contrib.text.TokenizerFactory.Builder
Keep only the specified fields found by the input scanner, counting from zero.
keys() - Method in class org.apache.crunch.impl.dist.collect.PTableBase
 
keys() - Method in class org.apache.crunch.impl.mem.collect.MemTable
 
keys(PTable<K, V>) - Static method in class org.apache.crunch.lib.PTables
Extract the keys from the given PTable<K, V> as a PCollection<K>.
keys() - Method in interface org.apache.crunch.PTable
Returns a PCollection made up of the keys in this PTable.
keyType - Variable in class org.apache.crunch.lib.join.JoinFn
 
keyValues() - Static method in class org.apache.crunch.io.hbase.HBaseTypes
 
keyValueToBytes(KeyValue) - Static method in class org.apache.crunch.io.hbase.HBaseTypes
 
kill() - Method in class org.apache.crunch.impl.mr.exec.MRExecutor
 
kill() - Method in class org.apache.crunch.impl.spark.SparkRuntime
 
kill() - Method in interface org.apache.crunch.PipelineExecution
Kills the pipeline if it is running, no-op otherwise.

L

LAST_N(int) - Static method in class org.apache.crunch.fn.Aggregators
Return the last n values (or fewer if there are fewer values than n).
leftJoin(PTable<K, U>, PTable<K, V>) - Static method in class org.apache.crunch.lib.Join
Performs a left outer join on the specified PTables.
LeftOuterJoinFn<K,U,V> - Class in org.apache.crunch.lib.join
Used to perform the last step of a left outer join.
LeftOuterJoinFn(PType<K>, PType<U>) - Constructor for class org.apache.crunch.lib.join.LeftOuterJoinFn
 
leftValueType - Variable in class org.apache.crunch.lib.join.JoinFn
 
length() - Method in class org.apache.crunch.impl.dist.collect.PCollectionImpl
 
length() - Method in class org.apache.crunch.impl.mem.collect.MemCollection
 
length(PCollection<S>) - Static method in class org.apache.crunch.lib.Aggregate
Returns the number of elements in the provided PCollection.
length() - Method in interface org.apache.crunch.PCollection
Returns the number of elements represented by this PCollection.
lineParser(String, Class<M>) - Static method in class org.apache.crunch.types.Protos
 
listStatus(JobContext) - Method in class org.apache.crunch.io.hbase.HFileInputFormat
 
locale(Locale) - Method in class org.apache.crunch.contrib.text.TokenizerFactory.Builder
Sets the Locale to use with the TokenizerFactory returned by this Builder instance.
LOG_JOB_PROGRESS - Static variable in class org.apache.crunch.impl.mr.run.RuntimeParameters
 
longs() - Static method in class org.apache.crunch.types.avro.Avros
 
longs() - Method in class org.apache.crunch.types.avro.AvroTypeFamily
 
longs() - Method in interface org.apache.crunch.types.PTypeFamily
 
longs() - Static method in class org.apache.crunch.types.writable.Writables
 
longs() - Method in class org.apache.crunch.types.writable.WritableTypeFamily
 

M

main(String[]) - Static method in class org.apache.crunch.examples.AverageBytesByIP
 
main(String[]) - Static method in class org.apache.crunch.examples.SecondarySortExample
 
main(String[]) - Static method in class org.apache.crunch.examples.SortExample
 
main(String[]) - Static method in class org.apache.crunch.examples.TotalBytesByIP
 
main(String[]) - Static method in class org.apache.crunch.examples.WordAggregationHBase
 
main(String[]) - Static method in class org.apache.crunch.examples.WordCount
 
makeTuple(Object...) - Method in class org.apache.crunch.types.TupleFactory
 
map(R) - Method in class org.apache.crunch.fn.CompositeMapFn
 
map(V) - Method in class org.apache.crunch.fn.ExtractKeyFn
 
map(T) - Method in class org.apache.crunch.fn.IdentityFn
 
map(Pair<K, V>) - Method in class org.apache.crunch.fn.PairMapFn
 
map(Object, Object, Mapper<Object, Object, Object, Object>.Context) - Method in class org.apache.crunch.impl.mr.run.CrunchMapper
 
map(PTable<K1, V1>, Class<? extends Mapper<K1, V1, K2, V2>>, Class<K2>, Class<V2>) - Static method in class org.apache.crunch.lib.Mapred
 
map(PTable<K1, V1>, Class<? extends Mapper<K1, V1, K2, V2>>, Class<K2>, Class<V2>) - Static method in class org.apache.crunch.lib.Mapreduce
 
map(V) - Method in class org.apache.crunch.lib.sort.SortFns.AvroGenericFn
 
map(V) - Method in class org.apache.crunch.lib.sort.SortFns.SingleKeyFn
 
map(V) - Method in class org.apache.crunch.lib.sort.SortFns.TupleKeyFn
 
map(S) - Method in class org.apache.crunch.MapFn
Maps the given input into an instance of the output type.
map(Pair<Object, Iterable<Object>>) - Method in class org.apache.crunch.types.PGroupedTableType.PairIterableMapFn
 
MapDeepCopier<T> - Class in org.apache.crunch.types
 
MapDeepCopier(PType<T>) - Constructor for class org.apache.crunch.types.MapDeepCopier
 
MapFn<S,T> - Class in org.apache.crunch
A DoFn for the common case of emitting exactly one value for each input record.
MapFn() - Constructor for class org.apache.crunch.MapFn
 
MapFunction - Class in org.apache.crunch.impl.spark.fn
 
MapFunction(MapFn, SparkRuntimeContext) - Constructor for class org.apache.crunch.impl.spark.fn.MapFunction
 
mapKeys(MapFn<K, K2>, PType<K2>) - Method in class org.apache.crunch.impl.dist.collect.PTableBase
 
mapKeys(String, MapFn<K, K2>, PType<K2>) - Method in class org.apache.crunch.impl.dist.collect.PTableBase
 
mapKeys(MapFn<K, K2>, PType<K2>) - Method in class org.apache.crunch.impl.mem.collect.MemTable
 
mapKeys(String, MapFn<K, K2>, PType<K2>) - Method in class org.apache.crunch.impl.mem.collect.MemTable
 
mapKeys(PTable<K1, V>, MapFn<K1, K2>, PType<K2>) - Static method in class org.apache.crunch.lib.PTables
Maps a PTable<K1, V> to a PTable<K2, V> using the given MapFn<K1, K2> on the keys of the PTable.
mapKeys(String, PTable<K1, V>, MapFn<K1, K2>, PType<K2>) - Static method in class org.apache.crunch.lib.PTables
Maps a PTable<K1, V> to a PTable<K2, V> using the given MapFn<K1, K2> on the keys of the PTable.
mapKeys(MapFn<K, K2>, PType<K2>) - Method in interface org.apache.crunch.PTable
Returns a PTable that has the same values as this instance, but uses the given function to map the keys.
mapKeys(String, MapFn<K, K2>, PType<K2>) - Method in interface org.apache.crunch.PTable
Returns a PTable that has the same values as this instance, but uses the given function to map the keys.
MapOutputFunction<K,V> - Class in org.apache.crunch.impl.spark.fn
 
MapOutputFunction(SerDe, SerDe) - Constructor for class org.apache.crunch.impl.spark.fn.MapOutputFunction
 
MapPObject<K,V> - Class in org.apache.crunch.materialize.pobject
A concrete implementation of PObjectImpl whose value is a Java Map.
MapPObject(PCollection<Pair<K, V>>) - Constructor for class org.apache.crunch.materialize.pobject.MapPObject
Constructs a new instance of this PObject implementation.
Mapred - Class in org.apache.crunch.lib
Static functions for working with legacy Mappers and Reducers that live under the org.apache.hadoop.mapred.* package as part of Crunch pipelines.
Mapred() - Constructor for class org.apache.crunch.lib.Mapred
 
Mapreduce - Class in org.apache.crunch.lib
Static functions for working with legacy Mappers and Reducers that live under the org.apache.hadoop.mapreduce.* package as part of Crunch pipelines.
Mapreduce() - Constructor for class org.apache.crunch.lib.Mapreduce
 
MapReduceTarget - Interface in org.apache.crunch.io
 
maps(PType<T>) - Static method in class org.apache.crunch.types.avro.Avros
 
maps(PType<T>) - Method in class org.apache.crunch.types.avro.AvroTypeFamily
 
maps(PType<T>) - Method in interface org.apache.crunch.types.PTypeFamily
 
maps(PType<T>) - Static method in class org.apache.crunch.types.writable.Writables
 
maps(PType<T>) - Method in class org.apache.crunch.types.writable.WritableTypeFamily
 
MapsideJoinStrategy<K,U,V> - Class in org.apache.crunch.lib.join
Utility for doing map side joins on a common key between two PTables.
MapsideJoinStrategy() - Constructor for class org.apache.crunch.lib.join.MapsideJoinStrategy
Constructs a new instance of the MapsideJoinStrategy, materializing the right-side join table to disk before the join is performed.
MapsideJoinStrategy(boolean) - Constructor for class org.apache.crunch.lib.join.MapsideJoinStrategy
Constructs a new instance of the MapsideJoinStrategy.
mapValues(MapFn<Iterable<V>, U>, PType<U>) - Method in class org.apache.crunch.impl.dist.collect.BaseGroupedTable
 
mapValues(String, MapFn<Iterable<V>, U>, PType<U>) - Method in class org.apache.crunch.impl.dist.collect.BaseGroupedTable
 
mapValues(MapFn<V, U>, PType<U>) - Method in class org.apache.crunch.impl.dist.collect.PTableBase
 
mapValues(String, MapFn<V, U>, PType<U>) - Method in class org.apache.crunch.impl.dist.collect.PTableBase
 
mapValues(MapFn<V, U>, PType<U>) - Method in class org.apache.crunch.impl.mem.collect.MemTable
 
mapValues(String, MapFn<V, U>, PType<U>) - Method in class org.apache.crunch.impl.mem.collect.MemTable
 
mapValues(PTable<K, U>, MapFn<U, V>, PType<V>) - Static method in class org.apache.crunch.lib.PTables
Maps a PTable<K, U> to a PTable<K, V> using the given MapFn<U, V> on the values of the PTable.
mapValues(String, PTable<K, U>, MapFn<U, V>, PType<V>) - Static method in class org.apache.crunch.lib.PTables
Maps a PTable<K, U> to a PTable<K, V> using the given MapFn<U, V> on the values of the PTable.
mapValues(PGroupedTable<K, U>, MapFn<Iterable<U>, V>, PType<V>) - Static method in class org.apache.crunch.lib.PTables
An analogue of the mapValues function for PGroupedTable<K, U> collections.
mapValues(String, PGroupedTable<K, U>, MapFn<Iterable<U>, V>, PType<V>) - Static method in class org.apache.crunch.lib.PTables
An analogue of the mapValues function for PGroupedTable<K, U> collections.
mapValues(MapFn<Iterable<V>, U>, PType<U>) - Method in interface org.apache.crunch.PGroupedTable
Maps the Iterable<V> elements of each record to a new type.
mapValues(String, MapFn<Iterable<V>, U>, PType<U>) - Method in interface org.apache.crunch.PGroupedTable
Maps the Iterable<V> elements of each record to a new type.
mapValues(MapFn<V, U>, PType<U>) - Method in interface org.apache.crunch.PTable
Returns a PTable that has the same keys as this instance, but uses the given function to map the values.
mapValues(String, MapFn<V, U>, PType<U>) - Method in interface org.apache.crunch.PTable
Returns a PTable that has the same keys as this instance, but uses the given function to map the values.
markLogged() - Method in exception org.apache.crunch.CrunchRuntimeException
Indicate that this exception has been written to the debug logs.
MaterializableIterable<E> - Class in org.apache.crunch.materialize
 
MaterializableIterable(Pipeline, ReadableSource<E>) - Constructor for class org.apache.crunch.materialize.MaterializableIterable
 
MaterializableMap<K,V> - Class in org.apache.crunch.materialize
 
MaterializableMap(Iterable<Pair<K, V>>) - Constructor for class org.apache.crunch.materialize.MaterializableMap
 
materialize() - Method in class org.apache.crunch.impl.dist.collect.PCollectionImpl
 
materialize() - Method in class org.apache.crunch.impl.mem.collect.MemCollection
 
materialize(PCollection<T>) - Method in class org.apache.crunch.impl.mem.MemPipeline
 
materialize(PCollection<T>) - Method in class org.apache.crunch.impl.mr.MRPipeline
 
materialize(PCollection<T>) - Method in class org.apache.crunch.impl.spark.SparkPipeline
 
materialize() - Method in class org.apache.crunch.materialize.MaterializableIterable
 
materialize() - Method in interface org.apache.crunch.PCollection
Returns a reference to the data set represented by this PCollection that may be used by the client to read the data locally.
materialize(PCollection<T>) - Method in interface org.apache.crunch.Pipeline
Create the given PCollection and read the data it contains into the returned Collection instance for client use.
materialize(PCollection<T>) - Method in class org.apache.crunch.util.CrunchTool
 
materializeAt(SourceTarget<S>) - Method in class org.apache.crunch.impl.dist.collect.PCollectionImpl
 
materializedAt - Variable in class org.apache.crunch.impl.dist.collect.PCollectionImpl
 
materializedData() - Method in class org.apache.crunch.impl.dist.collect.PCollectionImpl
 
materializeToMap() - Method in class org.apache.crunch.impl.dist.collect.PTableBase
Returns a Map made up of the keys and values in this PTable.
materializeToMap() - Method in class org.apache.crunch.impl.mem.collect.MemTable
 
materializeToMap() - Method in interface org.apache.crunch.PTable
Returns a Map made up of the keys and values in this PTable.
max() - Method in class org.apache.crunch.impl.dist.collect.PCollectionImpl
 
max() - Method in class org.apache.crunch.impl.mem.collect.MemCollection
 
max(PCollection<S>) - Static method in class org.apache.crunch.lib.Aggregate
Returns the largest numerical element from the input collection.
max() - Method in interface org.apache.crunch.PCollection
Returns a PObject of the maximum element of this instance.
MAX_BIGINTS() - Static method in class org.apache.crunch.fn.Aggregators
Return the maximum of all given BigInteger values.
MAX_BIGINTS(int) - Static method in class org.apache.crunch.fn.Aggregators
Return the n largest BigInteger values (or fewer if there are fewer values than n).
MAX_DOUBLES() - Static method in class org.apache.crunch.fn.Aggregators
Return the maximum of all given double values.
MAX_DOUBLES(int) - Static method in class org.apache.crunch.fn.Aggregators
Return the n largest double values (or fewer if there are fewer values than n).
MAX_FLOATS() - Static method in class org.apache.crunch.fn.Aggregators
Return the maximum of all given float values.
MAX_FLOATS(int) - Static method in class org.apache.crunch.fn.Aggregators
Return the n largest float values (or fewer if there are fewer values than n).
MAX_INTS() - Static method in class org.apache.crunch.fn.Aggregators
Return the maximum of all given int values.
MAX_INTS(int) - Static method in class org.apache.crunch.fn.Aggregators
Return the n largest int values (or fewer if there are fewer values than n).
MAX_LONGS() - Static method in class org.apache.crunch.fn.Aggregators
Return the maximum of all given long values.
MAX_LONGS(int) - Static method in class org.apache.crunch.fn.Aggregators
Return the n largest long values (or fewer if there are fewer values than n).
MAX_N(int, Class<V>) - Static method in class org.apache.crunch.fn.Aggregators
Return the n largest values (or fewer if there are fewer values than n).
MAX_REDUCERS - Static variable in class org.apache.crunch.util.PartitionUtils
Set an upper limit on the number of reducers the Crunch planner will set for an MR job when it tries to determine how many reducers to use based on the input size.
MAX_RUNNING_JOBS - Static variable in class org.apache.crunch.impl.mr.run.RuntimeParameters
 
MemCollection<S> - Class in org.apache.crunch.impl.mem.collect
 
MemCollection(Iterable<S>) - Constructor for class org.apache.crunch.impl.mem.collect.MemCollection
 
MemCollection(Iterable<S>, PType<S>) - Constructor for class org.apache.crunch.impl.mem.collect.MemCollection
 
MemCollection(Iterable<S>, PType<S>, String) - Constructor for class org.apache.crunch.impl.mem.collect.MemCollection
 
MemPipeline - Class in org.apache.crunch.impl.mem
 
MemTable<K,V> - Class in org.apache.crunch.impl.mem.collect
 
MemTable(Iterable<Pair<K, V>>) - Constructor for class org.apache.crunch.impl.mem.collect.MemTable
 
MemTable(Iterable<Pair<K, V>>, PTableType<K, V>, String) - Constructor for class org.apache.crunch.impl.mem.collect.MemTable
 
meta - Variable in class org.apache.crunch.io.avro.trevni.TrevniRecordWriter
Metadata to be stored in the output file.
META_PREFIX - Static variable in class org.apache.crunch.io.avro.trevni.TrevniRecordWriter
Prefix of the job configs that we care about.
min() - Method in class org.apache.crunch.impl.dist.collect.PCollectionImpl
 
min() - Method in class org.apache.crunch.impl.mem.collect.MemCollection
 
min(PCollection<S>) - Static method in class org.apache.crunch.lib.Aggregate
Returns the smallest numerical element from the input collection.
min() - Method in interface org.apache.crunch.PCollection
Returns a PObject of the minimum element of this instance.
MIN_BIGINTS() - Static method in class org.apache.crunch.fn.Aggregators
Return the minimum of all given BigInteger values.
MIN_BIGINTS(int) - Static method in class org.apache.crunch.fn.Aggregators
Return the n smallest BigInteger values (or fewer if there are fewer values than n).
MIN_DOUBLES() - Static method in class org.apache.crunch.fn.Aggregators
Return the minimum of all given double values.
MIN_DOUBLES(int) - Static method in class org.apache.crunch.fn.Aggregators
Return the n smallest double values (or fewer if there are fewer values than n).
MIN_FLOATS() - Static method in class org.apache.crunch.fn.Aggregators
Return the minimum of all given float values.
MIN_FLOATS(int) - Static method in class org.apache.crunch.fn.Aggregators
Return the n smallest float values (or fewer if there are fewer values than n).
MIN_INTS() - Static method in class org.apache.crunch.fn.Aggregators
Return the minimum of all given int values.
MIN_INTS(int) - Static method in class org.apache.crunch.fn.Aggregators
Return the n smallest int values (or fewer if there are fewer values than n).
MIN_LONGS() - Static method in class org.apache.crunch.fn.Aggregators
Return the minimum of all given long values.
MIN_LONGS(int) - Static method in class org.apache.crunch.fn.Aggregators
Return the n smallest long values (or fewer if there are fewer values than n).
MIN_N(int, Class<V>) - Static method in class org.apache.crunch.fn.Aggregators
Return the n smallest values (or fewer if there are fewer values than n).
MRCollection - Interface in org.apache.crunch.impl.dist.collect
 
MRCollectionFactory - Class in org.apache.crunch.impl.mr.collect
 
MRCollectionFactory() - Constructor for class org.apache.crunch.impl.mr.collect.MRCollectionFactory
 
MRExecutor - Class in org.apache.crunch.impl.mr.exec
Provides APIs for job control at runtime to clients.
MRExecutor(Configuration, Class<?>, Map<PCollectionImpl<?>, Set<Target>>, Map<PCollectionImpl<?>, MaterializableIterable>) - Constructor for class org.apache.crunch.impl.mr.exec.MRExecutor
 
MRJob - Interface in org.apache.crunch.impl.mr
A Hadoop MapReduce job managed by Crunch.
MRJob.State - Enum in org.apache.crunch.impl.mr
A job will be in one of the following states.
MRPipeline - Class in org.apache.crunch.impl.mr
Pipeline implementation that is executed within Hadoop MapReduce.
MRPipeline(Class<?>) - Constructor for class org.apache.crunch.impl.mr.MRPipeline
Instantiate with a default Configuration and name.
MRPipeline(Class<?>, String) - Constructor for class org.apache.crunch.impl.mr.MRPipeline
Instantiate with a custom pipeline name.
MRPipeline(Class<?>, Configuration) - Constructor for class org.apache.crunch.impl.mr.MRPipeline
Instantiate with a custom configuration and default naming.
MRPipeline(Class<?>, String, Configuration) - Constructor for class org.apache.crunch.impl.mr.MRPipeline
Instantiate with a custom name and configuration.
MRPipelineExecution - Interface in org.apache.crunch.impl.mr
 
MSCROutputHandler - Class in org.apache.crunch.impl.mr.plan
 
MSCROutputHandler(Job, Path, boolean) - Constructor for class org.apache.crunch.impl.mr.plan.MSCROutputHandler
 
MSCRPlanner - Class in org.apache.crunch.impl.mr.plan
 
MSCRPlanner(MRPipeline, Map<PCollectionImpl<?>, Set<Target>>, Map<PCollectionImpl<?>, MaterializableIterable>) - Constructor for class org.apache.crunch.impl.mr.plan.MSCRPlanner
 
MULTI_OUTPUT_PREFIX - Static variable in class org.apache.crunch.impl.mr.plan.PlanningParameters
 
MultipleOutputEmitter<T,K,V> - Class in org.apache.crunch.impl.mr.emit
 
MultipleOutputEmitter(Converter, CrunchOutputs<K, V>, String) - Constructor for class org.apache.crunch.impl.mr.emit.MultipleOutputEmitter
 

N

name - Variable in class org.apache.crunch.contrib.io.jdbc.IdentifiableName
 
newReader(Schema) - Static method in class org.apache.crunch.types.avro.Avros
 
newReader(AvroType<T>) - Static method in class org.apache.crunch.types.avro.Avros
 
newWriter(Schema) - Static method in class org.apache.crunch.types.avro.Avros
 
newWriter(AvroType<T>) - Static method in class org.apache.crunch.types.avro.Avros
 
next() - Method in class org.apache.crunch.contrib.text.Tokenizer
Advance this Tokenizer and return the next String from the Scanner.
next() - Method in class org.apache.crunch.io.impl.AutoClosingIterator
 
next() - Method in class org.apache.crunch.types.PGroupedTableType.HoldLastIterator
 
next() - Method in class org.apache.crunch.util.DoFnIterator
 
nextBoolean() - Method in class org.apache.crunch.contrib.text.Tokenizer
Advance this Tokenizer and return the next Boolean from the Scanner.
nextDouble() - Method in class org.apache.crunch.contrib.text.Tokenizer
Advance this Tokenizer and return the next Double from the Scanner.
nextFloat() - Method in class org.apache.crunch.contrib.text.Tokenizer
Advance this Tokenizer and return the next Float from the Scanner.
nextInt() - Method in class org.apache.crunch.contrib.text.Tokenizer
Advance this Tokenizer and return the next Integer from the Scanner.
nextLong() - Method in class org.apache.crunch.contrib.text.Tokenizer
Advance this Tokenizer and return the next Long from the Scanner.
NLineFileSource<T> - Class in org.apache.crunch.io.text
A Source instance that uses the NLineInputFormat, which gives each map task a fraction of the lines in a text file as input.
NLineFileSource(String, PType<T>, int) - Constructor for class org.apache.crunch.io.text.NLineFileSource
Create a new NLineFileSource instance.
NLineFileSource(Path, PType<T>, int) - Constructor for class org.apache.crunch.io.text.NLineFileSource
Create a new NLineFileSource instance.
NLineFileSource(List<Path>, PType<T>, int) - Constructor for class org.apache.crunch.io.text.NLineFileSource
Create a new NLineFileSource instance.
NodeContext - Enum in org.apache.crunch.impl.mr.run
Enum that is associated with a serialized DoNode instance, so we know how to use it within the context of a particular MR job.
not(FilterFn<S>) - Static method in class org.apache.crunch.fn.FilterFns
Accept an entry if the given filter does not accept it.
nulls() - Static method in class org.apache.crunch.types.avro.Avros
 
nulls() - Method in class org.apache.crunch.types.avro.AvroTypeFamily
 
nulls() - Method in interface org.apache.crunch.types.PTypeFamily
 
nulls() - Static method in class org.apache.crunch.types.writable.Writables
 
nulls() - Method in class org.apache.crunch.types.writable.WritableTypeFamily
 
numPartitions() - Method in class org.apache.crunch.impl.spark.SparkPartitioner
 
numReducers(int) - Method in class org.apache.crunch.GroupingOptions.Builder
 

O

of(T, U) - Static method in class org.apache.crunch.Pair
 
of(A, B, C) - Static method in class org.apache.crunch.Tuple3
 
of(A, B, C, D) - Static method in class org.apache.crunch.Tuple4
 
of(Object...) - Static method in class org.apache.crunch.TupleN
 
OneToManyJoin - Class in org.apache.crunch.lib.join
Optimized join for situations where exactly one value is being joined with any other number of values based on a common key.
OneToManyJoin() - Constructor for class org.apache.crunch.lib.join.OneToManyJoin
 
oneToManyJoin(PTable<K, U>, PTable<K, V>, DoFn<Pair<U, Iterable<V>>, T>, PType<T>) - Static method in class org.apache.crunch.lib.join.OneToManyJoin
Performs a join on two tables, where the left table only contains a single value per key.
oneToManyJoin(PTable<K, U>, PTable<K, V>, DoFn<Pair<U, Iterable<V>>, T>, PType<T>, int) - Static method in class org.apache.crunch.lib.join.OneToManyJoin
Supports a user-specified number of reducers for the one-to-many join.
or(FilterFn<S>, FilterFn<S>) - Static method in class org.apache.crunch.fn.FilterFns
Accept an entry if at least one of the given filters accepts it, using short-circuit evaluation.
or(FilterFn<S>...) - Static method in class org.apache.crunch.fn.FilterFns
Accept an entry if at least one of the given filters accepts it, using short-circuit evaluation.
order() - Method in class org.apache.crunch.lib.Sort.ColumnOrder
 
org.apache.crunch - package org.apache.crunch
Client-facing API and core abstractions.
org.apache.crunch.contrib - package org.apache.crunch.contrib
User contributions that may be interesting for special applications.
org.apache.crunch.contrib.bloomfilter - package org.apache.crunch.contrib.bloomfilter
Support for creating Bloom Filters.
org.apache.crunch.contrib.io.jdbc - package org.apache.crunch.contrib.io.jdbc
Support for reading data from an RDBMS using JDBC.
org.apache.crunch.contrib.text - package org.apache.crunch.contrib.text
 
org.apache.crunch.examples - package org.apache.crunch.examples
Example applications demonstrating various aspects of Crunch.
org.apache.crunch.fn - package org.apache.crunch.fn
Commonly used functions for manipulating collections.
org.apache.crunch.impl - package org.apache.crunch.impl
 
org.apache.crunch.impl.dist - package org.apache.crunch.impl.dist
 
org.apache.crunch.impl.dist.collect - package org.apache.crunch.impl.dist.collect
 
org.apache.crunch.impl.mem - package org.apache.crunch.impl.mem
In-memory Pipeline implementation for rapid prototyping and testing.
org.apache.crunch.impl.mem.collect - package org.apache.crunch.impl.mem.collect
 
org.apache.crunch.impl.mem.emit - package org.apache.crunch.impl.mem.emit
 
org.apache.crunch.impl.mr - package org.apache.crunch.impl.mr
A Pipeline implementation that runs on Hadoop MapReduce.
org.apache.crunch.impl.mr.collect - package org.apache.crunch.impl.mr.collect
 
org.apache.crunch.impl.mr.emit - package org.apache.crunch.impl.mr.emit
 
org.apache.crunch.impl.mr.exec - package org.apache.crunch.impl.mr.exec
 
org.apache.crunch.impl.mr.plan - package org.apache.crunch.impl.mr.plan
 
org.apache.crunch.impl.mr.run - package org.apache.crunch.impl.mr.run
 
org.apache.crunch.impl.spark - package org.apache.crunch.impl.spark
 
org.apache.crunch.impl.spark.collect - package org.apache.crunch.impl.spark.collect
 
org.apache.crunch.impl.spark.fn - package org.apache.crunch.impl.spark.fn
 
org.apache.crunch.impl.spark.serde - package org.apache.crunch.impl.spark.serde
 
org.apache.crunch.io - package org.apache.crunch.io
Data input and output for Pipelines.
org.apache.crunch.io.avro - package org.apache.crunch.io.avro
 
org.apache.crunch.io.avro.trevni - package org.apache.crunch.io.avro.trevni
 
org.apache.crunch.io.hbase - package org.apache.crunch.io.hbase
 
org.apache.crunch.io.impl - package org.apache.crunch.io.impl
 
org.apache.crunch.io.parquet - package org.apache.crunch.io.parquet
 
org.apache.crunch.io.seq - package org.apache.crunch.io.seq
 
org.apache.crunch.io.text - package org.apache.crunch.io.text
 
org.apache.crunch.lib - package org.apache.crunch.lib
Joining, sorting, aggregating, and other commonly used functionality.
org.apache.crunch.lib.join - package org.apache.crunch.lib.join
Inner and outer joins on collections.
org.apache.crunch.lib.sort - package org.apache.crunch.lib.sort
 
org.apache.crunch.materialize - package org.apache.crunch.materialize
 
org.apache.crunch.materialize.pobject - package org.apache.crunch.materialize.pobject
 
org.apache.crunch.test - package org.apache.crunch.test
Utilities for testing Crunch-based applications.
org.apache.crunch.types - package org.apache.crunch.types
Common functionality for business object serialization.
org.apache.crunch.types.avro - package org.apache.crunch.types.avro
Business object serialization using Apache Avro.
org.apache.crunch.types.writable - package org.apache.crunch.types.writable
Business object serialization using Hadoop's Writables framework.
org.apache.crunch.util - package org.apache.crunch.util
An assorted set of utilities.
outputConf(String, String) - Method in class org.apache.crunch.io.hbase.HBaseTarget
 
outputConf(String, String) - Method in class org.apache.crunch.io.impl.FileTargetImpl
 
outputConf(String, String) - Method in interface org.apache.crunch.Target
Adds the given key-value pair to the Configuration instance that is used to write this Target.
OutputConverterFunction<K,V,S> - Class in org.apache.crunch.impl.spark.fn
 
OutputConverterFunction(Converter<K, V, S, ?>) - Constructor for class org.apache.crunch.impl.spark.fn.OutputConverterFunction
 
OutputEmitter<T,K,V> - Class in org.apache.crunch.impl.mr.emit
 
OutputEmitter(Converter<K, V, Object, Object>, TaskInputOutputContext<?, ?, K, V>) - Constructor for class org.apache.crunch.impl.mr.emit.OutputEmitter
 
OutputHandler - Interface in org.apache.crunch.io
 
outputKey(V) - Method in class org.apache.crunch.io.hbase.HBaseValueConverter
 
outputKey(S) - Method in interface org.apache.crunch.types.Converter
 
outputTargets - Variable in class org.apache.crunch.impl.dist.DistributedPipeline
 
outputTargetsToMaterialize - Variable in class org.apache.crunch.impl.dist.DistributedPipeline
 
outputValue(V) - Method in class org.apache.crunch.io.hbase.HBaseValueConverter
 
outputValue(S) - Method in interface org.apache.crunch.types.Converter
 
override(ReaderWriterFactory) - Method in enum org.apache.crunch.types.avro.AvroMode
 
overridePathProperties(Configuration) - Method in class org.apache.crunch.test.TemporaryPath
Set all keys specified in the constructor to temporary directories.
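The or entries for org.apache.crunch.fn.FilterFns in this section mention short-circuit evaluation. As a rough illustration of that behavior only, the sketch below uses java.util.function.Predicate as a stand-in for FilterFn; the class ShortCircuitOr and its or helper are hypothetical names and are not part of the Crunch API.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical stand-alone sketch; Predicate stands in for Crunch's FilterFn.
class ShortCircuitOr {

    // Accepts a value as soon as one filter accepts it; later filters are not evaluated.
    @SafeVarargs
    public static <S> Predicate<S> or(Predicate<S>... filters) {
        List<Predicate<S>> list = Arrays.asList(filters);
        return value -> {
            for (Predicate<S> filter : list) {
                if (filter.test(value)) {
                    return true; // short-circuit: stop at the first accepting filter
                }
            }
            return false; // no filter accepted the value
        };
    }

    public static void main(String[] args) {
        int[] calls = {0}; // counts how many filters were actually evaluated
        Predicate<String> isEmpty = s -> { calls[0]++; return s.isEmpty(); };
        Predicate<String> isLong = s -> { calls[0]++; return s.length() > 5; };

        Predicate<String> either = or(isEmpty, isLong);
        System.out.println(either.test("") + " after " + calls[0] + " evaluation(s)");
        System.out.println(either.test("abc") + " after " + calls[0] + " evaluation(s)");
    }
}
```

In the first call only isEmpty runs, because it already accepts the empty string; in the second call both filters run and both reject the value.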

P

Pair<K,V> - Class in org.apache.crunch
A convenience class for two-element Tuples.
Pair(K, V) - Constructor for class org.apache.crunch.Pair
 
PAIR - Static variable in class org.apache.crunch.types.TupleFactory
 
pair2tupleFunc() - Static method in class org.apache.crunch.impl.spark.GuavaUtils
 
pairAggregator(Aggregator<V1>, Aggregator<V2>) - Static method in class org.apache.crunch.fn.Aggregators
Apply separate aggregators to each component of a Pair.
PairFlatMapDoFn<T,K,V> - Class in org.apache.crunch.impl.spark.fn
 
PairFlatMapDoFn(DoFn<T, Pair<K, V>>, SparkRuntimeContext) - Constructor for class org.apache.crunch.impl.spark.fn.PairFlatMapDoFn
 
PairFlatMapPairDoFn<K,V,K2,V2> - Class in org.apache.crunch.impl.spark.fn
 
PairFlatMapPairDoFn(DoFn<Pair<K, V>, Pair<K2, V2>>, SparkRuntimeContext) - Constructor for class org.apache.crunch.impl.spark.fn.PairFlatMapPairDoFn
 
PairMapFn<K,V,S,T> - Class in org.apache.crunch.fn
 
PairMapFn(MapFn<K, S>, MapFn<V, T>) - Constructor for class org.apache.crunch.fn.PairMapFn
 
PairMapFunction<K,V,S> - Class in org.apache.crunch.impl.spark.fn
 
PairMapFunction(MapFn<Pair<K, V>, S>, SparkRuntimeContext) - Constructor for class org.apache.crunch.impl.spark.fn.PairMapFunction
 
PairMapIterableFunction<K,V,S,T> - Class in org.apache.crunch.impl.spark.fn
 
PairMapIterableFunction(MapFn<Pair<K, List<V>>, Pair<S, Iterable<T>>>, SparkRuntimeContext) - Constructor for class org.apache.crunch.impl.spark.fn.PairMapIterableFunction
 
pairs(PType<V1>, PType<V2>) - Static method in class org.apache.crunch.types.avro.Avros
 
pairs(PType<V1>, PType<V2>) - Method in class org.apache.crunch.types.avro.AvroTypeFamily
 
pairs(PType<V1>, PType<V2>) - Method in interface org.apache.crunch.types.PTypeFamily
 
pairs(PType<V1>, PType<V2>) - Static method in class org.apache.crunch.types.writable.Writables
 
pairs(PType<V1>, PType<V2>) - Method in class org.apache.crunch.types.writable.WritableTypeFamily
 
parallelDo(DoFn<S, T>, PType<T>) - Method in class org.apache.crunch.impl.dist.collect.PCollectionImpl
 
parallelDo(String, DoFn<S, T>, PType<T>) - Method in class org.apache.crunch.impl.dist.collect.PCollectionImpl
 
parallelDo(String, DoFn<S, T>, PType<T>, ParallelDoOptions) - Method in class org.apache.crunch.impl.dist.collect.PCollectionImpl
 
parallelDo(DoFn<S, Pair<K, V>>, PTableType<K, V>) - Method in class org.apache.crunch.impl.dist.collect.PCollectionImpl
 
parallelDo(String, DoFn<S, Pair<K, V>>, PTableType<K, V>) - Method in class org.apache.crunch.impl.dist.collect.PCollectionImpl
 
parallelDo(String, DoFn<S, Pair<K, V>>, PTableType<K, V>, ParallelDoOptions) - Method in class org.apache.crunch.impl.dist.collect.PCollectionImpl
 
parallelDo(DoFn<S, T>, PType<T>) - Method in class org.apache.crunch.impl.mem.collect.MemCollection
 
parallelDo(String, DoFn<S, T>, PType<T>) - Method in class org.apache.crunch.impl.mem.collect.MemCollection
 
parallelDo(String, DoFn<S, T>, PType<T>, ParallelDoOptions) - Method in class org.apache.crunch.impl.mem.collect.MemCollection
 
parallelDo(DoFn<S, Pair<K, V>>, PTableType<K, V>) - Method in class org.apache.crunch.impl.mem.collect.MemCollection
 
parallelDo(String, DoFn<S, Pair<K, V>>, PTableType<K, V>) - Method in class org.apache.crunch.impl.mem.collect.MemCollection
 
parallelDo(String, DoFn<S, Pair<K, V>>, PTableType<K, V>, ParallelDoOptions) - Method in class org.apache.crunch.impl.mem.collect.MemCollection
 
parallelDo(DoFn<S, T>, PType<T>) - Method in interface org.apache.crunch.PCollection
Applies the given doFn to the elements of this PCollection and returns a new PCollection that is the output of this processing.
parallelDo(String, DoFn<S, T>, PType<T>) - Method in interface org.apache.crunch.PCollection
Applies the given doFn to the elements of this PCollection and returns a new PCollection that is the output of this processing.
parallelDo(String, DoFn<S, T>, PType<T>, ParallelDoOptions) - Method in interface org.apache.crunch.PCollection
Applies the given doFn to the elements of this PCollection and returns a new PCollection that is the output of this processing.
parallelDo(DoFn<S, Pair<K, V>>, PTableType<K, V>) - Method in interface org.apache.crunch.PCollection
Similar to the other parallelDo instance, but returns a PTable instance instead of a PCollection.
parallelDo(String, DoFn<S, Pair<K, V>>, PTableType<K, V>) - Method in interface org.apache.crunch.PCollection
Similar to the other parallelDo instance, but returns a PTable instance instead of a PCollection.
parallelDo(String, DoFn<S, Pair<K, V>>, PTableType<K, V>, ParallelDoOptions) - Method in interface org.apache.crunch.PCollection
Similar to the other parallelDo instance, but returns a PTable instance instead of a PCollection.
ParallelDoOptions - Class in org.apache.crunch
Container class that includes optional information about a parallelDo operation applied to a PCollection.
ParallelDoOptions.Builder - Class in org.apache.crunch
 
ParallelDoOptions.Builder() - Constructor for class org.apache.crunch.ParallelDoOptions.Builder
 
parent - Variable in class org.apache.crunch.impl.dist.collect.BaseGroupedTable
 
Parse - Class in org.apache.crunch.contrib.text
Methods for parsing instances of PCollection<String> into PCollections of strongly typed tuples.
parse(String, PCollection<String>, Extractor<T>) - Static method in class org.apache.crunch.contrib.text.Parse
Parses the lines of the input PCollection<String> and returns a PCollection<T> using the given Extractor<T>.
parse(String, PCollection<String>, PTypeFamily, Extractor<T>) - Static method in class org.apache.crunch.contrib.text.Parse
Parses the lines of the input PCollection<String> and returns a PCollection<T> using the given Extractor<T> that uses the given PTypeFamily.
parseTable(String, PCollection<String>, Extractor<Pair<K, V>>) - Static method in class org.apache.crunch.contrib.text.Parse
Parses the lines of the input PCollection<String> and returns a PTable<K, V> using the given Extractor<Pair<K, V>>.
parseTable(String, PCollection<String>, PTypeFamily, Extractor<Pair<K, V>>) - Static method in class org.apache.crunch.contrib.text.Parse
Parses the lines of the input PCollection<String> and returns a PTable<K, V> using the given Extractor<Pair<K, V>> that uses the given PTypeFamily.
part - Variable in class org.apache.crunch.io.avro.trevni.TrevniRecordWriter
Counter that increments as new Trevni files are created because the current file has exceeded the block size.
partition - Variable in class org.apache.crunch.impl.spark.IntByteArray
 
PartitionedMapOutputFunction<K,V> - Class in org.apache.crunch.impl.spark.fn
 
PartitionedMapOutputFunction(SerDe<K>, SerDe<V>, PGroupedTableType<K, V>, Class<? extends Partitioner>, int, SparkRuntimeContext) - Constructor for class org.apache.crunch.impl.spark.fn.PartitionedMapOutputFunction
 
PARTITIONER_PATH - Static variable in class org.apache.crunch.lib.sort.TotalOrderPartitioner
 
partitionerClass(Class<? extends Partitioner>) - Method in class org.apache.crunch.GroupingOptions.Builder
 
PartitionUtils - Class in org.apache.crunch.util
Helper functions and settings for determining the number of reducers to use in a pipeline job created by the Crunch planner.
PartitionUtils() - Constructor for class org.apache.crunch.util.PartitionUtils
 
path - Variable in class org.apache.crunch.io.impl.FileSourceImpl
Deprecated. 
path - Variable in class org.apache.crunch.io.impl.FileTargetImpl
 
paths - Variable in class org.apache.crunch.io.impl.FileSourceImpl
 
pathsAsString() - Method in class org.apache.crunch.io.impl.FileSourceImpl
 
PathTarget - Interface in org.apache.crunch.io
A target whose output goes to a given path on a file system.
PCollection<S> - Interface in org.apache.crunch
A representation of an immutable, distributed collection of elements that is the fundamental target of computations in Crunch.
PCollectionFactory - Interface in org.apache.crunch.impl.dist.collect
 
PCollectionImpl<S> - Class in org.apache.crunch.impl.dist.collect
 
PCollectionImpl(String, DistributedPipeline) - Constructor for class org.apache.crunch.impl.dist.collect.PCollectionImpl
 
PCollectionImpl(String, DistributedPipeline, ParallelDoOptions) - Constructor for class org.apache.crunch.impl.dist.collect.PCollectionImpl
 
PCollectionImpl.Visitor - Interface in org.apache.crunch.impl.dist.collect
 
PGroupedTable<K,V> - Interface in org.apache.crunch
The Crunch representation of a grouped PTable, which corresponds to the output of the shuffle phase of a MapReduce job.
PGroupedTableImpl<K,V> - Class in org.apache.crunch.impl.mr.collect
 
PGroupedTableImpl<K,V> - Class in org.apache.crunch.impl.spark.collect
 
PGroupedTableType<K,V> - Class in org.apache.crunch.types
The PType instance for PGroupedTable instances.
PGroupedTableType(PTableType<K, V>) - Constructor for class org.apache.crunch.types.PGroupedTableType
 
PGroupedTableType.HoldLastIterator<V> - Class in org.apache.crunch.types
 
PGroupedTableType.HoldLastIterator(MapFn<Object, V>) - Constructor for class org.apache.crunch.types.PGroupedTableType.HoldLastIterator
 
PGroupedTableType.PairIterableMapFn<K,V> - Class in org.apache.crunch.types
 
PGroupedTableType.PairIterableMapFn(MapFn<Object, K>, MapFn<Object, V>) - Constructor for class org.apache.crunch.types.PGroupedTableType.PairIterableMapFn
 
PGroupedTableType.PTypeIterable<V> - Class in org.apache.crunch.types
 
PGroupedTableType.PTypeIterable(MapFn<Object, V>, Iterable<Object>) - Constructor for class org.apache.crunch.types.PGroupedTableType.PTypeIterable
 
pipeline - Variable in class org.apache.crunch.impl.dist.collect.PCollectionImpl
 
Pipeline - Interface in org.apache.crunch
Manages the state of a pipeline execution.
PIPELINE_PLAN_DOTFILE - Static variable in class org.apache.crunch.impl.mr.plan.PlanningParameters
Configuration key under which a DOT file containing the pipeline job graph is stored by the planner.
PipelineExecution - Interface in org.apache.crunch
A handle to allow clients to control a Crunch pipeline as it runs.
PipelineExecution.Status - Enum in org.apache.crunch
 
PipelineResult - Class in org.apache.crunch
Container for the results of a call to run or done on the Pipeline interface that includes details and statistics about the component stages of the data pipeline.
PipelineResult(List<PipelineResult.StageResult>, PipelineExecution.Status) - Constructor for class org.apache.crunch.PipelineResult
 
PipelineResult.StageResult - Class in org.apache.crunch
 
PipelineResult.StageResult(String, Counters) - Constructor for class org.apache.crunch.PipelineResult.StageResult
 
PipelineResult.StageResult(String, String, Counters) - Constructor for class org.apache.crunch.PipelineResult.StageResult
 
plan() - Method in class org.apache.crunch.impl.mr.MRPipeline
 
plan(Class<?>, Configuration) - Method in class org.apache.crunch.impl.mr.plan.MSCRPlanner
 
PlanningParameters - Class in org.apache.crunch.impl.mr.plan
Collection of Configuration keys and various constants used when planning MapReduce jobs for a pipeline.
PObject<T> - Interface in org.apache.crunch
A PObject represents a singleton object value that results from a distributed computation.
PObjectImpl<S,T> - Class in org.apache.crunch.materialize.pobject
An abstract implementation of PObject that is backed by a PCollection.
PObjectImpl(PCollection<S>) - Constructor for class org.apache.crunch.materialize.pobject.PObjectImpl
Constructs a new instance of this PObject implementation.
process(S, Emitter<Pair<String, BloomFilter>>) - Method in class org.apache.crunch.contrib.bloomfilter.BloomFilterFn
 
process(S, Emitter<T>) - Method in class org.apache.crunch.DoFn
Processes the records from a PCollection.
process(T, Emitter<T>) - Method in class org.apache.crunch.FilterFn
 
process(Object) - Method in class org.apache.crunch.impl.mr.run.RTNode
 
process(Object, Object) - Method in class org.apache.crunch.impl.mr.run.RTNode
 
process(Pair<Integer, Iterable<Pair<K, V>>>, Emitter<Pair<Integer, Pair<K, V>>>) - Method in class org.apache.crunch.lib.Aggregate.TopKCombineFn
 
process(Pair<K, V>, Emitter<Pair<Integer, Pair<K, V>>>) - Method in class org.apache.crunch.lib.Aggregate.TopKFn
 
process(Pair<Pair<K, Integer>, Iterable<Pair<U, V>>>, Emitter<Pair<K, Pair<U, V>>>) - Method in class org.apache.crunch.lib.join.JoinFn
Split up the input record to make coding a bit more manageable.
process(S, Emitter<T>) - Method in class org.apache.crunch.MapFn
 
process(Iterable<S>) - Method in class org.apache.crunch.materialize.pobject.CollectionPObject
Transforms the provided Iterable, obtained from the backing PCollection, into the value encapsulated by this PObject.
process(Iterable<T>) - Method in class org.apache.crunch.materialize.pobject.FirstElementPObject
Transforms the provided Iterable, obtained from the backing PCollection, into the value encapsulated by this PObject.
process(Iterable<Pair<K, V>>) - Method in class org.apache.crunch.materialize.pobject.MapPObject
Transforms the provided Iterable, obtained from the backing PCollection, into the value encapsulated by this PObject.
process(Iterable<S>) - Method in class org.apache.crunch.materialize.pobject.PObjectImpl
Transforms the provided Iterable, obtained from the backing PCollection, into the value encapsulated by this PObject.
processIterable(Object, Iterable) - Method in class org.apache.crunch.impl.mr.run.RTNode
 
progress() - Method in class org.apache.crunch.DoFn
 
Protos - Class in org.apache.crunch.types
Utility functions for working with protocol buffers in Crunch.
Protos() - Constructor for class org.apache.crunch.types.Protos
 
protos(Class<T>, PTypeFamily) - Static method in class org.apache.crunch.types.PTypes
 
PTable<K,V> - Interface in org.apache.crunch
A sub-interface of PCollection that represents an immutable, distributed multi-map of keys and values.
PTableBase<K,V> - Class in org.apache.crunch.impl.dist.collect
 
PTableBase(String, DistributedPipeline) - Constructor for class org.apache.crunch.impl.dist.collect.PTableBase
 
PTableBase(String, DistributedPipeline, ParallelDoOptions) - Constructor for class org.apache.crunch.impl.dist.collect.PTableBase
 
PTables - Class in org.apache.crunch.lib
Methods for performing common operations on PTables.
PTables() - Constructor for class org.apache.crunch.lib.PTables
 
PTableType<K,V> - Interface in org.apache.crunch.types
An extension of PType specifically for PTable objects.
ptype - Variable in class org.apache.crunch.impl.dist.collect.BaseDoCollection
 
ptype - Variable in class org.apache.crunch.impl.dist.collect.BaseGroupedTable
 
ptype - Variable in class org.apache.crunch.io.impl.FileSourceImpl
 
PType<T> - Interface in org.apache.crunch.types
A PType defines a mapping between a data type that is used in a Crunch pipeline and a serialization and storage format that is used to read/write data from/to HDFS.
PTypeFamily - Interface in org.apache.crunch.types
An abstract factory for creating PType instances that have the same serialization/storage backing format.
PTypes - Class in org.apache.crunch.types
Utility functions for creating common types of derived PTypes, e.g., for JSON data, protocol buffers, and Thrift records.
PTypes() - Constructor for class org.apache.crunch.types.PTypes
 
PTypeUtils - Class in org.apache.crunch.types
Utilities for converting between PTypes from different PTypeFamily implementations.
puts() - Static method in class org.apache.crunch.io.hbase.HBaseTypes
 

Q

quadAggregator(Aggregator<V1>, Aggregator<V2>, Aggregator<V3>, Aggregator<V4>) - Static method in class org.apache.crunch.fn.Aggregators
Apply separate aggregators to each component of a Tuple4.
quads(PType<V1>, PType<V2>, PType<V3>, PType<V4>) - Static method in class org.apache.crunch.types.avro.Avros
 
quads(PType<V1>, PType<V2>, PType<V3>, PType<V4>) - Method in class org.apache.crunch.types.avro.AvroTypeFamily
 
quads(PType<V1>, PType<V2>, PType<V3>, PType<V4>) - Method in interface org.apache.crunch.types.PTypeFamily
 
quads(PType<V1>, PType<V2>, PType<V3>, PType<V4>) - Static method in class org.apache.crunch.types.writable.Writables
 
quads(PType<V1>, PType<V2>, PType<V3>, PType<V4>) - Method in class org.apache.crunch.types.writable.WritableTypeFamily
 

R

read(Source<S>) - Method in class org.apache.crunch.impl.dist.DistributedPipeline
 
read(TableSource<K, V>) - Method in class org.apache.crunch.impl.dist.DistributedPipeline
 
read(Source<T>) - Method in class org.apache.crunch.impl.mem.MemPipeline
 
read(TableSource<K, V>) - Method in class org.apache.crunch.impl.mem.MemPipeline
 
read(FileSystem, Path) - Method in class org.apache.crunch.io.avro.AvroFileReaderFactory
 
read(Configuration) - Method in class org.apache.crunch.io.avro.AvroFileSource
 
read(FileSystem, Path) - Method in class org.apache.crunch.io.avro.trevni.TrevniFileReaderFactory
 
read(Configuration) - Method in class org.apache.crunch.io.avro.trevni.TrevniKeySource
 
read(FileSystem, Path) - Method in interface org.apache.crunch.io.FileReaderFactory
 
read(TaskInputOutputContext<?, ?, ?, ?>) - Method in class org.apache.crunch.io.hbase.HBaseData
 
read(Configuration) - Method in class org.apache.crunch.io.hbase.HBaseSourceTarget
 
read(FileSystem, Path) - Method in class org.apache.crunch.io.hbase.HFileReaderFactory
 
read(Configuration) - Method in class org.apache.crunch.io.hbase.HFileSource
 
read(Configuration, FileReaderFactory<T>) - Method in class org.apache.crunch.io.impl.FileSourceImpl
 
read(TaskInputOutputContext<?, ?, ?, ?>) - Method in class org.apache.crunch.io.impl.ReadableDataImpl
 
read(Configuration) - Method in class org.apache.crunch.io.impl.ReadableSourcePathTargetImpl
 
read(Configuration) - Method in class org.apache.crunch.io.impl.ReadableSourceTargetImpl
 
read(Configuration) - Method in class org.apache.crunch.io.parquet.AvroParquetFileSource
 
read(Configuration) - Method in interface org.apache.crunch.io.ReadableSource
Returns an Iterable that contains the contents of this source.
read(FileSystem, Path) - Method in class org.apache.crunch.io.seq.SeqFileReaderFactory
 
read(Configuration) - Method in class org.apache.crunch.io.seq.SeqFileSource
 
read(Configuration) - Method in class org.apache.crunch.io.seq.SeqFileTableSource
 
read(Configuration) - Method in class org.apache.crunch.io.text.NLineFileSource
 
read(FileSystem, Path) - Method in class org.apache.crunch.io.text.TextFileReaderFactory
 
read(Configuration) - Method in class org.apache.crunch.io.text.TextFileSource
 
read(Configuration) - Method in class org.apache.crunch.io.text.TextFileTableSource
 
read(Source<T>) - Method in interface org.apache.crunch.Pipeline
Converts the given Source into a PCollection that is available to jobs run using this Pipeline instance.
read(TableSource<K, V>) - Method in interface org.apache.crunch.Pipeline
A version of the read method for TableSource instances that map to PTables.
read(TaskInputOutputContext<?, ?, ?, ?>) - Method in interface org.apache.crunch.ReadableData
Read the data referenced by this instance within the given context.
read(Source<T>) - Method in class org.apache.crunch.util.CrunchTool
 
read(TableSource<K, V>) - Method in class org.apache.crunch.util.CrunchTool
 
read(TaskInputOutputContext<?, ?, ?, ?>) - Method in class org.apache.crunch.util.DelegatingReadableData
 
read(Configuration, Path) - Static method in class org.apache.crunch.util.DistCache
 
read(TaskInputOutputContext<?, ?, ?, ?>) - Method in class org.apache.crunch.util.UnionReadableData
 
ReadableData<T> - Interface in org.apache.crunch
Represents the contents of a data source that can be read on the cluster from within one of the tasks running as part of a Crunch pipeline.
ReadableDataImpl<T> - Class in org.apache.crunch.io.impl
 
ReadableDataImpl(List<Path>) - Constructor for class org.apache.crunch.io.impl.ReadableDataImpl
 
ReadableSource<T> - Interface in org.apache.crunch.io
An extension of the Source interface that indicates that a Source instance may be read as a series of records by the client code.
ReadableSourcePathTargetImpl<T> - Class in org.apache.crunch.io.impl
 
ReadableSourcePathTargetImpl(ReadableSource<T>, PathTarget, FileNamingScheme) - Constructor for class org.apache.crunch.io.impl.ReadableSourcePathTargetImpl
 
ReadableSourceTarget<T> - Interface in org.apache.crunch.io
An interface that indicates that a SourceTarget instance can be read into the local client.
ReadableSourceTargetImpl<T> - Class in org.apache.crunch.io.impl
 
ReadableSourceTargetImpl(ReadableSource<T>, Target) - Constructor for class org.apache.crunch.io.impl.ReadableSourceTargetImpl
 
ReaderWriterFactory - Interface in org.apache.crunch.types.avro
Interface for accessing DatumReader, DatumWriter, and Data classes.
readFields(DataInput) - Method in class org.apache.crunch.contrib.io.jdbc.IdentifiableName
 
readFields(ResultSet) - Method in class org.apache.crunch.contrib.io.jdbc.IdentifiableName
 
readFields(DataInput) - Method in class org.apache.crunch.io.FormatBundle
 
readFields(DataInput) - Method in class org.apache.crunch.types.writable.TupleWritable
readTextFile(String) - Method in class org.apache.crunch.impl.dist.DistributedPipeline
 
readTextFile(String) - Method in class org.apache.crunch.impl.mem.MemPipeline
 
readTextFile(String) - Method in interface org.apache.crunch.Pipeline
A convenience method for reading a text file.
readTextFile(String) - Method in class org.apache.crunch.util.CrunchTool
 
records(Class<T>) - Static method in class org.apache.crunch.types.avro.Avros
 
records(Class<T>) - Method in class org.apache.crunch.types.avro.AvroTypeFamily
 
records(Class<T>) - Method in interface org.apache.crunch.types.PTypeFamily
 
records(Class<T>) - Static method in class org.apache.crunch.types.writable.Writables
 
records(Class<T>) - Method in class org.apache.crunch.types.writable.WritableTypeFamily
 
reduce(Object, Iterable<Object>, Reducer<Object, Object, Object, Object>.Context) - Method in class org.apache.crunch.impl.mr.run.CrunchReducer
 
reduce(PGroupedTable<K1, V1>, Class<? extends Reducer<K1, V1, K2, V2>>, Class<K2>, Class<V2>) - Static method in class org.apache.crunch.lib.Mapred
 
reduce(PGroupedTable<K1, V1>, Class<? extends Reducer<K1, V1, K2, V2>>, Class<K2>, Class<V2>) - Static method in class org.apache.crunch.lib.Mapreduce
 
ReduceGroupingFunction - Class in org.apache.crunch.impl.spark.fn
 
ReduceGroupingFunction(GroupingOptions, PGroupedTableType, SparkRuntimeContext) - Constructor for class org.apache.crunch.impl.spark.fn.ReduceGroupingFunction
 
ReduceInputFunction<K,V> - Class in org.apache.crunch.impl.spark.fn
 
ReduceInputFunction(SerDe<K>, SerDe<V>) - Constructor for class org.apache.crunch.impl.spark.fn.ReduceInputFunction
 
REFLECT_DATA_FACTORY - Static variable in class org.apache.crunch.types.avro.Avros
Deprecated. as of 0.9.0; use AvroMode.REFLECT.override(ReaderWriterFactory)
REFLECT_DATA_FACTORY_CLASS - Static variable in class org.apache.crunch.types.avro.Avros
Deprecated. as of 0.9.0; use AvroMode.REFLECT.override(ReaderWriterFactory)
ReflectDataFactory - Class in org.apache.crunch.types.avro
A Factory class for constructing Avro reflection-related objects.
ReflectDataFactory() - Constructor for class org.apache.crunch.types.avro.ReflectDataFactory
 
reflects(Class<T>) - Static method in class org.apache.crunch.types.avro.Avros
 
register(Class<T>, AvroType<T>) - Static method in class org.apache.crunch.types.avro.Avros
 
register(Class<T>, WritableType<T, ? extends Writable>) - Static method in class org.apache.crunch.types.writable.Writables
 
REJECT_ALL() - Static method in class org.apache.crunch.fn.FilterFns
Reject everything.
remove() - Method in class org.apache.crunch.types.PGroupedTableType.HoldLastIterator
 
remove() - Method in class org.apache.crunch.util.DoFnIterator
 
replicas(int) - Method in class org.apache.crunch.CachingOptions.Builder
 
replicas() - Method in class org.apache.crunch.CachingOptions
Returns the number of replicas of the data that should be maintained in the cache.
requireSortedKeys() - Method in class org.apache.crunch.GroupingOptions.Builder
 
requireSortedKeys() - Method in class org.apache.crunch.GroupingOptions
 
reservoirSample(PCollection<T>, int) - Static method in class org.apache.crunch.lib.Sample
Select a fixed number of elements from the given PCollection with each element equally likely to be included in the sample.
reservorSample(PCollection<T>, int, Long) - Static method in class org.apache.crunch.lib.Sample
A version of the reservoir sampling algorithm that uses a given seed, primarily for testing purposes.
reset() - Method in interface org.apache.crunch.Aggregator
Clears the internal state of this Aggregator and prepares it for the values associated with the next key.
reset(Iterator<Object>) - Method in class org.apache.crunch.types.PGroupedTableType.HoldLastIterator
 
results() - Method in interface org.apache.crunch.Aggregator
Returns the current aggregated state of this instance.
results() - Static method in class org.apache.crunch.io.hbase.HBaseTypes
 
ReverseAvroComparator<T> - Class in org.apache.crunch.lib.sort
 
ReverseAvroComparator() - Constructor for class org.apache.crunch.lib.sort.ReverseAvroComparator
 
ReverseWritableComparator<T> - Class in org.apache.crunch.lib.sort
 
ReverseWritableComparator() - Constructor for class org.apache.crunch.lib.sort.ReverseWritableComparator
 
rightJoin(PTable<K, U>, PTable<K, V>) - Static method in class org.apache.crunch.lib.Join
Performs a right outer join on the specified PTables.
RightOuterJoinFn<K,U,V> - Class in org.apache.crunch.lib.join
Used to perform the last step of a right outer join.
RightOuterJoinFn(PType<K>, PType<U>) - Constructor for class org.apache.crunch.lib.join.RightOuterJoinFn
 
RTNode - Class in org.apache.crunch.impl.mr.run
 
RTNode(DoFn<Object, Object>, PType<Object>, String, List<RTNode>, Converter, Converter, String) - Constructor for class org.apache.crunch.impl.mr.run.RTNode
 
run(String[]) - Method in class org.apache.crunch.examples.AverageBytesByIP
 
run(String[]) - Method in class org.apache.crunch.examples.SecondarySortExample
 
run(String[]) - Method in class org.apache.crunch.examples.SortExample
 
run(String[]) - Method in class org.apache.crunch.examples.TotalBytesByIP
 
run(String[]) - Method in class org.apache.crunch.examples.WordAggregationHBase
 
run(String[]) - Method in class org.apache.crunch.examples.WordCount
 
run() - Method in class org.apache.crunch.impl.mem.MemPipeline
 
run() - Method in class org.apache.crunch.impl.mr.exec.CrunchJobHooks.CompletionHook
 
run() - Method in class org.apache.crunch.impl.mr.exec.CrunchJobHooks.PrepareHook
 
run() - Method in class org.apache.crunch.impl.mr.MRPipeline
 
run() - Method in class org.apache.crunch.impl.spark.SparkPipeline
 
run() - Method in interface org.apache.crunch.Pipeline
Constructs and executes a series of MapReduce jobs in order to write data to the output targets.
run() - Method in class org.apache.crunch.util.CrunchTool
 
runAsync() - Method in class org.apache.crunch.impl.mem.MemPipeline
 
runAsync() - Method in class org.apache.crunch.impl.mr.MRPipeline
 
runAsync() - Method in class org.apache.crunch.impl.spark.SparkPipeline
 
runAsync() - Method in interface org.apache.crunch.Pipeline
Constructs and starts a series of MapReduce jobs in order to write data to the output targets, but returns a ListenableFuture to allow clients to control job execution.
runAsync() - Method in class org.apache.crunch.util.CrunchTool
 
RuntimeParameters - Class in org.apache.crunch.impl.mr.run
Parameters used during the runtime execution.

S

Sample - Class in org.apache.crunch.lib
Methods for performing random sampling in a distributed fashion, either by accepting each record in a PCollection with an independent probability in order to sample some fraction of the overall data set, or by using reservoir sampling in order to pull a uniform or weighted sample of fixed size from a PCollection of an unknown size.
Sample() - Constructor for class org.apache.crunch.lib.Sample
 
sample(PCollection<S>, double) - Static method in class org.apache.crunch.lib.Sample
Output records from the given PCollection with the given probability.
sample(PCollection<S>, Long, double) - Static method in class org.apache.crunch.lib.Sample
Output records from the given PCollection using a given seed.
sample(PTable<K, V>, double) - Static method in class org.apache.crunch.lib.Sample
A PTable<K, V> analogue of the sample function.
sample(PTable<K, V>, Long, double) - Static method in class org.apache.crunch.lib.Sample
A PTable<K, V> analogue of the sample function, with the seed argument exposed for testing purposes.
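The Sample class description mentions reservoir sampling for drawing a fixed-size sample from a collection of unknown size. A minimal single-machine sketch of the textbook technique (Algorithm R) is shown below. It is not Crunch's distributed implementation, and the class ReservoirSketch, its method name, and its parameters are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Hypothetical illustration class; not part of the Crunch API.
final class ReservoirSketch {

  // Returns a uniform random sample of at most k items from a sequence whose
  // length is not known in advance (Algorithm R).
  static <T> List<T> sample(Iterable<T> items, int k, long seed) {
    Random rng = new Random(seed);
    List<T> reservoir = new ArrayList<>(k);
    long seen = 0;
    for (T item : items) {
      seen++;
      if (reservoir.size() < k) {
        // Fill the reservoir with the first k items.
        reservoir.add(item);
      } else {
        // Pick a slot in [0, seen); the new item is kept with probability k / seen.
        long j = (long) (rng.nextDouble() * seen);
        if (j < k) {
          reservoir.set((int) j, item);
        }
      }
    }
    return reservoir;
  }
}
```

Exposing the seed, as the seeded sample overloads above do, makes the result repeatable, which is convenient in tests.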
SAMPLE_UNIQUE_ELEMENTS(int) - Static method in class org.apache.crunch.fn.Aggregators
Collect a sample of unique elements from the input, where 'unique' is defined by the equals method for the input objects.
scaleFactor() - Method in class org.apache.crunch.DoFn
Returns an estimate of how applying this function to a PCollection will cause it to change in size.
scaleFactor() - Method in class org.apache.crunch.FilterFn
 
scaleFactor() - Method in class org.apache.crunch.MapFn
 
scan - Variable in class org.apache.crunch.io.hbase.HBaseSourceTarget
 
scanHFiles(Pipeline, Path) - Static method in class org.apache.crunch.io.hbase.HFileUtils
 
scanHFiles(Pipeline, Path, Scan) - Static method in class org.apache.crunch.io.hbase.HFileUtils
Scans HFiles with filter conditions.
scanHFiles(Pipeline, List<Path>, Scan) - Static method in class org.apache.crunch.io.hbase.HFileUtils
 
schema - Variable in class org.apache.crunch.io.avro.trevni.TrevniRecordWriter
The Avro schema provided by the context.
second() - Method in class org.apache.crunch.Pair
 
second() - Method in class org.apache.crunch.Tuple3
 
second() - Method in class org.apache.crunch.Tuple4
 
SecondarySort - Class in org.apache.crunch.lib
Utilities for performing a secondary sort on a PTable<K, Pair<V1, V2>> collection.
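As a rough picture of what a secondary sort produces, the sketch below groups (key, (v1, v2)) records by key and orders each group by v1, using only java.util types. It assumes the values are ordered by the first element of the pair, and it runs in memory rather than as a distributed job. The class SecondarySortSketch and its method are hypothetical names, not part of Crunch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration class; not part of the Crunch API.
final class SecondarySortSketch {

  // Groups (key, (v1, v2)) records by key and sorts each group by v1.
  // Assumption: the first element of the value pair is the sort field.
  static <K extends Comparable<K>, V1 extends Comparable<V1>, V2>
      Map<K, List<Map.Entry<V1, V2>>> groupAndSort(
          List<Map.Entry<K, Map.Entry<V1, V2>>> records) {
    // TreeMap keeps the keys themselves in their natural order.
    Map<K, List<Map.Entry<V1, V2>>> grouped = new TreeMap<>();
    for (Map.Entry<K, Map.Entry<V1, V2>> r : records) {
      grouped.computeIfAbsent(r.getKey(), k -> new ArrayList<>()).add(r.getValue());
    }
    for (List<Map.Entry<V1, V2>> values : grouped.values()) {
      values.sort((a, b) -> a.getKey().compareTo(b.getKey()));
    }
    return grouped;
  }
}
```

Each resulting group corresponds to the sorted Iterable of pairs that a function would see for one key.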
SecondarySort() - Constructor for class org.apache.crunch.lib.SecondarySort
 
SecondarySortExample - Class in org.apache.crunch.examples
 
SecondarySortExample() - Constructor for class org.apache.crunch.examples.SecondarySortExample
 
SeqFileReadableData<T> - Class in org.apache.crunch.io.seq
 
SeqFileReadableData(List<Path>, PType<T>) - Constructor for class org.apache.crunch.io.seq.SeqFileReadableData
 
SeqFileReaderFactory<T> - Class in org.apache.crunch.io.seq
 
SeqFileReaderFactory(PType<T>) - Constructor for class org.apache.crunch.io.seq.SeqFileReaderFactory
 
SeqFileReaderFactory(Class) - Constructor for class org.apache.crunch.io.seq.SeqFileReaderFactory
 
SeqFileSource<T> - Class in org.apache.crunch.io.seq
 
SeqFileSource(Path, PType<T>) - Constructor for class org.apache.crunch.io.seq.SeqFileSource
 
SeqFileSource(List<Path>, PType<T>) - Constructor for class org.apache.crunch.io.seq.SeqFileSource
 
SeqFileSourceTarget<T> - Class in org.apache.crunch.io.seq
 
SeqFileSourceTarget(String, PType<T>) - Constructor for class org.apache.crunch.io.seq.SeqFileSourceTarget
 
SeqFileSourceTarget(Path, PType<T>) - Constructor for class org.apache.crunch.io.seq.SeqFileSourceTarget
 
SeqFileSourceTarget(Path, PType<T>, FileNamingScheme) - Constructor for class org.apache.crunch.io.seq.SeqFileSourceTarget
 
SeqFileTableSource<K,V> - Class in org.apache.crunch.io.seq
A TableSource that uses SequenceFileInputFormat to read the input file.
SeqFileTableSource(String, PTableType<K, V>) - Constructor for class org.apache.crunch.io.seq.SeqFileTableSource
 
SeqFileTableSource(Path, PTableType<K, V>) - Constructor for class org.apache.crunch.io.seq.SeqFileTableSource
 
SeqFileTableSource(List<Path>, PTableType<K, V>) - Constructor for class org.apache.crunch.io.seq.SeqFileTableSource
 
SeqFileTableSourceTarget<K,V> - Class in org.apache.crunch.io.seq
 
SeqFileTableSourceTarget(String, PTableType<K, V>) - Constructor for class org.apache.crunch.io.seq.SeqFileTableSourceTarget
 
SeqFileTableSourceTarget(Path, PTableType<K, V>) - Constructor for class org.apache.crunch.io.seq.SeqFileTableSourceTarget
 
SeqFileTableSourceTarget(Path, PTableType<K, V>, FileNamingScheme) - Constructor for class org.apache.crunch.io.seq.SeqFileTableSourceTarget
 
SeqFileTarget - Class in org.apache.crunch.io.seq
 
SeqFileTarget(String) - Constructor for class org.apache.crunch.io.seq.SeqFileTarget
 
SeqFileTarget(Path) - Constructor for class org.apache.crunch.io.seq.SeqFileTarget
 
SeqFileTarget(Path, FileNamingScheme) - Constructor for class org.apache.crunch.io.seq.SeqFileTarget
 
sequenceFile(String, Class<T>) - Static method in class org.apache.crunch.io.At
Creates a SourceTarget<T> instance from the SequenceFile(s) at the given path name from the value field of each key-value pair in the SequenceFile(s).
sequenceFile(Path, Class<T>) - Static method in class org.apache.crunch.io.At
Creates a SourceTarget<T> instance from the SequenceFile(s) at the given Path from the value field of each key-value pair in the SequenceFile(s).
sequenceFile(String, PType<T>) - Static method in class org.apache.crunch.io.At
Creates a SourceTarget<T> instance from the SequenceFile(s) at the given path name from the value field of each key-value pair in the SequenceFile(s).
sequenceFile(Path, PType<T>) - Static method in class org.apache.crunch.io.At
Creates a SourceTarget<T> instance from the SequenceFile(s) at the given Path from the value field of each key-value pair in the SequenceFile(s).
sequenceFile(String, Class<K>, Class<V>) - Static method in class org.apache.crunch.io.At
Creates a TableSourceTarget<K, V> instance from the SequenceFile(s) at the given path name from the key-value pairs in the SequenceFile(s).
sequenceFile(Path, Class<K>, Class<V>) - Static method in class org.apache.crunch.io.At
Creates a TableSourceTarget<K, V> instance from the SequenceFile(s) at the given Path from the key-value pairs in the SequenceFile(s).
sequenceFile(String, PType<K>, PType<V>) - Static method in class org.apache.crunch.io.At
Creates a TableSourceTarget<K, V> instance from the SequenceFile(s) at the given path name from the key-value pairs in the SequenceFile(s).
sequenceFile(Path, PType<K>, PType<V>) - Static method in class org.apache.crunch.io.At
Creates a TableSourceTarget<K, V> instance from the SequenceFile(s) at the given Path from the key-value pairs in the SequenceFile(s).
sequenceFile(String, Class<T>) - Static method in class org.apache.crunch.io.From
Creates a Source<T> instance from the SequenceFile(s) at the given path name from the value field of each key-value pair in the SequenceFile(s).
sequenceFile(Path, Class<T>) - Static method in class org.apache.crunch.io.From
Creates a Source<T> instance from the SequenceFile(s) at the given Path from the value field of each key-value pair in the SequenceFile(s).
sequenceFile(String, PType<T>) - Static method in class org.apache.crunch.io.From
Creates a Source<T> instance from the SequenceFile(s) at the given path name from the value field of each key-value pair in the SequenceFile(s).
sequenceFile(Path, PType<T>) - Static method in class org.apache.crunch.io.From
Creates a Source<T> instance from the SequenceFile(s) at the given Path from the value field of each key-value pair in the SequenceFile(s).
sequenceFile(String, Class<K>, Class<V>) - Static method in class org.apache.crunch.io.From
Creates a TableSource<K, V> instance for the SequenceFile(s) at the given path name.
sequenceFile(Path, Class<K>, Class<V>) - Static method in class org.apache.crunch.io.From
Creates a TableSource<K, V> instance for the SequenceFile(s) at the given Path.
sequenceFile(String, PType<K>, PType<V>) - Static method in class org.apache.crunch.io.From
Creates a TableSource<K, V> instance for the SequenceFile(s) at the given path name.
sequenceFile(Path, PType<K>, PType<V>) - Static method in class org.apache.crunch.io.From
Creates a TableSource<K, V> instance for the SequenceFile(s) at the given Path.
sequenceFile(String) - Static method in class org.apache.crunch.io.To
Creates a Target at the given path name that writes data to SequenceFiles.
sequenceFile(Path) - Static method in class org.apache.crunch.io.To
Creates a Target at the given Path that writes data to SequenceFiles.
SequentialFileNamingScheme - Class in org.apache.crunch.io
Default FileNamingScheme that uses an incrementing sequence number in order to generate unique file names.
SerDe<T> - Interface in org.apache.crunch.impl.spark.serde
 
serialize() - Method in class org.apache.crunch.io.FormatBundle
 
set(String, String) - Method in class org.apache.crunch.io.FormatBundle
 
Set - Class in org.apache.crunch.lib
Utilities for performing set operations (difference, intersection, etc.) on PCollection instances.
Set() - Constructor for class org.apache.crunch.lib.Set
 
setBreakpoint() - Method in class org.apache.crunch.impl.dist.collect.BaseUnionCollection
 
setBreakpoint() - Method in class org.apache.crunch.impl.dist.collect.BaseUnionTable
 
setBreakpoint() - Method in class org.apache.crunch.impl.dist.collect.PCollectionImpl
 
setCombineFn(CombineFn) - Method in class org.apache.crunch.impl.spark.SparkRuntime
 
setConf(Configuration) - Method in class org.apache.crunch.io.FormatBundle
 
setConf(Configuration) - Method in class org.apache.crunch.lib.join.JoinUtils.AvroPairGroupingComparator
 
setConf(Configuration) - Method in class org.apache.crunch.lib.sort.ReverseAvroComparator
 
setConf(Configuration) - Method in class org.apache.crunch.lib.sort.ReverseWritableComparator
 
setConf(Configuration) - Method in class org.apache.crunch.lib.sort.TotalOrderPartitioner
 
setConf(Configuration) - Method in class org.apache.crunch.lib.sort.TupleWritableComparator
 
setConf(Configuration) - Method in class org.apache.crunch.util.CrunchTool
 
setConfiguration(Configuration) - Method in class org.apache.crunch.DoFn
Called during the setup of an initialized PType that relies on this instance.
setConfiguration(Configuration) - Method in class org.apache.crunch.fn.CompositeMapFn
 
setConfiguration(Configuration) - Method in class org.apache.crunch.fn.ExtractKeyFn
 
setConfiguration(Configuration) - Method in class org.apache.crunch.fn.PairMapFn
 
setConfiguration(Configuration) - Method in class org.apache.crunch.impl.dist.DistributedPipeline
 
setConfiguration(Configuration) - Method in class org.apache.crunch.impl.mem.MemPipeline
 
setConfiguration(Configuration) - Method in interface org.apache.crunch.Pipeline
Set the Configuration to use with this pipeline.
setContext(TaskInputOutputContext<?, ?, ?, ?>) - Method in class org.apache.crunch.DoFn
Called during setup to pass the TaskInputOutputContext to this DoFn instance.
setContext(TaskInputOutputContext<?, ?, ?, ?>) - Method in class org.apache.crunch.fn.CompositeMapFn
 
setContext(TaskInputOutputContext<?, ?, ?, ?>) - Method in class org.apache.crunch.fn.ExtractKeyFn
 
setContext(TaskInputOutputContext<?, ?, ?, ?>) - Method in class org.apache.crunch.fn.PairMapFn
 
setContext(TaskInputOutputContext<?, ?, ?, ?>) - Method in class org.apache.crunch.types.PGroupedTableType.PairIterableMapFn
 
setOutputName(String) - Method in class org.apache.crunch.impl.mr.plan.DoNode
 
setParent(SourceTarget<?>) - Method in class org.apache.crunch.io.impl.ReadableDataImpl
 
setPartitionFile(Configuration, Path) - Static method in class org.apache.crunch.lib.sort.TotalOrderPartitioner
 
setPlanDotFile(String) - Method in class org.apache.crunch.impl.mr.exec.MRExecutor
 
setSpecificClassLoader(ClassLoader) - Static method in enum org.apache.crunch.types.avro.AvroMode
 
setStatus(String) - Method in class org.apache.crunch.DoFn
 
setup(Mapper<Object, Object, Object, Object>.Context) - Method in class org.apache.crunch.impl.mr.run.CrunchMapper
 
setup(Reducer<Object, Object, Object, Object>.Context) - Method in class org.apache.crunch.impl.mr.run.CrunchReducer
 
setWritableClasses(List<Class<Writable>>) - Method in class org.apache.crunch.types.writable.TupleWritable
 
setWritten(int) - Method in class org.apache.crunch.types.writable.TupleWritable
Record that the tuple contains an element at the position provided.
Shard - Class in org.apache.crunch.lib
Utilities for controlling how the data in a PCollection is balanced across reducers and output files.
Shard() - Constructor for class org.apache.crunch.lib.Shard
 
shard(PCollection<T>, int) - Static method in class org.apache.crunch.lib.Shard
Creates a PCollection<T> that has the same contents as its input argument but will be written to a fixed number of output files.
ShardedJoinStrategy<K,U,V> - Class in org.apache.crunch.lib.join
JoinStrategy that splits the key space up into shards.
ShardedJoinStrategy(int) - Constructor for class org.apache.crunch.lib.join.ShardedJoinStrategy
Instantiate with a constant number of shards to use for all keys.
ShardedJoinStrategy(ShardedJoinStrategy.ShardingStrategy<K>) - Constructor for class org.apache.crunch.lib.join.ShardedJoinStrategy
Instantiate with a custom sharding strategy.
ShardedJoinStrategy.ShardingStrategy<K> - Interface in org.apache.crunch.lib.join
Determines over how many shards a key will be split in a sharded join.
SingleUseIterable<T> - Class in org.apache.crunch.impl
Wrapper around a Reducer's input Iterable.
SingleUseIterable(Iterable<T>) - Constructor for class org.apache.crunch.impl.SingleUseIterable
Instantiate around an Iterable that may only be used once.
size() - Method in class org.apache.crunch.Pair
 
size() - Method in interface org.apache.crunch.Tuple
Returns the number of elements in this Tuple.
size() - Method in class org.apache.crunch.Tuple3
 
size() - Method in class org.apache.crunch.Tuple4
 
size() - Method in class org.apache.crunch.TupleN
 
size() - Method in class org.apache.crunch.types.writable.TupleWritable
The number of children in this Tuple.
skip(String) - Method in class org.apache.crunch.contrib.text.TokenizerFactory.Builder
Sets the regular expression that determines which input characters should be ignored by the Scanner that is returned by the constructed TokenizerFactory.
Sort - Class in org.apache.crunch.lib
Utilities for sorting PCollection instances.
Sort() - Constructor for class org.apache.crunch.lib.Sort
 
sort(PCollection<T>) - Static method in class org.apache.crunch.lib.Sort
Sorts the PCollection using the natural ordering of its elements in ascending order.
sort(PCollection<T>, Sort.Order) - Static method in class org.apache.crunch.lib.Sort
Sorts the PCollection using the natural order of its elements with the given Order.
sort(PCollection<T>, int, Sort.Order) - Static method in class org.apache.crunch.lib.Sort
Sorts the PCollection using the natural ordering of its elements in the order specified using the given number of reducers.
sort(PTable<K, V>) - Static method in class org.apache.crunch.lib.Sort
Sorts the PTable using the natural ordering of its keys in ascending order.
sort(PTable<K, V>, Sort.Order) - Static method in class org.apache.crunch.lib.Sort
Sorts the PTable using the natural ordering of its keys with the given Order.
sort(PTable<K, V>, int, Sort.Order) - Static method in class org.apache.crunch.lib.Sort
Sorts the PTable using the natural ordering of its keys in the order specified with a client-specified number of reducers.
Sort.ColumnOrder - Class in org.apache.crunch.lib
To sort by column 2 ascending then column 1 descending, you would use: sortPairs(coll, by(2, ASCENDING), by(1, DESCENDING)). Column numbering is 1-based.
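The column ordering described above can be illustrated without Crunch by chaining comparators over rows of integers. In this sketch the class ColumnOrderSketch, its Order enum, and the by and sortRows helpers are hypothetical stand-ins that only mirror the names used in the description; they are not the library's own types.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical illustration class; not part of the Crunch API.
final class ColumnOrderSketch {

  enum Order { ASCENDING, DESCENDING }

  // Comparator on a single column; column numbers are 1-based.
  static Comparator<List<Integer>> by(int column, Order order) {
    Comparator<List<Integer>> c = Comparator.comparing(row -> row.get(column - 1));
    return order == Order.ASCENDING ? c : c.reversed();
  }

  // Returns a sorted copy of the rows; later orderings only break ties
  // left by earlier ones.
  @SafeVarargs
  static List<List<Integer>> sortRows(
      List<List<Integer>> rows, Comparator<List<Integer>>... orders) {
    List<List<Integer>> sorted = new ArrayList<>(rows);
    Comparator<List<Integer>> combined = (a, b) -> 0;
    for (Comparator<List<Integer>> o : orders) {
      combined = combined.thenComparing(o);
    }
    sorted.sort(combined);
    return sorted;
  }
}
```

Calling sortRows(rows, by(2, Order.ASCENDING), by(1, Order.DESCENDING)) orders rows by their second column first and uses the first column, descending, only to break ties.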
Sort.ColumnOrder(int, Sort.Order) - Constructor for class org.apache.crunch.lib.Sort.ColumnOrder
 
Sort.Order - Enum in org.apache.crunch.lib
For signaling the order in which a sort should be done.
sortAndApply(PTable<K, Pair<V1, V2>>, DoFn<Pair<K, Iterable<Pair<V1, V2>>>, T>, PType<T>) - Static method in class org.apache.crunch.lib.SecondarySort
Perform a secondary sort on the given PTable instance and then apply a DoFn to the resulting sorted data to yield an output PCollection<T>.
sortAndApply(PTable<K, Pair<V1, V2>>, DoFn<Pair<K, Iterable<Pair<V1, V2>>>, T>, PType<T>, int) - Static method in class org.apache.crunch.lib.SecondarySort
Perform a secondary sort on the given PTable instance and then apply a DoFn to the resulting sorted data to yield an output PCollection<T>, using the given number of reducers.
sortAndApply(PTable<K, Pair<V1, V2>>, DoFn<Pair<K, Iterable<Pair<V1, V2>>>, Pair<U, V>>, PTableType<U, V>) - Static method in class org.apache.crunch.lib.SecondarySort
Perform a secondary sort on the given PTable instance and then apply a DoFn to the resulting sorted data to yield an output PTable<U, V>.
sortAndApply(PTable<K, Pair<V1, V2>>, DoFn<Pair<K, Iterable<Pair<V1, V2>>>, Pair<U, V>>, PTableType<U, V>, int) - Static method in class org.apache.crunch.lib.SecondarySort
Perform a secondary sort on the given PTable instance and then apply a DoFn to the resulting sorted data to yield an output PTable<U, V>, using the given number of reducers.
sortAndPartition(PCollection<KeyValue>, HTable) - Static method in class org.apache.crunch.io.hbase.HFileUtils
 
sortComparatorClass(Class<? extends RawComparator>) - Method in class org.apache.crunch.GroupingOptions.Builder
 
SortExample - Class in org.apache.crunch.examples
Simple Crunch tool for running sorting examples from the command line.
SortExample() - Constructor for class org.apache.crunch.examples.SortExample
 
SortFns - Class in org.apache.crunch.lib.sort
A set of DoFns that are used by Crunch's Sort library.
SortFns() - Constructor for class org.apache.crunch.lib.sort.SortFns
 
SortFns.AvroGenericFn<V extends Tuple> - Class in org.apache.crunch.lib.sort
Pulls a composite set of keys from an Avro GenericRecord instance.
SortFns.AvroGenericFn(int[], Schema) - Constructor for class org.apache.crunch.lib.sort.SortFns.AvroGenericFn
 
SortFns.KeyExtraction<V extends Tuple> - Class in org.apache.crunch.lib.sort
Utility class for encapsulating key extraction logic and serialization information about key extraction.
SortFns.KeyExtraction(PType<V>, Sort.ColumnOrder[]) - Constructor for class org.apache.crunch.lib.sort.SortFns.KeyExtraction
 
SortFns.SingleKeyFn<V extends Tuple,K> - Class in org.apache.crunch.lib.sort
Extracts a single indexed key from a Tuple instance.
SortFns.SingleKeyFn(int) - Constructor for class org.apache.crunch.lib.sort.SortFns.SingleKeyFn
 
SortFns.TupleKeyFn<V extends Tuple,K extends Tuple> - Class in org.apache.crunch.lib.sort
Extracts a composite key from a Tuple instance.
SortFns.TupleKeyFn(int[], TupleFactory) - Constructor for class org.apache.crunch.lib.sort.SortFns.TupleKeyFn
 
sortPairs(PCollection<Pair<U, V>>, Sort.ColumnOrder...) - Static method in class org.apache.crunch.lib.Sort
Sorts the PCollection of Pairs using the specified column ordering.
sortQuads(PCollection<Tuple4<V1, V2, V3, V4>>, Sort.ColumnOrder...) - Static method in class org.apache.crunch.lib.Sort
Sorts the PCollection of Tuple4s using the specified column ordering.
sortTriples(PCollection<Tuple3<V1, V2, V3>>, Sort.ColumnOrder...) - Static method in class org.apache.crunch.lib.Sort
Sorts the PCollection of Tuple3s using the specified column ordering.
sortTuples(PCollection<T>, Sort.ColumnOrder...) - Static method in class org.apache.crunch.lib.Sort
Sorts the PCollection of tuples using the specified column ordering.
sortTuples(PCollection<T>, int, Sort.ColumnOrder...) - Static method in class org.apache.crunch.lib.Sort
Sorts the PCollection of TupleNs using the specified column ordering and a client-specified number of reducers.
source - Variable in class org.apache.crunch.impl.dist.collect.BaseInputCollection
 
source - Variable in class org.apache.crunch.impl.dist.collect.BaseInputTable
 
Source<T> - Interface in org.apache.crunch
A Source represents an input data set that is an input to one or more MapReduce jobs.
SourcePathTargetImpl<T> - Class in org.apache.crunch.io.impl
 
SourcePathTargetImpl(Source<T>, PathTarget, FileNamingScheme) - Constructor for class org.apache.crunch.io.impl.SourcePathTargetImpl
 
sources(Source<?>...) - Method in class org.apache.crunch.ParallelDoOptions.Builder
 
sources(Collection<Source<?>>) - Method in class org.apache.crunch.ParallelDoOptions.Builder
 
sourceTarget(SourceTarget<?>) - Method in class org.apache.crunch.GroupingOptions.Builder
Deprecated. 
SourceTarget<T> - Interface in org.apache.crunch
An interface for classes that implement both the Source and the Target interfaces.
SourceTargetHelper - Class in org.apache.crunch.io
Functions for configuring the inputs/outputs of MapReduce jobs.
SourceTargetHelper() - Constructor for class org.apache.crunch.io.SourceTargetHelper
 
sourceTargets(SourceTarget<?>...) - Method in class org.apache.crunch.GroupingOptions.Builder
 
sourceTargets(Collection<SourceTarget<?>>) - Method in class org.apache.crunch.GroupingOptions.Builder
 
sourceTargets(SourceTarget<?>...) - Method in class org.apache.crunch.ParallelDoOptions.Builder
 
sourceTargets(Collection<SourceTarget<?>>) - Method in class org.apache.crunch.ParallelDoOptions.Builder
 
SparkCollectFactory - Class in org.apache.crunch.impl.spark.collect
 
SparkCollectFactory() - Constructor for class org.apache.crunch.impl.spark.collect.SparkCollectFactory
 
SparkCollection - Interface in org.apache.crunch.impl.spark
 
SparkComparator - Class in org.apache.crunch.impl.spark
 
SparkComparator(GroupingOptions, PGroupedTableType, SparkRuntimeContext) - Constructor for class org.apache.crunch.impl.spark.SparkComparator
 
SparkPartitioner - Class in org.apache.crunch.impl.spark
 
SparkPartitioner(int) - Constructor for class org.apache.crunch.impl.spark.SparkPartitioner
 
SparkPipeline - Class in org.apache.crunch.impl.spark
 
SparkPipeline(String, String) - Constructor for class org.apache.crunch.impl.spark.SparkPipeline
 
SparkPipeline(JavaSparkContext, String) - Constructor for class org.apache.crunch.impl.spark.SparkPipeline
 
SparkRuntime - Class in org.apache.crunch.impl.spark
 
SparkRuntime(SparkPipeline, JavaSparkContext, Configuration, Map<PCollectionImpl<?>, Set<Target>>, Map<PCollectionImpl<?>, MaterializableIterable>, Map<PCollection<?>, StorageLevel>) - Constructor for class org.apache.crunch.impl.spark.SparkRuntime
 
SparkRuntimeContext - Class in org.apache.crunch.impl.spark
 
SparkRuntimeContext(Broadcast<Configuration>, Accumulator<Map<String, Long>>) - Constructor for class org.apache.crunch.impl.spark.SparkRuntimeContext
 
specifics(Class<T>) - Static method in class org.apache.crunch.types.avro.Avros
 
split(PCollection<Pair<T, U>>) - Static method in class org.apache.crunch.lib.Channels
Splits a PCollection of any Pair of objects into a Pair of PCollections, to allow for the output of a DoFn to be handled using separate channels.
split(PCollection<Pair<T, U>>, PType<T>, PType<U>) - Static method in class org.apache.crunch.lib.Channels
Splits a PCollection of any Pair of objects into a Pair of PCollections, to allow for the output of a DoFn to be handled using separate channels.
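The idea of splitting one collection of pairs into two separate channels can be sketched with plain lists. This sketch assumes that a null element in a pair means "nothing for that channel" and skips it; that null handling is an assumption made for the illustration, and the class SplitSketch is a hypothetical name, not part of Crunch.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical illustration class; not part of the Crunch API.
final class SplitSketch {

  // Splits a list of pairs into a pair of lists, one per channel.
  // Assumption: null elements are skipped rather than copied.
  static <T, U> Map.Entry<List<T>, List<U>> split(List<Map.Entry<T, U>> pairs) {
    List<T> first = new ArrayList<>();
    List<U> second = new ArrayList<>();
    for (Map.Entry<T, U> p : pairs) {
      if (p.getKey() != null) {
        first.add(p.getKey());
      }
      if (p.getValue() != null) {
        second.add(p.getValue());
      }
    }
    return new SimpleEntry<List<T>, List<U>>(first, second);
  }
}
```

A function that emits, say, a parsed record in the first slot and an error message in the second could then have its two outputs processed independently.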
status - Variable in class org.apache.crunch.PipelineResult
 
STRING_CONCAT(String, boolean) - Static method in class org.apache.crunch.fn.Aggregators
Concatenate strings, with a separator between strings.
STRING_CONCAT(String, boolean, long, long) - Static method in class org.apache.crunch.fn.Aggregators
Concatenate strings, with a separator between strings.
STRING_TO_UTF8 - Static variable in class org.apache.crunch.types.avro.Avros
 
strings() - Static method in class org.apache.crunch.types.avro.Avros
 
strings() - Method in class org.apache.crunch.types.avro.AvroTypeFamily
 
strings() - Method in interface org.apache.crunch.types.PTypeFamily
 
strings() - Static method in class org.apache.crunch.types.writable.Writables
 
strings() - Method in class org.apache.crunch.types.writable.WritableTypeFamily
 
succeeded() - Method in class org.apache.crunch.PipelineResult
 
SUM_BIGINTS() - Static method in class org.apache.crunch.fn.Aggregators
Sum up all BigInteger values.
SUM_DOUBLES() - Static method in class org.apache.crunch.fn.Aggregators
Sum up all double values.
SUM_FLOATS() - Static method in class org.apache.crunch.fn.Aggregators
Sum up all float values.
SUM_INTS() - Static method in class org.apache.crunch.fn.Aggregators
Sum up all int values.
SUM_LONGS() - Static method in class org.apache.crunch.fn.Aggregators
Sum up all long values.

T

table(String) - Static method in class org.apache.crunch.io.hbase.AtHBase
 
table(String, Scan) - Static method in class org.apache.crunch.io.hbase.AtHBase
 
table(String) - Static method in class org.apache.crunch.io.hbase.FromHBase
 
table(String, Scan) - Static method in class org.apache.crunch.io.hbase.FromHBase
 
table - Variable in class org.apache.crunch.io.hbase.HBaseTarget
 
table(String) - Static method in class org.apache.crunch.io.hbase.ToHBase
 
tableOf(S, T, Object...) - Static method in class org.apache.crunch.impl.mem.MemPipeline
 
tableOf(Iterable<Pair<S, T>>) - Static method in class org.apache.crunch.impl.mem.MemPipeline
 
tableOf(PType<K>, PType<V>) - Static method in class org.apache.crunch.types.avro.Avros
 
tableOf(PType<K>, PType<V>) - Method in class org.apache.crunch.types.avro.AvroTypeFamily
 
tableOf(PType<K>, PType<V>) - Method in interface org.apache.crunch.types.PTypeFamily
 
tableOf(PType<K>, PType<V>) - Static method in class org.apache.crunch.types.writable.Writables
 
tableOf(PType<K>, PType<V>) - Method in class org.apache.crunch.types.writable.WritableTypeFamily
 
TableSource<K,V> - Interface in org.apache.crunch
The interface for Source implementations that return a PTable.
TableSourcePathTargetImpl<K,V> - Class in org.apache.crunch.io.impl
 
TableSourcePathTargetImpl(TableSource<K, V>, PathTarget) - Constructor for class org.apache.crunch.io.impl.TableSourcePathTargetImpl
 
TableSourcePathTargetImpl(TableSource<K, V>, PathTarget, FileNamingScheme) - Constructor for class org.apache.crunch.io.impl.TableSourcePathTargetImpl
 
TableSourceTarget<K,V> - Interface in org.apache.crunch
An interface for classes that implement both the TableSource and the Target interfaces.
TableSourceTargetImpl<K,V> - Class in org.apache.crunch.io.impl
 
TableSourceTargetImpl(TableSource<K, V>, Target) - Constructor for class org.apache.crunch.io.impl.TableSourceTargetImpl
 
tableType - Variable in class org.apache.crunch.types.PGroupedTableType
 
Target - Interface in org.apache.crunch
A Target represents the output destination of a Crunch PCollection in the context of a Crunch job.
Target.WriteMode - Enum in org.apache.crunch
An enum to represent different options the client may specify for handling the case where the output path, table, etc. already exists.
tempDir - Variable in class org.apache.crunch.test.CrunchTestSupport
 
TemporaryPath - Class in org.apache.crunch.test
Creates a temporary directory for a test case and destroys it afterwards.
TemporaryPath(String...) - Constructor for class org.apache.crunch.test.TemporaryPath
Construct TemporaryPath.
TestCounters - Class in org.apache.crunch.test
A utility class used during unit testing to update and read counters.
TestCounters() - Constructor for class org.apache.crunch.test.TestCounters
 
textFile(String) - Static method in class org.apache.crunch.io.At
Creates a SourceTarget<String> instance for the text file(s) at the given path name.
textFile(Path) - Static method in class org.apache.crunch.io.At
Creates a SourceTarget<String> instance for the text file(s) at the given Path.
textFile(String, PType<T>) - Static method in class org.apache.crunch.io.At
Creates a SourceTarget<T> instance for the text file(s) at the given path name using the provided PType<T> to convert the input text.
textFile(Path, PType<T>) - Static method in class org.apache.crunch.io.At
Creates a SourceTarget<T> instance for the text file(s) at the given Path using the provided PType<T> to convert the input text.
textFile(String) - Static method in class org.apache.crunch.io.From
Creates a Source<String> instance for the text file(s) at the given path name.
textFile(Path) - Static method in class org.apache.crunch.io.From
Creates a Source<String> instance for the text file(s) at the given Path.
textFile(String, PType<T>) - Static method in class org.apache.crunch.io.From
Creates a Source<T> instance for the text file(s) at the given path name using the provided PType<T> to convert the input text.
textFile(Path, PType<T>) - Static method in class org.apache.crunch.io.From
Creates a Source<T> instance for the text file(s) at the given Path using the provided PType<T> to convert the input text.
textFile(String) - Static method in class org.apache.crunch.io.To
Creates a Target at the given path name that writes data to text files.
textFile(Path) - Static method in class org.apache.crunch.io.To
Creates a Target at the given Path that writes data to text files.
TextFileReaderFactory<T> - Class in org.apache.crunch.io.text
 
TextFileReaderFactory(PType<T>) - Constructor for class org.apache.crunch.io.text.TextFileReaderFactory
 
TextFileReaderFactory(LineParser<T>) - Constructor for class org.apache.crunch.io.text.TextFileReaderFactory
 
TextFileSource<T> - Class in org.apache.crunch.io.text
 
TextFileSource(Path, PType<T>) - Constructor for class org.apache.crunch.io.text.TextFileSource
 
TextFileSource(List<Path>, PType<T>) - Constructor for class org.apache.crunch.io.text.TextFileSource
 
TextFileSourceTarget<T> - Class in org.apache.crunch.io.text
 
TextFileSourceTarget(String, PType<T>) - Constructor for class org.apache.crunch.io.text.TextFileSourceTarget
 
TextFileSourceTarget(Path, PType<T>) - Constructor for class org.apache.crunch.io.text.TextFileSourceTarget
 
TextFileSourceTarget(Path, PType<T>, FileNamingScheme) - Constructor for class org.apache.crunch.io.text.TextFileSourceTarget
 
TextFileTableSource<K,V> - Class in org.apache.crunch.io.text
A Source that uses the KeyValueTextInputFormat to process input text.
TextFileTableSource(String, PTableType<K, V>) - Constructor for class org.apache.crunch.io.text.TextFileTableSource
 
TextFileTableSource(Path, PTableType<K, V>) - Constructor for class org.apache.crunch.io.text.TextFileTableSource
 
TextFileTableSource(List<Path>, PTableType<K, V>) - Constructor for class org.apache.crunch.io.text.TextFileTableSource
 
TextFileTableSource(String, PTableType<K, V>, String) - Constructor for class org.apache.crunch.io.text.TextFileTableSource
 
TextFileTableSource(Path, PTableType<K, V>, String) - Constructor for class org.apache.crunch.io.text.TextFileTableSource
 
TextFileTableSource(List<Path>, PTableType<K, V>, String) - Constructor for class org.apache.crunch.io.text.TextFileTableSource
 
TextFileTableSourceTarget<K,V> - Class in org.apache.crunch.io.text
A TableSource and SourceTarget implementation that uses the KeyValueTextInputFormat and TextOutputFormat to support reading and writing text files as PTable instances using a tab separator for the keys and the values.
TextFileTableSourceTarget(String, PTableType<K, V>) - Constructor for class org.apache.crunch.io.text.TextFileTableSourceTarget
 
TextFileTableSourceTarget(Path, PTableType<K, V>) - Constructor for class org.apache.crunch.io.text.TextFileTableSourceTarget
 
TextFileTableSourceTarget(Path, PTableType<K, V>, FileNamingScheme) - Constructor for class org.apache.crunch.io.text.TextFileTableSourceTarget
 
TextFileTarget - Class in org.apache.crunch.io.text
 
TextFileTarget(String) - Constructor for class org.apache.crunch.io.text.TextFileTarget
 
TextFileTarget(Path) - Constructor for class org.apache.crunch.io.text.TextFileTarget
 
TextFileTarget(Path, FileNamingScheme) - Constructor for class org.apache.crunch.io.text.TextFileTarget
 
TextReadableData<T> - Class in org.apache.crunch.io.text
 
TextReadableData(List<Path>, PType<T>) - Constructor for class org.apache.crunch.io.text.TextReadableData
 
TextReadableData(List<Path>, PType<T>, String) - Constructor for class org.apache.crunch.io.text.TextReadableData
 
third() - Method in class org.apache.crunch.Tuple3
 
third() - Method in class org.apache.crunch.Tuple4
 
thrifts(Class<T>, PTypeFamily) - Static method in class org.apache.crunch.types.PTypes
 
TMP_DIR - Static variable in class org.apache.crunch.impl.mr.run.RuntimeParameters
 
To - Class in org.apache.crunch.io
Static factory methods for creating common Target types.
To() - Constructor for class org.apache.crunch.io.To
 
to - Static variable in class org.apache.crunch.util.CrunchTool
 
ToByteArrayFunction - Class in org.apache.crunch.impl.spark.collect
 
ToByteArrayFunction() - Constructor for class org.apache.crunch.impl.spark.collect.ToByteArrayFunction
 
toBytes(T) - Method in class org.apache.crunch.impl.spark.serde.AvroSerDe
 
toBytes(T) - Method in interface org.apache.crunch.impl.spark.serde.SerDe
 
toBytes(Writable) - Method in class org.apache.crunch.impl.spark.serde.WritableSerDe
 
toCombineFn(Aggregator<V>) - Static method in class org.apache.crunch.fn.Aggregators
Wrap a CombineFn adapter around the given aggregator.
ToHBase - Class in org.apache.crunch.io.hbase
Static factory methods for creating HBase Target types.
ToHBase() - Constructor for class org.apache.crunch.io.hbase.ToHBase
 
Tokenizer - Class in org.apache.crunch.contrib.text
Manages a Scanner instance and provides support for returning only a subset of the fields returned by the underlying Scanner.
Tokenizer(Scanner, Set<Integer>, boolean) - Constructor for class org.apache.crunch.contrib.text.Tokenizer
Create a new Tokenizer instance.
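
The Tokenizer entries above describe returning only a subset of the fields produced by a Scanner. As a rough illustration of that idea, here is a plain-Java sketch that does not use Crunch. The class name FieldFilterSketch and the method select are hypothetical, and reading the boolean constructor argument as "keep the listed indices" versus "drop them" is an assumption based on the TokenizerFactory description ("sets of indices to keep or drop").

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;
import java.util.Set;

// Hypothetical illustration class; not part of the Crunch API.
class FieldFilterSketch {
    // Returns the tokens whose zero-based position passes the index filter.
    // keep == true:  return only the positions listed in indices.
    // keep == false: return every position except those listed (assumed meaning).
    static List<String> select(String line, String delimiter, Set<Integer> indices, boolean keep) {
        List<String> out = new ArrayList<>();
        // The delimiter is interpreted as a regular expression by Scanner.
        try (Scanner scanner = new Scanner(line).useDelimiter(delimiter)) {
            int position = 0;
            while (scanner.hasNext()) {
                String token = scanner.next();
                if (indices.contains(position) == keep) {
                    out.add(token);
                }
                position++;
            }
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(select("a,b,c,d", ",", Set.of(0, 2), true));  // prints [a, c]
        System.out.println(select("a,b,c,d", ",", Set.of(0, 2), false)); // prints [b, d]
    }
}
```

The same index set serves both modes, so a caller only has to flip the flag to switch between a whitelist and a blacklist of fields.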
TokenizerFactory - Class in org.apache.crunch.contrib.text
Factory class that constructs Tokenizer instances for input strings that use a fixed set of delimiters, skip patterns, locales, and sets of indices to keep or drop.
TokenizerFactory.Builder - Class in org.apache.crunch.contrib.text
A class for constructing new TokenizerFactory instances using the Builder pattern.
TokenizerFactory.Builder() - Constructor for class org.apache.crunch.contrib.text.TokenizerFactory.Builder
 
top(int) - Method in class org.apache.crunch.impl.dist.collect.PTableBase
 
top(int) - Method in class org.apache.crunch.impl.mem.collect.MemTable
 
top(PTable<K, V>, int, boolean) - Static method in class org.apache.crunch.lib.Aggregate
 
top(int) - Method in interface org.apache.crunch.PTable
Returns a PTable made up of the pairs in this PTable with the largest value field.
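
The description of top(int) above amounts to selecting the pairs with the largest values. The following plain-Java sketch shows one common way to do that with a bounded min-heap. It is only an illustration of the idea: the class TopByValueSketch is hypothetical, it works on in-memory lists rather than PTable instances, and nothing here is claimed about how Crunch implements top.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.PriorityQueue;

// Hypothetical illustration class; not part of the Crunch API.
class TopByValueSketch {
    // Returns up to count pairs with the largest values, largest first.
    static <K, V extends Comparable<V>> List<Map.Entry<K, V>> top(List<Map.Entry<K, V>> pairs, int count) {
        Comparator<Map.Entry<K, V>> byValue = Map.Entry.comparingByValue();
        // Min-heap: the smallest retained value sits at the head and is evicted first.
        PriorityQueue<Map.Entry<K, V>> heap = new PriorityQueue<>(byValue);
        for (Map.Entry<K, V> pair : pairs) {
            heap.add(pair);
            if (heap.size() > count) {
                heap.poll(); // discard the current smallest value
            }
        }
        List<Map.Entry<K, V>> result = new ArrayList<>(heap);
        result.sort(byValue.reversed());
        return result;
    }

    public static void main(String[] args) {
        List<Map.Entry<String, Integer>> pairs =
            List.of(Map.entry("a", 3), Map.entry("b", 10), Map.entry("c", 7));
        System.out.println(top(pairs, 2)); // prints [b=10, c=7]
    }
}
```

Keeping only count entries in the heap bounds memory use regardless of the input size, which is why this shape is popular for top-N queries.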
toRTNode(boolean, Configuration, NodeContext) - Method in class org.apache.crunch.impl.mr.plan.DoNode
 
toString() - Method in class org.apache.crunch.impl.dist.collect.PCollectionImpl
 
toString() - Method in class org.apache.crunch.impl.mem.collect.MemCollection
 
toString() - Method in class org.apache.crunch.impl.mr.run.RTNode
 
toString() - Method in class org.apache.crunch.io.avro.AvroFileSource
 
toString() - Method in class org.apache.crunch.io.avro.AvroFileSourceTarget
 
toString() - Method in class org.apache.crunch.io.avro.AvroFileTarget
 
toString() - Method in class org.apache.crunch.io.avro.AvroPathPerKeyTarget
 
toString() - Method in class org.apache.crunch.io.avro.trevni.TrevniKeySource
 
toString() - Method in class org.apache.crunch.io.avro.trevni.TrevniKeySourceTarget
 
toString() - Method in class org.apache.crunch.io.avro.trevni.TrevniKeyTarget
 
toString() - Method in class org.apache.crunch.io.hbase.HBaseSourceTarget
 
toString() - Method in class org.apache.crunch.io.hbase.HBaseTarget
 
toString() - Method in class org.apache.crunch.io.hbase.HFileSource
 
toString() - Method in class org.apache.crunch.io.hbase.HFileTarget
 
toString() - Method in class org.apache.crunch.io.impl.FileSourceImpl
 
toString() - Method in class org.apache.crunch.io.impl.FileTargetImpl
 
toString() - Method in class org.apache.crunch.io.parquet.AvroParquetFileSource
 
toString() - Method in class org.apache.crunch.io.parquet.AvroParquetFileSourceTarget
 
toString() - Method in class org.apache.crunch.io.parquet.AvroParquetFileTarget
 
toString() - Method in class org.apache.crunch.io.seq.SeqFileSource
 
toString() - Method in class org.apache.crunch.io.seq.SeqFileSourceTarget
 
toString() - Method in class org.apache.crunch.io.seq.SeqFileTableSource
 
toString() - Method in class org.apache.crunch.io.seq.SeqFileTableSourceTarget
 
toString() - Method in class org.apache.crunch.io.seq.SeqFileTarget
 
toString() - Method in class org.apache.crunch.io.text.NLineFileSource
 
toString() - Method in class org.apache.crunch.io.text.TextFileSource
 
toString() - Method in class org.apache.crunch.io.text.TextFileSourceTarget
 
toString() - Method in class org.apache.crunch.io.text.TextFileTableSource
 
toString() - Method in class org.apache.crunch.io.text.TextFileTableSourceTarget
 
toString() - Method in class org.apache.crunch.io.text.TextFileTarget
 
toString() - Method in class org.apache.crunch.lib.Sort.ColumnOrder
 
toString() - Method in class org.apache.crunch.materialize.pobject.PObjectImpl
 
toString() - Method in class org.apache.crunch.Pair
 
toString() - Method in class org.apache.crunch.Tuple3
 
toString() - Method in class org.apache.crunch.Tuple4
 
toString() - Method in class org.apache.crunch.TupleN
 
toString() - Method in class org.apache.crunch.types.PGroupedTableType.HoldLastIterator
 
toString() - Method in class org.apache.crunch.types.PGroupedTableType.PTypeIterable
 
toString() - Method in class org.apache.crunch.types.writable.TupleWritable
Convert Tuple to String as in the following.
TotalBytesByIP - Class in org.apache.crunch.examples
 
TotalBytesByIP() - Constructor for class org.apache.crunch.examples.TotalBytesByIP
 
TotalOrderPartitioner<K,V> - Class in org.apache.crunch.lib.sort
A partition-aware Partitioner instance that can work with either Avro or Writable-formatted keys.
TotalOrderPartitioner() - Constructor for class org.apache.crunch.lib.sort.TotalOrderPartitioner
 
TrevniFileReaderFactory<T> - Class in org.apache.crunch.io.avro.trevni
 
TrevniFileReaderFactory(AvroType<T>) - Constructor for class org.apache.crunch.io.avro.trevni.TrevniFileReaderFactory
 
TrevniFileReaderFactory(Schema) - Constructor for class org.apache.crunch.io.avro.trevni.TrevniFileReaderFactory
 
TrevniKeySource<T> - Class in org.apache.crunch.io.avro.trevni
 
TrevniKeySource(Path, AvroType<T>) - Constructor for class org.apache.crunch.io.avro.trevni.TrevniKeySource
 
TrevniKeySource(List<Path>, AvroType<T>) - Constructor for class org.apache.crunch.io.avro.trevni.TrevniKeySource
 
TrevniKeySourceTarget<T> - Class in org.apache.crunch.io.avro.trevni
 
TrevniKeySourceTarget(Path, AvroType<T>) - Constructor for class org.apache.crunch.io.avro.trevni.TrevniKeySourceTarget
 
TrevniKeySourceTarget(Path, AvroType<T>, FileNamingScheme) - Constructor for class org.apache.crunch.io.avro.trevni.TrevniKeySourceTarget
 
TrevniKeyTarget - Class in org.apache.crunch.io.avro.trevni
 
TrevniKeyTarget(String) - Constructor for class org.apache.crunch.io.avro.trevni.TrevniKeyTarget
 
TrevniKeyTarget(Path) - Constructor for class org.apache.crunch.io.avro.trevni.TrevniKeyTarget
 
TrevniKeyTarget(Path, FileNamingScheme) - Constructor for class org.apache.crunch.io.avro.trevni.TrevniKeyTarget
 
TrevniOutputFormat<T> - Class in org.apache.crunch.io.avro.trevni
 
TrevniOutputFormat() - Constructor for class org.apache.crunch.io.avro.trevni.TrevniOutputFormat
 
TrevniReadableData<T> - Class in org.apache.crunch.io.avro.trevni
 
TrevniReadableData(List<Path>, AvroType<T>) - Constructor for class org.apache.crunch.io.avro.trevni.TrevniReadableData
 
TrevniRecordWriter<T> - Class in org.apache.crunch.io.avro.trevni
 
TrevniRecordWriter(TaskAttemptContext) - Constructor for class org.apache.crunch.io.avro.trevni.TrevniRecordWriter
 
tripAggregator(Aggregator<V1>, Aggregator<V2>, Aggregator<V3>) - Static method in class org.apache.crunch.fn.Aggregators
Apply separate aggregators to each component of a Tuple3.
triples(PType<V1>, PType<V2>, PType<V3>) - Static method in class org.apache.crunch.types.avro.Avros
 
triples(PType<V1>, PType<V2>, PType<V3>) - Method in class org.apache.crunch.types.avro.AvroTypeFamily
 
triples(PType<V1>, PType<V2>, PType<V3>) - Method in interface org.apache.crunch.types.PTypeFamily
 
triples(PType<V1>, PType<V2>, PType<V3>) - Static method in class org.apache.crunch.types.writable.Writables
 
triples(PType<V1>, PType<V2>, PType<V3>) - Method in class org.apache.crunch.types.writable.WritableTypeFamily
 
Tuple - Interface in org.apache.crunch
A fixed-size collection of Objects, used in Crunch for representing joins between PCollections.
tuple2PairFunc() - Static method in class org.apache.crunch.impl.spark.GuavaUtils
 
Tuple3<V1,V2,V3> - Class in org.apache.crunch
A convenience class for three-element Tuples.
Tuple3(V1, V2, V3) - Constructor for class org.apache.crunch.Tuple3
 
TUPLE3 - Static variable in class org.apache.crunch.types.TupleFactory
 
Tuple3.Collect<V1,V2,V3> - Class in org.apache.crunch
 
Tuple3.Collect(Collection<V1>, Collection<V2>, Collection<V3>) - Constructor for class org.apache.crunch.Tuple3.Collect
 
Tuple4<V1,V2,V3,V4> - Class in org.apache.crunch
A convenience class for four-element Tuples.
Tuple4(V1, V2, V3, V4) - Constructor for class org.apache.crunch.Tuple4
 
TUPLE4 - Static variable in class org.apache.crunch.types.TupleFactory
 
Tuple4.Collect<V1,V2,V3,V4> - Class in org.apache.crunch
 
Tuple4.Collect(Collection<V1>, Collection<V2>, Collection<V3>, Collection<V4>) - Constructor for class org.apache.crunch.Tuple4.Collect
 
tupleAggregator(Aggregator<?>...) - Static method in class org.apache.crunch.fn.Aggregators
Apply separate aggregators to each component of a Tuple.
TupleDeepCopier<T extends Tuple> - Class in org.apache.crunch.types
Performs deep copies (based on underlying PType deep copying) of Tuple-based objects.
TupleDeepCopier(Class<T>, PType...) - Constructor for class org.apache.crunch.types.TupleDeepCopier
 
TupleFactory<T extends Tuple> - Class in org.apache.crunch.types
 
TupleFactory() - Constructor for class org.apache.crunch.types.TupleFactory
 
TupleN - Class in org.apache.crunch
A Tuple instance for an arbitrary number of values.
TupleN(Object...) - Constructor for class org.apache.crunch.TupleN
 
TUPLEN - Static variable in class org.apache.crunch.types.TupleFactory
 
tuples(PType...) - Static method in class org.apache.crunch.types.avro.Avros
 
tuples(Class<T>, PType...) - Static method in class org.apache.crunch.types.avro.Avros
 
tuples(PType<?>...) - Method in class org.apache.crunch.types.avro.AvroTypeFamily
 
tuples(Class<T>, PType<?>...) - Method in class org.apache.crunch.types.avro.AvroTypeFamily
 
tuples(PType<?>...) - Method in interface org.apache.crunch.types.PTypeFamily
 
tuples(Class<T>, PType<?>...) - Method in interface org.apache.crunch.types.PTypeFamily
 
tuples(PType...) - Static method in class org.apache.crunch.types.writable.Writables
 
tuples(Class<T>, PType...) - Static method in class org.apache.crunch.types.writable.Writables
 
tuples(PType<?>...) - Method in class org.apache.crunch.types.writable.WritableTypeFamily
 
tuples(Class<T>, PType<?>...) - Method in class org.apache.crunch.types.writable.WritableTypeFamily
 
Tuples - Class in org.apache.crunch.util
Utilities for working with subclasses of the Tuple interface.
Tuples() - Constructor for class org.apache.crunch.util.Tuples
 
Tuples.PairIterable<S,T> - Class in org.apache.crunch.util
 
Tuples.PairIterable(Iterable<S>, Iterable<T>) - Constructor for class org.apache.crunch.util.Tuples.PairIterable
 
Tuples.QuadIterable<A,B,C,D> - Class in org.apache.crunch.util
 
Tuples.QuadIterable(Iterable<A>, Iterable<B>, Iterable<C>, Iterable<D>) - Constructor for class org.apache.crunch.util.Tuples.QuadIterable
 
Tuples.TripIterable<A,B,C> - Class in org.apache.crunch.util
 
Tuples.TripIterable(Iterable<A>, Iterable<B>, Iterable<C>) - Constructor for class org.apache.crunch.util.Tuples.TripIterable
 
Tuples.TupleNIterable - Class in org.apache.crunch.util
 
Tuples.TupleNIterable(Iterable<?>...) - Constructor for class org.apache.crunch.util.Tuples.TupleNIterable
 
TupleWritable - Class in org.apache.crunch.types.writable
A straight copy of the TupleWritable implementation in the join package, added here because of its package visibility restrictions.
TupleWritable() - Constructor for class org.apache.crunch.types.writable.TupleWritable
Create an empty tuple with no allocated storage for writables.
TupleWritable(BytesWritable[]) - Constructor for class org.apache.crunch.types.writable.TupleWritable
Initialize tuple with storage; unknown whether any of them contain "written" values.
TupleWritableComparator - Class in org.apache.crunch.lib.sort
 
TupleWritableComparator() - Constructor for class org.apache.crunch.lib.sort.TupleWritableComparator
 
type - Variable in class org.apache.crunch.impl.dist.collect.BaseDoTable
 
typedCollectionOf(PType<T>, T...) - Static method in class org.apache.crunch.impl.mem.MemPipeline
 
typedCollectionOf(PType<T>, Iterable<T>) - Static method in class org.apache.crunch.impl.mem.MemPipeline
 
typedTableOf(PTableType<S, T>, S, T, Object...) - Static method in class org.apache.crunch.impl.mem.MemPipeline
 
typedTableOf(PTableType<S, T>, Iterable<Pair<S, T>>) - Static method in class org.apache.crunch.impl.mem.MemPipeline
 

U

ungroup() - Method in class org.apache.crunch.impl.dist.collect.BaseGroupedTable
 
ungroup() - Method in interface org.apache.crunch.PGroupedTable
Convert this grouping back into a multimap.
union(PCollection<S>) - Method in class org.apache.crunch.impl.dist.collect.PCollectionImpl
 
union(PCollection<S>...) - Method in class org.apache.crunch.impl.dist.collect.PCollectionImpl
 
union(PTable<K, V>) - Method in class org.apache.crunch.impl.dist.collect.PTableBase
 
union(PTable<K, V>...) - Method in class org.apache.crunch.impl.dist.collect.PTableBase
 
union(PCollection<S>) - Method in class org.apache.crunch.impl.mem.collect.MemCollection
 
union(PCollection<S>...) - Method in class org.apache.crunch.impl.mem.collect.MemCollection
 
union(PTable<K, V>) - Method in class org.apache.crunch.impl.mem.collect.MemTable
 
union(PTable<K, V>...) - Method in class org.apache.crunch.impl.mem.collect.MemTable
 
union(PCollection<S>) - Method in interface org.apache.crunch.PCollection
Returns a PCollection instance that acts as the union of this PCollection and the given PCollection.
union(PCollection<S>...) - Method in interface org.apache.crunch.PCollection
Returns a PCollection instance that acts as the union of this PCollection and the input PCollections.
union(PTable<K, V>) - Method in interface org.apache.crunch.PTable
Returns a PTable instance that acts as the union of this PTable and the other PTable.
union(PTable<K, V>...) - Method in interface org.apache.crunch.PTable
Returns a PTable instance that acts as the union of this PTable and the input PTables.
UnionCollection<S> - Class in org.apache.crunch.impl.mr.collect
 
UnionCollection<S> - Class in org.apache.crunch.impl.spark.collect
 
UnionReadableData<T> - Class in org.apache.crunch.util
 
UnionReadableData(List<ReadableData<T>>) - Constructor for class org.apache.crunch.util.UnionReadableData
 
UnionTable<K,V> - Class in org.apache.crunch.impl.mr.collect
 
UnionTable<K,V> - Class in org.apache.crunch.impl.spark.collect
 
UNIQUE_ELEMENTS() - Static method in class org.apache.crunch.fn.Aggregators
Collect the unique elements of the input, as defined by the equals method for the input objects.
update(T) - Method in interface org.apache.crunch.Aggregator
Incorporate the given value into the aggregate state maintained by this instance.
useDisk(boolean) - Method in class org.apache.crunch.CachingOptions.Builder
 
useDisk() - Method in class org.apache.crunch.CachingOptions
Whether the framework may cache data on disk.
useMemory(boolean) - Method in class org.apache.crunch.CachingOptions.Builder
 
useMemory() - Method in class org.apache.crunch.CachingOptions
Whether the framework may cache data in memory without writing it to disk.
UTF8_TO_STRING - Static variable in class org.apache.crunch.types.avro.Avros
 
uuid(PTypeFamily) - Static method in class org.apache.crunch.types.PTypes
 

V

value - Variable in class org.apache.crunch.impl.spark.ByteArray
 
valueOf(String) - Static method in enum org.apache.crunch.impl.mr.MRJob.State
Returns the enum constant of this type with the specified name.
valueOf(String) - Static method in enum org.apache.crunch.impl.mr.run.NodeContext
Returns the enum constant of this type with the specified name.
valueOf(String) - Static method in enum org.apache.crunch.lib.join.JoinType
Returns the enum constant of this type with the specified name.
valueOf(String) - Static method in enum org.apache.crunch.lib.Sort.Order
Returns the enum constant of this type with the specified name.
valueOf(String) - Static method in enum org.apache.crunch.PipelineExecution.Status
Returns the enum constant of this type with the specified name.
valueOf(String) - Static method in enum org.apache.crunch.Target.WriteMode
Returns the enum constant of this type with the specified name.
valueOf(String) - Static method in enum org.apache.crunch.types.avro.AvroMode
Returns the enum constant of this type with the specified name.
values() - Method in class org.apache.crunch.impl.dist.collect.PTableBase
 
values() - Method in class org.apache.crunch.impl.mem.collect.MemTable
 
values() - Static method in enum org.apache.crunch.impl.mr.MRJob.State
Returns an array containing the constants of this enum type, in the order they are declared.
values() - Static method in enum org.apache.crunch.impl.mr.run.NodeContext
Returns an array containing the constants of this enum type, in the order they are declared.
values() - Static method in enum org.apache.crunch.lib.join.JoinType
Returns an array containing the constants of this enum type, in the order they are declared.
values(PTable<K, V>) - Static method in class org.apache.crunch.lib.PTables
Extract the values from the given PTable<K, V> as a PCollection<V>.
values() - Static method in enum org.apache.crunch.lib.Sort.Order
Returns an array containing the constants of this enum type, in the order they are declared.
values() - Static method in enum org.apache.crunch.PipelineExecution.Status
Returns an array containing the constants of this enum type, in the order they are declared.
values() - Method in interface org.apache.crunch.PTable
Returns a PCollection made up of the values in this PTable.
values() - Static method in enum org.apache.crunch.Target.WriteMode
Returns an array containing the constants of this enum type, in the order they are declared.
values() - Static method in enum org.apache.crunch.types.avro.AvroMode
Returns an array containing the constants of this enum type, in the order they are declared.
visitDoCollection(BaseDoCollection<?>) - Method in interface org.apache.crunch.impl.dist.collect.PCollectionImpl.Visitor
 
visitDoTable(BaseDoTable<?, ?>) - Method in interface org.apache.crunch.impl.dist.collect.PCollectionImpl.Visitor
 
visitGroupedTable(BaseGroupedTable<?, ?>) - Method in interface org.apache.crunch.impl.dist.collect.PCollectionImpl.Visitor
 
visitInputCollection(BaseInputCollection<?>) - Method in interface org.apache.crunch.impl.dist.collect.PCollectionImpl.Visitor
 
visitUnionCollection(BaseUnionCollection<?>) - Method in interface org.apache.crunch.impl.dist.collect.PCollectionImpl.Visitor
 

W

waitFor(long, TimeUnit) - Method in class org.apache.crunch.impl.mr.exec.MRExecutor
 
waitFor(long, TimeUnit) - Method in class org.apache.crunch.impl.spark.SparkRuntime
 
waitFor(long, TimeUnit) - Method in interface org.apache.crunch.PipelineExecution
Blocks until the pipeline completes or the specified waiting time elapses.
waitUntilDone() - Method in class org.apache.crunch.impl.mr.exec.MRExecutor
 
waitUntilDone() - Method in class org.apache.crunch.impl.spark.SparkRuntime
 
waitUntilDone() - Method in interface org.apache.crunch.PipelineExecution
Blocks until the pipeline completes.
wasLogged() - Method in exception org.apache.crunch.CrunchRuntimeException
Returns true if this exception was written to the debug logs.
weightedReservoirSample(PCollection<Pair<T, N>>, int) - Static method in class org.apache.crunch.lib.Sample
Selects a weighted sample of the elements of the given PCollection, where the second term in the input Pair is a numerical weight.
weightedReservoirSample(PCollection<Pair<T, N>>, int, Long) - Static method in class org.apache.crunch.lib.Sample
The weighted reservoir sampling function with the seed term exposed for testing purposes.
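
For readers unfamiliar with weighted reservoir sampling, the sketch below shows one well-known variant in plain Java: each element receives a random key derived from its weight, and the elements with the largest keys are retained. This is an illustration only. The class WeightedReservoirSketch and its method are hypothetical, the index does not say which algorithm Crunch uses, and the explicit seed parameter simply mirrors the seeded overload listed above.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;
import java.util.Random;

// Hypothetical illustration class; not part of the Crunch API.
class WeightedReservoirSketch {
    // Pairs an item with the random key drawn for it.
    private static final class Scored<T> {
        final T item;
        final double key;

        Scored(T item, double key) {
            this.item = item;
            this.key = key;
        }
    }

    // Returns up to sampleSize items; heavier items are more likely to be chosen.
    // Items with a non-positive weight are never selected.
    static <T> List<T> sample(List<T> items, List<Double> weights, int sampleSize, long seed) {
        Random random = new Random(seed);
        Comparator<Scored<T>> byKey = Comparator.comparingDouble(s -> s.key);
        PriorityQueue<Scored<T>> heap = new PriorityQueue<>(byKey); // min-heap on key
        for (int i = 0; i < items.size(); i++) {
            double weight = weights.get(i);
            if (weight <= 0) {
                continue;
            }
            // log(u) / w orders items the same way as u^(1/w) but avoids underflow.
            double key = Math.log(random.nextDouble()) / weight;
            heap.add(new Scored<>(items.get(i), key));
            if (heap.size() > sampleSize) {
                heap.poll(); // evict the smallest key
            }
        }
        List<T> out = new ArrayList<>();
        for (Scored<T> scored : heap) {
            out.add(scored.item);
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> items = List.of("rare", "common", "never");
        List<Double> weights = List.of(1.0, 50.0, 0.0);
        System.out.println(sample(items, weights, 1, 42L));
    }
}
```

Fixing the seed makes the result reproducible, which is the reason a seeded overload is convenient for tests.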
WordAggregationHBase - Class in org.apache.crunch.examples
Requires a running HBase instance.
WordAggregationHBase() - Constructor for class org.apache.crunch.examples.WordAggregationHBase
 
WordCount - Class in org.apache.crunch.examples
 
WordCount() - Constructor for class org.apache.crunch.examples.WordCount
 
WritableDeepCopier<T extends org.apache.hadoop.io.Writable> - Class in org.apache.crunch.types.writable
Performs deep copies of Writable values.
WritableDeepCopier(Class<T>) - Constructor for class org.apache.crunch.types.writable.WritableDeepCopier
 
writables(Class<T>) - Static method in class org.apache.crunch.types.avro.Avros
 
Writables - Class in org.apache.crunch.types.writable
Defines static methods that are analogous to the methods defined in WritableTypeFamily for convenient static importing.
writables(Class<W>) - Static method in class org.apache.crunch.types.writable.Writables
 
writables(Class<W>) - Method in class org.apache.crunch.types.writable.WritableTypeFamily
 
WritableSerDe - Class in org.apache.crunch.impl.spark.serde
 
WritableSerDe(Class<? extends Writable>) - Constructor for class org.apache.crunch.impl.spark.serde.WritableSerDe
 
WritableType<T,W extends org.apache.hadoop.io.Writable> - Class in org.apache.crunch.types.writable
 
WritableType(Class<T>, Class<W>, MapFn<W, T>, MapFn<T, W>, PType...) - Constructor for class org.apache.crunch.types.writable.WritableType
 
WritableTypeFamily - Class in org.apache.crunch.types.writable
The Writable-based implementation of the PTypeFamily interface.
write(DataOutput) - Method in class org.apache.crunch.contrib.io.jdbc.IdentifiableName
 
write(PreparedStatement) - Method in class org.apache.crunch.contrib.io.jdbc.IdentifiableName
 
write(Target) - Method in class org.apache.crunch.impl.dist.collect.PCollectionImpl
 
write(Target, Target.WriteMode) - Method in class org.apache.crunch.impl.dist.collect.PCollectionImpl
 
write(Target) - Method in class org.apache.crunch.impl.dist.collect.PTableBase
 
write(Target, Target.WriteMode) - Method in class org.apache.crunch.impl.dist.collect.PTableBase
 
write(PCollection<?>, Target) - Method in class org.apache.crunch.impl.dist.DistributedPipeline
 
write(PCollection<?>, Target, Target.WriteMode) - Method in class org.apache.crunch.impl.dist.DistributedPipeline
 
write(Target) - Method in class org.apache.crunch.impl.mem.collect.MemCollection
 
write(Target, Target.WriteMode) - Method in class org.apache.crunch.impl.mem.collect.MemCollection
 
write(Target) - Method in class org.apache.crunch.impl.mem.collect.MemTable
 
write(Target, Target.WriteMode) - Method in class org.apache.crunch.impl.mem.collect.MemTable
 
write(PCollection<?>, Target) - Method in class org.apache.crunch.impl.mem.MemPipeline
 
write(PCollection<?>, Target, Target.WriteMode) - Method in class org.apache.crunch.impl.mem.MemPipeline
 
write(AvroKey<T>, NullWritable) - Method in class org.apache.crunch.io.avro.trevni.TrevniRecordWriter
 
write(String, K, V) - Method in class org.apache.crunch.io.CrunchOutputs
 
write(DataOutput) - Method in class org.apache.crunch.io.FormatBundle
 
write(Target) - Method in interface org.apache.crunch.PCollection
Write the contents of this PCollection to the given Target, using the storage format specified by the target.
write(Target, Target.WriteMode) - Method in interface org.apache.crunch.PCollection
Write the contents of this PCollection to the given Target, using the given Target.WriteMode to handle existing targets.
write(PCollection<?>, Target) - Method in interface org.apache.crunch.Pipeline
Write the given collection to the given target on the next pipeline run.
write(PCollection<?>, Target, Target.WriteMode) - Method in interface org.apache.crunch.Pipeline
Write the contents of the PCollection to the given Target, using the storage format specified by the target and the given WriteMode for cases where the referenced Target already exists.
write(Target) - Method in interface org.apache.crunch.PTable
Writes this PTable to the given Target.
write(Target, Target.WriteMode) - Method in interface org.apache.crunch.PTable
Writes this PTable to the given Target, using the given Target.WriteMode to handle existing targets.
write(DataOutput) - Method in class org.apache.crunch.types.writable.TupleWritable
Writes each Writable to out.
write(PCollection<?>, Target) - Method in class org.apache.crunch.util.CrunchTool
 
write(Configuration, Path, Object) - Static method in class org.apache.crunch.util.DistCache
 
writePutsToHFilesForIncrementalLoad(PCollection<Put>, HTable, Path) - Static method in class org.apache.crunch.io.hbase.HFileUtils
 
writer - Variable in class org.apache.crunch.io.avro.trevni.TrevniRecordWriter
Trevni file writer.
writeTextFile(PCollection<T>, String) - Method in class org.apache.crunch.impl.dist.DistributedPipeline
 
writeTextFile(PCollection<T>, String) - Method in class org.apache.crunch.impl.mem.MemPipeline
 
writeTextFile(PCollection<T>, String) - Method in interface org.apache.crunch.Pipeline
A convenience method for writing a text file.
writeTextFile(PCollection<?>, String) - Method in class org.apache.crunch.util.CrunchTool
 
writeToHFilesForIncrementalLoad(PCollection<KeyValue>, HTable, Path) - Static method in class org.apache.crunch.io.hbase.HFileUtils
 

X

xboolean() - Static method in class org.apache.crunch.contrib.text.Extractors
Returns an Extractor for booleans.
xboolean(Boolean) - Static method in class org.apache.crunch.contrib.text.Extractors
 
xcollect(TokenizerFactory, Extractor<T>) - Static method in class org.apache.crunch.contrib.text.Extractors
 
xcustom(Class<T>, TokenizerFactory, Extractor...) - Static method in class org.apache.crunch.contrib.text.Extractors
Returns an Extractor for a subclass of Tuple whose constructor takes the given extractor types, using the given TokenizerFactory to parse the sub-fields.
xdouble() - Static method in class org.apache.crunch.contrib.text.Extractors
Returns an Extractor for doubles.
xdouble(Double) - Static method in class org.apache.crunch.contrib.text.Extractors
 
xfloat() - Static method in class org.apache.crunch.contrib.text.Extractors
Returns an Extractor for floats.
xfloat(Float) - Static method in class org.apache.crunch.contrib.text.Extractors
 
xint() - Static method in class org.apache.crunch.contrib.text.Extractors
Returns an Extractor for integers.
xint(Integer) - Static method in class org.apache.crunch.contrib.text.Extractors
Returns an Extractor for integers.
xlong() - Static method in class org.apache.crunch.contrib.text.Extractors
Returns an Extractor for longs.
xlong(Long) - Static method in class org.apache.crunch.contrib.text.Extractors
Returns an Extractor for longs.
xpair(TokenizerFactory, Extractor<K>, Extractor<V>) - Static method in class org.apache.crunch.contrib.text.Extractors
Returns an Extractor for pairs of the given types that uses the given TokenizerFactory for parsing the sub-fields.
xquad(TokenizerFactory, Extractor<A>, Extractor<B>, Extractor<C>, Extractor<D>) - Static method in class org.apache.crunch.contrib.text.Extractors
Returns an Extractor for quads of the given types that uses the given TokenizerFactory for parsing the sub-fields.
xstring() - Static method in class org.apache.crunch.contrib.text.Extractors
Returns an Extractor for strings.
xstring(String) - Static method in class org.apache.crunch.contrib.text.Extractors
 
xtriple(TokenizerFactory, Extractor<A>, Extractor<B>, Extractor<C>) - Static method in class org.apache.crunch.contrib.text.Extractors
Returns an Extractor for triples of the given types that uses the given TokenizerFactory for parsing the sub-fields.
xtupleN(TokenizerFactory, Extractor...) - Static method in class org.apache.crunch.contrib.text.Extractors
Returns an Extractor for an arbitrary number of types that uses the given TokenizerFactory for parsing the sub-fields.

Z

zero(Map<String, Long>) - Method in class org.apache.crunch.impl.spark.CounterAccumulatorParam
 

Copyright © 2014 The Apache Software Foundation. All Rights Reserved.