This project has retired. For details please refer to its Attic page.
Index (Apache Crunch 0.8.0 API)

A

AbstractCompositeExtractor<T> - Class in org.apache.crunch.contrib.text
Base class for Extractor instances that delegates the parsing of fields to other Extractor instances, primarily used for constructing composite records that implement the Tuple interface.
AbstractCompositeExtractor(TokenizerFactory, List<Extractor<?>>) - Constructor for class org.apache.crunch.contrib.text.AbstractCompositeExtractor
 
AbstractSimpleExtractor<T> - Class in org.apache.crunch.contrib.text
Base class for common-case Extractor instances that construct a single object from a block of text stored in a String, with support for error handling and reporting.
accept(T) - Method in class org.apache.crunch.FilterFn
If true, emit the given record.
accept(OutputHandler, PType<?>) - Method in interface org.apache.crunch.Target
Checks to see if this Target instance is compatible with the given PType.
ACCEPT_ALL() - Static method in class org.apache.crunch.fn.FilterFns
Accept everything.
addCacheFile(Path, Configuration) - Static method in class org.apache.crunch.util.DistCache
 
addInputPath(Job, Path, FormatBundle, int) - Static method in class org.apache.crunch.io.CrunchInputs
 
addJarDirToDistributedCache(Configuration, File) - Static method in class org.apache.crunch.util.DistCache
Adds all jars under the specified directory to the distributed cache of jobs using the provided configuration.
addJarDirToDistributedCache(Configuration, String) - Static method in class org.apache.crunch.util.DistCache
Adds all jars under the directory at the specified path to the distributed cache of jobs using the provided configuration.
addJarToDistributedCache(Configuration, File) - Static method in class org.apache.crunch.util.DistCache
Adds the specified jar to the distributed cache of jobs using the provided configuration.
addJarToDistributedCache(Configuration, String) - Static method in class org.apache.crunch.util.DistCache
Adds the jar at the specified path to the distributed cache of jobs using the provided configuration.
addNamedOutput(Job, String, Class<? extends OutputFormat>, Class, Class) - Static method in class org.apache.crunch.io.CrunchOutputs
 
addNamedOutput(Job, String, FormatBundle<? extends OutputFormat>, Class, Class) - Static method in class org.apache.crunch.io.CrunchOutputs
 
Aggregate - Class in org.apache.crunch.lib
Methods for performing various types of aggregations over PCollection instances.
Aggregate() - Constructor for class org.apache.crunch.lib.Aggregate
 
Aggregate.PairValueComparator<K,V> - Class in org.apache.crunch.lib
 
Aggregate.PairValueComparator(boolean) - Constructor for class org.apache.crunch.lib.Aggregate.PairValueComparator
 
Aggregate.TopKCombineFn<K,V> - Class in org.apache.crunch.lib
 
Aggregate.TopKCombineFn(int, boolean) - Constructor for class org.apache.crunch.lib.Aggregate.TopKCombineFn
 
Aggregate.TopKFn<K,V> - Class in org.apache.crunch.lib
 
Aggregate.TopKFn(int, boolean) - Constructor for class org.apache.crunch.lib.Aggregate.TopKFn
 
Aggregator<T> - Interface in org.apache.crunch
Aggregate a sequence of values into a possibly smaller sequence of the same type.
Aggregators - Class in org.apache.crunch.fn
A collection of pre-defined Aggregators.
Aggregators.SimpleAggregator<T> - Class in org.apache.crunch.fn
Base class for aggregators that do not require any initialization.
Aggregators.SimpleAggregator() - Constructor for class org.apache.crunch.fn.Aggregators.SimpleAggregator
 
and(FilterFn<S>, FilterFn<S>) - Static method in class org.apache.crunch.fn.FilterFns
Accept an entry if all of the given filters accept it, using short-circuit evaluation.
and(FilterFn<S>...) - Static method in class org.apache.crunch.fn.FilterFns
Accept an entry if all of the given filters accept it, using short-circuit evaluation.
apply(Statement, Description) - Method in class org.apache.crunch.test.TemporaryPath
 
as(PType<T>) - Method in class org.apache.crunch.types.avro.AvroTypeFamily
 
as(PType<T>) - Method in interface org.apache.crunch.types.PTypeFamily
Returns the equivalent of the given ptype for this family, if it exists.
as(PType<T>) - Method in class org.apache.crunch.types.writable.WritableTypeFamily
 
asCollection() - Method in interface org.apache.crunch.PCollection
 
asMap() - Method in interface org.apache.crunch.PTable
Returns a PObject encapsulating a Map made up of the keys and values in this PTable.
asPTable(PCollection<Pair<K, V>>) - Static method in class org.apache.crunch.lib.PTables
Convert the given PCollection<Pair<K, V>> to a PTable<K, V>.
asReadable() - Method in interface org.apache.crunch.io.ReadableSource
 
asReadable(boolean) - Method in interface org.apache.crunch.PCollection
 
asSourceTarget(PType<T>) - Method in interface org.apache.crunch.Target
Attempt to create the SourceTarget type that corresponds to this Target for the given PType, if possible.
At - Class in org.apache.crunch.io
Static factory methods for creating common SourceTarget types, which may be treated as both a Source and a Target.
At() - Constructor for class org.apache.crunch.io.At
 
AverageBytesByIP - Class in org.apache.crunch.examples
 
AverageBytesByIP() - Constructor for class org.apache.crunch.examples.AverageBytesByIP
 
avroFile(String, Class<T>) - Static method in class org.apache.crunch.io.At
Creates a SourceTarget<T> instance from the Avro file(s) at the given path name.
avroFile(Path, Class<T>) - Static method in class org.apache.crunch.io.At
Creates a SourceTarget<T> instance from the Avro file(s) at the given Path.
avroFile(String, AvroType<T>) - Static method in class org.apache.crunch.io.At
Creates a SourceTarget<T> instance from the Avro file(s) at the given path name.
avroFile(Path, AvroType<T>) - Static method in class org.apache.crunch.io.At
Creates a SourceTarget<T> instance from the Avro file(s) at the given Path.
avroFile(String, Class<T>) - Static method in class org.apache.crunch.io.From
Creates a Source<T> instance from the Avro file(s) at the given path name.
avroFile(Path, Class<T>) - Static method in class org.apache.crunch.io.From
Creates a Source<T> instance from the Avro file(s) at the given Path.
avroFile(String, AvroType<T>) - Static method in class org.apache.crunch.io.From
Creates a Source<T> instance from the Avro file(s) at the given path name.
avroFile(Path, AvroType<T>) - Static method in class org.apache.crunch.io.From
Creates a Source<T> instance from the Avro file(s) at the given Path.
avroFile(String) - Static method in class org.apache.crunch.io.To
Creates a Target at the given path name that writes data to Avro files.
avroFile(Path) - Static method in class org.apache.crunch.io.To
Creates a Target at the given Path that writes data to Avro files.
AvroInputFormat<T> - Class in org.apache.crunch.types.avro
An InputFormat for Avro data files.
AvroInputFormat() - Constructor for class org.apache.crunch.types.avro.AvroInputFormat
 
AvroOutputFormat<T> - Class in org.apache.crunch.types.avro
An OutputFormat for Avro data files.
AvroOutputFormat() - Constructor for class org.apache.crunch.types.avro.AvroOutputFormat
 
Avros - Class in org.apache.crunch.types.avro
Defines static methods that are analogous to the methods defined in AvroTypeFamily for convenient static importing.
AvroTextOutputFormat<K,V> - Class in org.apache.crunch.types.avro
 
AvroTextOutputFormat() - Constructor for class org.apache.crunch.types.avro.AvroTextOutputFormat
 
AvroType<T> - Class in org.apache.crunch.types.avro
The implementation of the PType interface for Avro-based serialization.
AvroType(Class<T>, Schema, DeepCopier<T>, PType...) - Constructor for class org.apache.crunch.types.avro.AvroType
 
AvroType(Class<T>, Schema, MapFn, MapFn, DeepCopier<T>, PType...) - Constructor for class org.apache.crunch.types.avro.AvroType
 
AvroTypeFamily - Class in org.apache.crunch.types.avro
 
AvroUtf8InputFormat - Class in org.apache.crunch.types.avro
An InputFormat for text files.
AvroUtf8InputFormat() - Constructor for class org.apache.crunch.types.avro.AvroUtf8InputFormat
 
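The and(...) entries in FilterFns above describe accepting an entry only if every filter accepts it, using short-circuit evaluation. A minimal plain-Java sketch of that behavior follows. It uses java.util.function.Predicate as a stand-in for FilterFn, and the class name ShortCircuitAndSketch is hypothetical; it illustrates the semantics only and is not Crunch's implementation.

```java
import java.util.function.Predicate;

// Hypothetical illustration of short-circuit "and" over filters.
// Predicate stands in for FilterFn; none of this is Crunch code.
final class ShortCircuitAndSketch {

    // Returns a predicate that accepts a value only if every filter accepts it.
    // Evaluation stops at the first filter that rejects the value, so later
    // filters are never invoked for that value.
    @SafeVarargs
    static <T> Predicate<T> and(Predicate<T>... filters) {
        return value -> {
            for (Predicate<T> filter : filters) {
                if (!filter.test(value)) {
                    return false; // short-circuit: skip the remaining filters
                }
            }
            return true;
        };
    }

    public static void main(String[] args) {
        Predicate<Integer> positive = x -> x > 0;
        Predicate<Integer> even = x -> x % 2 == 0;
        Predicate<Integer> both = and(positive, even);
        System.out.println(both.test(4));
        System.out.println(both.test(-2));
    }
}
```

Short-circuiting matters when a later filter is expensive: ordering cheap, selective filters first avoids running the costly ones on records that are already rejected.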

B

bigInt(PTypeFamily) - Static method in class org.apache.crunch.types.PTypes
 
BIGINT_TO_BYTE - Static variable in class org.apache.crunch.types.PTypes
 
BloomFilterFactory - Class in org.apache.crunch.contrib.bloomfilter
Factory class for creating BloomFilters.
BloomFilterFactory() - Constructor for class org.apache.crunch.contrib.bloomfilter.BloomFilterFactory
 
BloomFilterFn<S> - Class in org.apache.crunch.contrib.bloomfilter
This class is responsible for generating keys that are used in a BloomFilter.
BloomFilterFn() - Constructor for class org.apache.crunch.contrib.bloomfilter.BloomFilterFn
 
BloomFilterJoinStrategy<K,U,V> - Class in org.apache.crunch.lib.join
Join strategy that uses a Bloom filter trained on the keys of the left-side table to filter the key/value pairs of the right-side table before sending them through the shuffle and reduce phases.
BloomFilterJoinStrategy(int) - Constructor for class org.apache.crunch.lib.join.BloomFilterJoinStrategy
Instantiate with the expected number of unique keys in the left table.
BloomFilterJoinStrategy(int, float) - Constructor for class org.apache.crunch.lib.join.BloomFilterJoinStrategy
Instantiate with the expected number of unique keys in the left table and the acceptable false positive rate for the Bloom filter.
BloomFilterJoinStrategy(int, float, JoinStrategy<K, U, V>) - Constructor for class org.apache.crunch.lib.join.BloomFilterJoinStrategy
Instantiate with the expected number of unique keys in the left table, the acceptable false positive rate for the Bloom filter, and an underlying join strategy to delegate to.
booleans() - Static method in class org.apache.crunch.types.avro.Avros
 
booleans() - Method in class org.apache.crunch.types.avro.AvroTypeFamily
 
booleans() - Method in interface org.apache.crunch.types.PTypeFamily
 
booleans() - Static method in class org.apache.crunch.types.writable.Writables
 
booleans() - Method in class org.apache.crunch.types.writable.WritableTypeFamily
 
bottom(int) - Method in interface org.apache.crunch.PTable
Returns a PTable made up of the pairs in this PTable with the smallest value field.
build() - Method in class org.apache.crunch.contrib.text.TokenizerFactory.Builder
Returns a new TokenizerFactory with settings determined by this Builder instance.
build() - Method in class org.apache.crunch.GroupingOptions.Builder
 
build() - Method in class org.apache.crunch.ParallelDoOptions.Builder
 
builder() - Static method in class org.apache.crunch.contrib.text.TokenizerFactory
Factory method for creating a TokenizerFactory.Builder instance.
builder() - Static method in class org.apache.crunch.GroupingOptions
 
builder() - Static method in class org.apache.crunch.ParallelDoOptions
 
by(int, Sort.Order) - Static method in class org.apache.crunch.lib.Sort.ColumnOrder
 
by(MapFn<S, K>, PType<K>) - Method in interface org.apache.crunch.PCollection
Apply the given map function to each element of this instance in order to create a PTable.
by(String, MapFn<S, K>, PType<K>) - Method in interface org.apache.crunch.PCollection
Apply the given map function to each element of this instance in order to create a PTable.
BYTE_TO_BIGINT - Static variable in class org.apache.crunch.types.PTypes
 
bytes() - Static method in class org.apache.crunch.types.avro.Avros
 
bytes() - Method in class org.apache.crunch.types.avro.AvroTypeFamily
 
bytes() - Method in interface org.apache.crunch.types.PTypeFamily
 
bytes() - Static method in class org.apache.crunch.types.writable.Writables
 
bytes() - Method in class org.apache.crunch.types.writable.WritableTypeFamily
 
BYTES_IN - Static variable in class org.apache.crunch.types.avro.Avros
 
BYTES_PER_REDUCE_TASK - Static variable in class org.apache.crunch.util.PartitionUtils
 
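The BloomFilterJoinStrategy entries above describe training a Bloom filter on the left table's keys and using it to drop right-side pairs before the shuffle. The sketch below shows that general idea in plain Java, under stated assumptions: the class BloomPrefilterSketch, its double-hashing scheme, and the prefilter helper are all hypothetical, and the sizing uses the textbook formulas rather than anything confirmed about Crunch's internals.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;
import java.util.Map;

// Hypothetical illustration of Bloom-filter pre-filtering for a join.
// This is not Crunch code and not Hadoop's BloomFilter class.
final class BloomPrefilterSketch {
    private final BitSet bits;
    private final int numBits;
    private final int numHashes;

    BloomPrefilterSketch(int expectedKeys, double falsePositiveRate) {
        // Textbook sizing: m = -n ln p / (ln 2)^2 bits, k = (m / n) ln 2 hashes.
        double ln2 = Math.log(2);
        this.numBits = Math.max(1,
            (int) Math.ceil(-expectedKeys * Math.log(falsePositiveRate) / (ln2 * ln2)));
        this.numHashes = Math.max(1,
            (int) Math.round((double) numBits / expectedKeys * ln2));
        this.bits = new BitSet(numBits);
    }

    // Double hashing: derive the i-th bit index from two base hashes.
    private int index(Object key, int i) {
        int h1 = key.hashCode();
        int h2 = Integer.rotateLeft(h1 * 0x9E3779B9, 15) | 1;
        return Math.floorMod(h1 + i * h2, numBits);
    }

    void add(Object key) {
        for (int i = 0; i < numHashes; i++) {
            bits.set(index(key, i));
        }
    }

    // May return true for a key that was never added (false positive),
    // but never returns false for a key that was added.
    boolean mightContain(Object key) {
        for (int i = 0; i < numHashes; i++) {
            if (!bits.get(index(key, i))) {
                return false;
            }
        }
        return true;
    }

    // Train on the left keys, then keep only the right-side pairs whose key
    // might appear on the left. Surviving pairs still need a real join.
    static <K, V> List<Map.Entry<K, V>> prefilter(
            Iterable<K> leftKeys, List<Map.Entry<K, V>> right,
            int expectedKeys, double falsePositiveRate) {
        BloomPrefilterSketch filter =
            new BloomPrefilterSketch(expectedKeys, falsePositiveRate);
        for (K key : leftKeys) {
            filter.add(key);
        }
        List<Map.Entry<K, V>> kept = new ArrayList<>();
        for (Map.Entry<K, V> pair : right) {
            if (filter.mightContain(pair.getKey())) {
                kept.add(pair);
            }
        }
        return kept;
    }
}
```

Because a Bloom filter has no false negatives, no matching right-side pair is lost. A lower false positive rate costs more bits but lets fewer non-matching pairs through to the join.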

C

CAN_COMBINE_SPECIFIC_AND_REFLECT_SCHEMAS - Static variable in class org.apache.crunch.types.avro.Avros
Older versions of Avro (i.e., before 1.7.0) do not support schemas that are composed of a mix of specific and reflection-based schemas.
Cartesian - Class in org.apache.crunch.lib
Utilities for Cartesian products of two PTable or PCollection instances.
Cartesian() - Constructor for class org.apache.crunch.lib.Cartesian
 
Channels - Class in org.apache.crunch.lib
Utilities for splitting Pair instances emitted by DoFn into separate PCollection instances.
Channels() - Constructor for class org.apache.crunch.lib.Channels
 
checkCombiningSpecificAndReflectionSchemas() - Static method in class org.apache.crunch.types.avro.Avros
 
cleanup(Emitter<Pair<String, BloomFilter>>) - Method in class org.apache.crunch.contrib.bloomfilter.BloomFilterFn
 
cleanup(Emitter<T>) - Method in class org.apache.crunch.DoFn
Called during the cleanup of the MapReduce job this DoFn is associated with.
cleanup(Emitter<T>) - Method in class org.apache.crunch.FilterFn
 
cleanup() - Method in class org.apache.crunch.FilterFn
Called during the cleanup of the MapReduce job this FilterFn is associated with.
cleanup(Emitter<T>) - Method in class org.apache.crunch.fn.CompositeMapFn
 
cleanup(Emitter<Pair<S, T>>) - Method in class org.apache.crunch.fn.PairMapFn
 
cleanup(Emitter<Pair<Integer, Pair<K, V>>>) - Method in class org.apache.crunch.lib.Aggregate.TopKFn
 
cleanup(Emitter<Pair<K, Pair<U, V>>>) - Method in class org.apache.crunch.lib.join.FullOuterJoinFn
Called during the cleanup of the MapReduce job this DoFn is associated with.
cleanup(Emitter<Pair<K, Pair<U, V>>>) - Method in class org.apache.crunch.lib.join.LeftOuterJoinFn
Called during the cleanup of the MapReduce job this DoFn is associated with.
clearCounters() - Static method in class org.apache.crunch.impl.mem.MemPipeline
 
clearCounters() - Static method in class org.apache.crunch.test.TestCounters
 
clearWritten(int) - Method in class org.apache.crunch.types.writable.TupleWritable
Record that the tuple does not contain an element at the position provided.
clearWritten() - Method in class org.apache.crunch.types.writable.TupleWritable
Clear any record of which writables have been written to, without releasing storage.
close() - Method in class org.apache.crunch.io.CrunchOutputs
 
Cogroup - Class in org.apache.crunch.lib
 
Cogroup() - Constructor for class org.apache.crunch.lib.Cogroup
 
cogroup(PTable<K, U>, PTable<K, V>) - Static method in class org.apache.crunch.lib.Cogroup
Co-groups the two PTable arguments.
cogroup(int, PTable<K, U>, PTable<K, V>) - Static method in class org.apache.crunch.lib.Cogroup
Co-groups the two PTable arguments with a user-specified degree of parallelism (a.k.a. the number of reducers).
cogroup(PTable<K, V1>, PTable<K, V2>, PTable<K, V3>) - Static method in class org.apache.crunch.lib.Cogroup
Co-groups the three PTable arguments.
cogroup(int, PTable<K, V1>, PTable<K, V2>, PTable<K, V3>) - Static method in class org.apache.crunch.lib.Cogroup
Co-groups the three PTable arguments with a user-specified degree of parallelism (a.k.a. the number of reducers).
cogroup(PTable<K, V1>, PTable<K, V2>, PTable<K, V3>, PTable<K, V4>) - Static method in class org.apache.crunch.lib.Cogroup
Co-groups the four PTable arguments.
cogroup(int, PTable<K, V1>, PTable<K, V2>, PTable<K, V3>, PTable<K, V4>) - Static method in class org.apache.crunch.lib.Cogroup
Co-groups the four PTable arguments with a user-specified degree of parallelism (a.k.a. the number of reducers).
cogroup(PTable<K, ?>, PTable<K, ?>...) - Static method in class org.apache.crunch.lib.Cogroup
Co-groups an arbitrary number of PTable arguments.
cogroup(int, PTable<K, ?>, PTable<K, ?>...) - Static method in class org.apache.crunch.lib.Cogroup
Co-groups an arbitrary number of PTable arguments with a user-specified degree of parallelism (a.k.a. the number of reducers). The largest table should come last in the ordering.
cogroup(PTable<K, U>) - Method in interface org.apache.crunch.PTable
Co-group operation with the given table on common keys.
CollectionDeepCopier<T> - Class in org.apache.crunch.types
Performs deep copies (based on underlying PType deep copying) of Collections.
CollectionDeepCopier(PType<T>) - Constructor for class org.apache.crunch.types.CollectionDeepCopier
 
collectionOf(T...) - Static method in class org.apache.crunch.impl.mem.MemPipeline
 
collectionOf(Iterable<T>) - Static method in class org.apache.crunch.impl.mem.MemPipeline
 
collections(PType<T>) - Static method in class org.apache.crunch.types.avro.Avros
 
collections(PType<T>) - Method in class org.apache.crunch.types.avro.AvroTypeFamily
 
collections(PType<T>) - Method in interface org.apache.crunch.types.PTypeFamily
 
collections(PType<T>) - Static method in class org.apache.crunch.types.writable.Writables
 
collections(PType<T>) - Method in class org.apache.crunch.types.writable.WritableTypeFamily
 
collectValues(PTable<K, V>) - Static method in class org.apache.crunch.lib.Aggregate
 
collectValues() - Method in interface org.apache.crunch.PTable
Aggregate all of the values with the same key into a single key-value pair in the returned PTable.
column() - Method in class org.apache.crunch.lib.Sort.ColumnOrder
 
CombineFn<S,T> - Class in org.apache.crunch
A special DoFn implementation that converts an Iterable of values into a single value.
CombineFn() - Constructor for class org.apache.crunch.CombineFn
 
combineValues(CombineFn<K, V>) - Method in interface org.apache.crunch.PGroupedTable
Combines the values of this grouping using the given CombineFn.
combineValues(CombineFn<K, V>, CombineFn<K, V>) - Method in interface org.apache.crunch.PGroupedTable
Combines and reduces the values of this grouping using the given CombineFn instances.
combineValues(Aggregator<V>) - Method in interface org.apache.crunch.PGroupedTable
Combine the values in each group using the given Aggregator.
combineValues(Aggregator<V>, Aggregator<V>) - Method in interface org.apache.crunch.PGroupedTable
Combines and reduces the values in each group using the given Aggregator instances.
comm(PCollection<T>, PCollection<T>) - Static method in class org.apache.crunch.lib.Set
Find the elements that are common to two sets, like the Unix comm utility.
compare(Pair<K, V>, Pair<K, V>) - Method in class org.apache.crunch.lib.Aggregate.PairValueComparator
 
compare(AvroWrapper<T>, AvroWrapper<T>) - Method in class org.apache.crunch.lib.join.JoinUtils.AvroPairGroupingComparator
 
compare(byte[], int, int, byte[], int, int) - Method in class org.apache.crunch.lib.join.JoinUtils.AvroPairGroupingComparator
 
compare(TupleWritable, TupleWritable) - Method in class org.apache.crunch.lib.join.JoinUtils.TupleWritableComparator
 
compare(byte[], int, int, byte[], int, int) - Method in class org.apache.crunch.lib.join.JoinUtils.TupleWritableComparator
 
compare(AvroKey<T>, AvroKey<T>) - Method in class org.apache.crunch.lib.sort.ReverseAvroComparator
 
compare(byte[], int, int, byte[], int, int) - Method in class org.apache.crunch.lib.sort.ReverseAvroComparator
 
compare(byte[], int, int, byte[], int, int) - Method in class org.apache.crunch.lib.sort.ReverseWritableComparator
 
compare(T, T) - Method in class org.apache.crunch.lib.sort.ReverseWritableComparator
 
compare(WritableComparable, WritableComparable) - Method in class org.apache.crunch.lib.sort.TupleWritableComparator
 
compareTo(Pair<K, V>) - Method in class org.apache.crunch.Pair
 
compareTo(TupleWritable) - Method in class org.apache.crunch.types.writable.TupleWritable
 
CompositeMapFn<R,S,T> - Class in org.apache.crunch.fn
 
CompositeMapFn(MapFn<R, S>, MapFn<S, T>) - Constructor for class org.apache.crunch.fn.CompositeMapFn
 
CompositePathIterable<T> - Class in org.apache.crunch.io
 
conf(String, String) - Method in class org.apache.crunch.GroupingOptions.Builder
 
conf(String, String) - Method in class org.apache.crunch.ParallelDoOptions.Builder
Specifies key-value pairs that should be added to the Configuration object associated with the Job that includes these options.
conf(String, String) - Method in interface org.apache.crunch.SourceTarget
Adds the given key-value pair to the Configuration instance(s) that are used to read and write this SourceTarget<T>.
configure(Configuration) - Method in class org.apache.crunch.DoFn
Configure this DoFn.
configure(Configuration) - Method in class org.apache.crunch.fn.CompositeMapFn
 
configure(Configuration) - Method in class org.apache.crunch.fn.PairMapFn
 
configure(Job) - Method in class org.apache.crunch.GroupingOptions
 
configure(Configuration) - Method in class org.apache.crunch.io.FormatBundle
 
configure(Target, PType<?>) - Method in interface org.apache.crunch.io.OutputHandler
 
configure(Configuration) - Method in class org.apache.crunch.ParallelDoOptions
Applies the key-value pairs that were associated with this instance to the given Configuration object.
configure(Configuration) - Method in interface org.apache.crunch.ReadableData
Allows this instance to specify any additional configuration settings that may be needed by the job that it is launched in.
configure(Configuration) - Method in class org.apache.crunch.types.avro.AvroUtf8InputFormat
 
configure(Configuration) - Method in class org.apache.crunch.types.PGroupedTableType.PairIterableMapFn
 
configureForMapReduce(Job, PType<?>, Path, String) - Method in interface org.apache.crunch.io.MapReduceTarget
 
configureOrdering(Configuration, Sort.Order...) - Static method in class org.apache.crunch.lib.sort.TupleWritableComparator
 
configureOrdering(Configuration, Sort.ColumnOrder...) - Static method in class org.apache.crunch.lib.sort.TupleWritableComparator
 
configureReflectDataFactory(Configuration) - Static method in class org.apache.crunch.types.avro.Avros
 
configureShuffle(Job, GroupingOptions) - Method in class org.apache.crunch.types.PGroupedTableType
 
configureSource(Job, int) - Method in interface org.apache.crunch.Source
Configure the given job to use this source as an input.
containers(Class<T>) - Static method in class org.apache.crunch.types.avro.Avros
 
containers(Class<T>) - Method in class org.apache.crunch.types.avro.AvroTypeFamily
 
convert(PType<T>, PTypeFamily) - Static method in class org.apache.crunch.types.PTypeUtils
 
Converter<K,V,S,T> - Interface in org.apache.crunch.types
Converts the input key/value from a MapReduce task into the input to a DoFn, or takes the output of a DoFn and writes it to the output key/values.
convertInput(K, V) - Method in interface org.apache.crunch.types.Converter
 
convertIterableInput(K, Iterable<V>) - Method in interface org.apache.crunch.types.Converter
 
copyResourceFile(String) - Method in class org.apache.crunch.test.TemporaryPath
Copy a classpath resource to File.
copyResourceFileName(String) - Method in class org.apache.crunch.test.TemporaryPath
Copy a classpath resource returning its absolute file name.
copyResourcePath(String) - Method in class org.apache.crunch.test.TemporaryPath
Copy a classpath resource to a Path.
count(PCollection<S>) - Static method in class org.apache.crunch.lib.Aggregate
Returns a PTable that contains the unique elements of this collection mapped to a count of their occurrences.
count(PCollection<S>, int) - Static method in class org.apache.crunch.lib.Aggregate
Returns a PTable that contains the unique elements of this collection mapped to a count of their occurrences.
count() - Method in interface org.apache.crunch.PCollection
Returns a PTable instance that contains the counts of each unique element of this PCollection.
create(String) - Method in class org.apache.crunch.contrib.text.TokenizerFactory
Return a Scanner instance that wraps the input string and uses the delimiter, skip, and locale settings for this TokenizerFactory instance.
create(FileSystem, Path, FileReaderFactory<S>) - Static method in class org.apache.crunch.io.CompositePathIterable
 
create(Class<T>, Class...) - Static method in class org.apache.crunch.types.TupleFactory
 
createFilter(Path, BloomFilterFn<String>) - Static method in class org.apache.crunch.contrib.bloomfilter.BloomFilterFactory
Takes an input path and generates BloomFilters for all text files in that path.
createFilter(PCollection<T>, BloomFilterFn<T>) - Static method in class org.apache.crunch.contrib.bloomfilter.BloomFilterFactory
 
createIntermediateOutput(PType<T>) - Method in class org.apache.crunch.impl.mr.MRPipeline
 
createOrderedTupleSchema(PType<S>, Sort.ColumnOrder[]) - Static method in class org.apache.crunch.lib.sort.SortFns
Constructs an Avro schema for the given PType<S> that respects the given column orderings.
createPut(PTable<String, String>) - Method in class org.apache.crunch.examples.WordAggregationHBase
Creates puts in order to insert them into HBase.
createRecordReader(InputSplit, TaskAttemptContext) - Method in class org.apache.crunch.types.avro.AvroInputFormat
 
createRecordReader(InputSplit, TaskAttemptContext) - Method in class org.apache.crunch.types.avro.AvroUtf8InputFormat
 
createTempPath() - Method in class org.apache.crunch.impl.mr.MRPipeline
 
cross(PTable<K1, U>, PTable<K2, V>) - Static method in class org.apache.crunch.lib.Cartesian
Performs a full cross join on the specified PTables (using the same strategy as Pig's CROSS operator).
cross(PTable<K1, U>, PTable<K2, V>, int) - Static method in class org.apache.crunch.lib.Cartesian
Performs a full cross join on the specified PTables (using the same strategy as Pig's CROSS operator).
cross(PCollection<U>, PCollection<V>) - Static method in class org.apache.crunch.lib.Cartesian
Performs a full cross join on the specified PCollections (using the same strategy as Pig's CROSS operator).
cross(PCollection<U>, PCollection<V>, int) - Static method in class org.apache.crunch.lib.Cartesian
Performs a full cross join on the specified PCollections (using the same strategy as Pig's CROSS operator).
CRUNCH_FILTER_NAME - Static variable in class org.apache.crunch.contrib.bloomfilter.BloomFilterFn
 
CRUNCH_FILTER_SIZE - Static variable in class org.apache.crunch.contrib.bloomfilter.BloomFilterFn
 
CRUNCH_INPUTS - Static variable in class org.apache.crunch.io.CrunchInputs
 
CRUNCH_OUTPUTS - Static variable in class org.apache.crunch.io.CrunchOutputs
 
CrunchInputs - Class in org.apache.crunch.io
Helper functions for configuring multiple InputFormat instances within a single Crunch MapReduce job.
CrunchInputs() - Constructor for class org.apache.crunch.io.CrunchInputs
 
CrunchOutputs<K,V> - Class in org.apache.crunch.io
An analogue of CrunchInputs for handling multiple OutputFormat instances writing to multiple files within a single MapReduce job.
CrunchOutputs(TaskInputOutputContext<?, ?, K, V>) - Constructor for class org.apache.crunch.io.CrunchOutputs
Creates and initializes multiple outputs support. It should be instantiated in the Mapper/Reducer setup method.
CrunchRuntimeException - Exception in org.apache.crunch
A RuntimeException implementation that includes some additional options for the Crunch execution engine to track reporting status.
CrunchRuntimeException(String) - Constructor for exception org.apache.crunch.CrunchRuntimeException
 
CrunchRuntimeException(Exception) - Constructor for exception org.apache.crunch.CrunchRuntimeException
 
CrunchRuntimeException(String, Exception) - Constructor for exception org.apache.crunch.CrunchRuntimeException
 
CrunchTestSupport - Class in org.apache.crunch.test
A temporary workaround for Scala tests to use when working with Rule annotations until it gets fixed in JUnit 4.11.
CrunchTestSupport() - Constructor for class org.apache.crunch.test.CrunchTestSupport
 
CrunchTool - Class in org.apache.crunch.util
An extension of the Tool interface that creates a Pipeline instance and provides methods for working with the Pipeline from inside of the Tool's run method.
CrunchTool() - Constructor for class org.apache.crunch.util.CrunchTool
 
CrunchTool(boolean) - Constructor for class org.apache.crunch.util.CrunchTool
 
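The two-table cogroup entries above group the values of both tables by their common keys. The plain-Java sketch below illustrates that grouping on in-memory lists. The class CogroupSketch and its Grouped holder are hypothetical names, java.util collections stand in for PTable, and the sorted TreeMap is only a convenience for deterministic output, not a claim about Crunch's ordering.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration of two-way co-grouping; not Crunch code.
final class CogroupSketch {

    // Holds the values from each side that share one key.
    static final class Grouped<U, V> {
        final List<U> left = new ArrayList<>();
        final List<V> right = new ArrayList<>();
    }

    // For every key that appears in either input, collect all left-side
    // values and all right-side values for that key. A key present on only
    // one side gets an empty list for the other side.
    static <K extends Comparable<K>, U, V> Map<K, Grouped<U, V>> cogroup(
            List<Map.Entry<K, U>> left, List<Map.Entry<K, V>> right) {
        Map<K, Grouped<U, V>> out = new TreeMap<>();
        for (Map.Entry<K, U> e : left) {
            out.computeIfAbsent(e.getKey(), k -> new Grouped<U, V>())
               .left.add(e.getValue());
        }
        for (Map.Entry<K, V> e : right) {
            out.computeIfAbsent(e.getKey(), k -> new Grouped<U, V>())
               .right.add(e.getValue());
        }
        return out;
    }
}
```

Unlike an inner join, a co-group keeps keys that occur in only one input, which is why it can serve as a building block for outer joins and set operations.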

D

DataBaseSource<T extends org.apache.hadoop.mapreduce.lib.db.DBWritable & org.apache.hadoop.io.Writable> - Class in org.apache.crunch.contrib.io.jdbc
Source for reading from a database via a JDBC connection.
DebugLogging - Class in org.apache.crunch.test
Allows direct manipulation of the Hadoop log4j settings to aid in unit testing.
DeepCopier<T> - Interface in org.apache.crunch.types
Performs deep copies of values.
DeepCopier.NoOpDeepCopier<V> - Class in org.apache.crunch.types
 
DeepCopier.NoOpDeepCopier() - Constructor for class org.apache.crunch.types.DeepCopier.NoOpDeepCopier
 
deepCopy(Collection<T>) - Method in class org.apache.crunch.types.CollectionDeepCopier
 
deepCopy(T) - Method in interface org.apache.crunch.types.DeepCopier
Create a deep copy of a value.
deepCopy(V) - Method in class org.apache.crunch.types.DeepCopier.NoOpDeepCopier
 
deepCopy(Map<String, T>) - Method in class org.apache.crunch.types.MapDeepCopier
 
deepCopy(T) - Method in class org.apache.crunch.types.TupleDeepCopier
 
deepCopy(T) - Method in class org.apache.crunch.types.writable.WritableDeepCopier
 
DEFAULT_BYTES_PER_REDUCE_TASK - Static variable in class org.apache.crunch.util.PartitionUtils
 
DEFAULT_MAX_REDUCERS - Static variable in class org.apache.crunch.util.PartitionUtils
 
DEFAULT_PATH - Static variable in class org.apache.crunch.lib.sort.TotalOrderPartitioner
 
DefaultJoinStrategy<K,U,V> - Class in org.apache.crunch.lib.join
Default join strategy that simply sends all data through the map, shuffle, and reduce phases.
DefaultJoinStrategy() - Constructor for class org.apache.crunch.lib.join.DefaultJoinStrategy
 
DefaultJoinStrategy(int) - Constructor for class org.apache.crunch.lib.join.DefaultJoinStrategy
 
delimiter(String) - Method in class org.apache.crunch.contrib.text.TokenizerFactory.Builder
Sets the delimiter used by the TokenizerFactory instances constructed by this instance.
derived(PType<V1>, PType<V2>, PType<V3>) - Static method in class org.apache.crunch.Tuple3.Collect
 
derived(PType<V1>, PType<V2>, PType<V3>, PType<V4>) - Static method in class org.apache.crunch.Tuple4.Collect
 
derived(Class<T>, MapFn<S, T>, MapFn<T, S>, PType<S>) - Static method in class org.apache.crunch.types.avro.Avros
 
derived(Class<T>, MapFn<S, T>, MapFn<T, S>, PType<S>) - Method in class org.apache.crunch.types.avro.AvroTypeFamily
 
derived(Class<T>, MapFn<S, T>, MapFn<T, S>, PType<S>) - Method in interface org.apache.crunch.types.PTypeFamily
 
derived(Class<T>, MapFn<S, T>, MapFn<T, S>, PType<S>) - Static method in class org.apache.crunch.types.writable.Writables
 
derived(Class<T>, MapFn<S, T>, MapFn<T, S>, PType<S>) - Method in class org.apache.crunch.types.writable.WritableTypeFamily
 
difference(PCollection<T>, PCollection<T>) - Static method in class org.apache.crunch.lib.Set
Compute the set difference between two sets of elements.
disableDeepCopy() - Method in class org.apache.crunch.DoFn
By default, Crunch will do a defensive deep copy of the outputs of a DoFn when there are multiple downstream consumers of that item, in order to prevent the downstream functions from making concurrent modifications to data objects.
DistCache - Class in org.apache.crunch.util
Provides functions for working with Hadoop's distributed cache.
DistCache() - Constructor for class org.apache.crunch.util.DistCache
 
Distinct - Class in org.apache.crunch.lib
Functions for computing the distinct elements of a PCollection.
distinct(PCollection<S>) - Static method in class org.apache.crunch.lib.Distinct
Construct a new PCollection that contains the unique elements of a given input PCollection.
distinct(PTable<K, V>) - Static method in class org.apache.crunch.lib.Distinct
A PTable<K, V> analogue of the distinct function.
distinct(PCollection<S>, int) - Static method in class org.apache.crunch.lib.Distinct
A distinct operation that gives the client more control over how frequently elements are flushed to disk in order to allow control over performance or memory consumption.
distinct(PTable<K, V>, int) - Static method in class org.apache.crunch.lib.Distinct
A PTable<K, V> analogue of the distinct function.
DoFn<S,T> - Class in org.apache.crunch
Base class for all data processing functions in Crunch.
DoFn() - Constructor for class org.apache.crunch.DoFn
 
done() - Method in class org.apache.crunch.impl.mem.MemPipeline
 
done() - Method in class org.apache.crunch.impl.mr.MRPipeline
 
done() - Method in interface org.apache.crunch.Pipeline
Run any remaining jobs required to generate outputs and then clean up any intermediate data files that were created in this run or previous calls to run.
done() - Method in class org.apache.crunch.util.CrunchTool
 
doubles() - Static method in class org.apache.crunch.types.avro.Avros
 
doubles() - Method in class org.apache.crunch.types.avro.AvroTypeFamily
 
doubles() - Method in interface org.apache.crunch.types.PTypeFamily
 
doubles() - Static method in class org.apache.crunch.types.writable.Writables
 
doubles() - Method in class org.apache.crunch.types.writable.WritableTypeFamily
 
drop(Integer...) - Method in class org.apache.crunch.contrib.text.TokenizerFactory.Builder
Drop the specified fields found by the input scanner, counting from zero.
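
The distinct(PCollection<S>, int) entry above mentions control over how often elements are flushed. The idea can be illustrated in plain Java without any Crunch classes. This is only a sketch under the assumption that the first pass keeps a bounded in-memory set and emits it whenever it grows past the threshold; the class and method names (DistinctSketch, preDistinct) are hypothetical and are not part of the Crunch API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class DistinctSketch {

    // First pass: buffer elements in a set and emit (flush) the buffer
    // whenever it grows past flushEvery. Duplicates that fall into
    // different flushes survive this pass.
    static <T> List<T> preDistinct(Iterable<T> input, int flushEvery) {
        List<T> emitted = new ArrayList<>();
        Set<T> buffer = new HashSet<>();
        for (T item : input) {
            buffer.add(item);
            if (buffer.size() > flushEvery) {
                emitted.addAll(buffer);
                buffer.clear();
            }
        }
        emitted.addAll(buffer); // final flush of whatever is left
        return emitted;
    }

    // Second pass: remove the duplicates that survived the first pass.
    static <T> Set<T> distinct(Iterable<T> input, int flushEvery) {
        return new LinkedHashSet<>(preDistinct(input, flushEvery));
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("a", "b", "a", "c", "b", "a");
        System.out.println(distinct(words, 2));
    }
}
```

A smaller threshold keeps less in memory but lets more duplicates through to the second pass; a larger one does the opposite.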

E

emit(T) - Method in interface org.apache.crunch.Emitter
Write the emitted value to the next stage of the pipeline.
Emitter<T> - Interface in org.apache.crunch
Interface for writing outputs from a DoFn.
EMPTY - Static variable in class org.apache.crunch.PipelineResult
 
enable(Level) - Static method in class org.apache.crunch.test.DebugLogging
Enables logging Hadoop output to the console using the pattern '%-4r [%t] %-5p %c %x - %m%n' at the specified Level.
enable(Level, Appender) - Static method in class org.apache.crunch.test.DebugLogging
Enables logging to the given Appender at the specified Level.
enableDebug() - Method in class org.apache.crunch.impl.mem.MemPipeline
 
enableDebug() - Method in class org.apache.crunch.impl.mr.MRPipeline
 
enableDebug() - Method in interface org.apache.crunch.Pipeline
Turn on debug logging for jobs that are run from this pipeline.
enableDebug() - Method in class org.apache.crunch.util.CrunchTool
 
enums(Class<T>, PTypeFamily) - Static method in class org.apache.crunch.types.PTypes
 
equals(Object) - Method in class org.apache.crunch.io.FormatBundle
 
equals(Object) - Method in class org.apache.crunch.Pair
 
equals(Object) - Method in class org.apache.crunch.Tuple3
 
equals(Object) - Method in class org.apache.crunch.Tuple4
 
equals(Object) - Method in class org.apache.crunch.TupleN
 
equals(Object) - Method in class org.apache.crunch.types.avro.AvroType
 
equals(Object) - Method in class org.apache.crunch.types.writable.TupleWritable
 
equals(Object) - Method in class org.apache.crunch.types.writable.WritableType
 
errorOnLastRecord() - Method in class org.apache.crunch.contrib.text.AbstractCompositeExtractor
 
errorOnLastRecord() - Method in class org.apache.crunch.contrib.text.AbstractSimpleExtractor
 
errorOnLastRecord() - Method in interface org.apache.crunch.contrib.text.Extractor
Returns true if the last call to extract on this instance threw an exception that was handled.
extract(String) - Method in class org.apache.crunch.contrib.text.AbstractCompositeExtractor
 
extract(String) - Method in class org.apache.crunch.contrib.text.AbstractSimpleExtractor
 
extract(String) - Method in interface org.apache.crunch.contrib.text.Extractor
Extract a value with the type of this instance.
extractKey(String) - Static method in class org.apache.crunch.types.Protos
 
ExtractKeyFn<K,V> - Class in org.apache.crunch.fn
Wrapper function that uses a MapFn to extract a key from each value, producing the key-value pairs needed to convert a PCollection<V> into a PTable<K, V>.
ExtractKeyFn(MapFn<V, K>) - Constructor for class org.apache.crunch.fn.ExtractKeyFn
 
Extractor<T> - Interface in org.apache.crunch.contrib.text
An interface for extracting a specific data type from a text string that is being processed by a Scanner object.
Extractors - Class in org.apache.crunch.contrib.text
Factory methods for constructing common Extractor types.
Extractors() - Constructor for class org.apache.crunch.contrib.text.Extractors
 
ExtractorStats - Class in org.apache.crunch.contrib.text
Records the number and kind of errors that an Extractor encountered when parsing input data.
ExtractorStats(int) - Constructor for class org.apache.crunch.contrib.text.ExtractorStats
 
ExtractorStats(int, List<Integer>) - Constructor for class org.apache.crunch.contrib.text.ExtractorStats
 
extractText(PTable<ImmutableBytesWritable, Result>) - Method in class org.apache.crunch.examples.WordAggregationHBase
Extract information from HBase.
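
The Extractor entries above (extract, errorOnLastRecord, getDefaultValue, getStats) describe a parse-with-fallback contract. The following plain-Java sketch shows that contract for integers. It does not implement the Crunch Extractor interface; the class IntExtractorSketch and its getErrorCount method are hypothetical stand-ins used only for illustration.

```java
public class IntExtractorSketch {
    private final int defaultValue;
    private boolean errorOnLast = false;
    private int errorCount = 0;

    public IntExtractorSketch(int defaultValue) {
        this.defaultValue = defaultValue;
    }

    // Parse the input; on failure, record the error and fall back
    // to the default value instead of propagating the exception.
    public int extract(String input) {
        errorOnLast = false;
        try {
            return Integer.parseInt(input.trim());
        } catch (NumberFormatException | NullPointerException e) {
            errorOnLast = true;
            errorCount++;
            return defaultValue;
        }
    }

    // True if the most recent extract call hit a handled error.
    public boolean errorOnLastRecord() {
        return errorOnLast;
    }

    public int getDefaultValue() {
        return defaultValue;
    }

    // Total number of inputs that failed to parse so far.
    public int getErrorCount() {
        return errorCount;
    }

    public static void main(String[] args) {
        IntExtractorSketch extractor = new IntExtractorSketch(-1);
        System.out.println(extractor.extract("42"));
        System.out.println(extractor.extract("not a number"));
        System.out.println(extractor.errorOnLastRecord());
    }
}
```

Resetting the error flag at the start of each call is what lets a caller distinguish a genuine default-valued input from a failed parse.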

F

FileNamingScheme - Interface in org.apache.crunch.io
Encapsulates rules for naming output files.
FileReaderFactory<T> - Interface in org.apache.crunch.io
 
filter(FilterFn<S>) - Method in interface org.apache.crunch.PCollection
Apply the given filter function to this instance and return the resulting PCollection.
filter(String, FilterFn<S>) - Method in interface org.apache.crunch.PCollection
Apply the given filter function to this instance and return the resulting PCollection.
filter(FilterFn<Pair<K, V>>) - Method in interface org.apache.crunch.PTable
Apply the given filter function to this instance and return the resulting PTable.
filter(String, FilterFn<Pair<K, V>>) - Method in interface org.apache.crunch.PTable
Apply the given filter function to this instance and return the resulting PTable.
FilterFn<T> - Class in org.apache.crunch
A DoFn for the common case of filtering the members of a PCollection based on a boolean condition.
FilterFn() - Constructor for class org.apache.crunch.FilterFn
 
FilterFns - Class in org.apache.crunch.fn
A collection of pre-defined FilterFn implementations.
findContainingJar(Class<?>) - Static method in class org.apache.crunch.util.DistCache
Finds the path to a jar that contains the class provided, if any.
findCounter(Enum<?>) - Method in class org.apache.crunch.PipelineResult.StageResult
Deprecated. The Counter class changed incompatibly between Hadoop 1 and 2 (from a class to an interface) so user programs should avoid this method and use PipelineResult.StageResult.getCounterValue(Enum) and/or PipelineResult.StageResult.getCounterDisplayName(Enum).
first() - Method in class org.apache.crunch.Pair
 
first() - Method in class org.apache.crunch.Tuple3
 
first() - Method in class org.apache.crunch.Tuple4
 
FIRST_N(int) - Static method in class org.apache.crunch.fn.Aggregators
Return the first n values (or fewer if there are fewer values than n).
floats() - Static method in class org.apache.crunch.types.avro.Avros
 
floats() - Method in class org.apache.crunch.types.avro.AvroTypeFamily
 
floats() - Method in interface org.apache.crunch.types.PTypeFamily
 
floats() - Static method in class org.apache.crunch.types.writable.Writables
 
floats() - Method in class org.apache.crunch.types.writable.WritableTypeFamily
 
flush() - Method in interface org.apache.crunch.Emitter
Flushes any values cached by this emitter.
forInput(Class<T>) - Static method in class org.apache.crunch.io.FormatBundle
 
FormatBundle<K> - Class in org.apache.crunch.io
A combination of an InputFormat or OutputFormat and any extra configuration information that format class needs to run.
FormatBundle() - Constructor for class org.apache.crunch.io.FormatBundle
 
formattedFile(String, Class<? extends FileInputFormat<K, V>>, Class<K>, Class<V>) - Static method in class org.apache.crunch.io.From
Creates a TableSource<K, V> for reading data from files that have custom FileInputFormat<K, V> implementations not covered by the provided TableSource and Source factory methods.
formattedFile(Path, Class<? extends FileInputFormat<K, V>>, Class<K>, Class<V>) - Static method in class org.apache.crunch.io.From
Creates a TableSource<K, V> for reading data from files that have custom FileInputFormat<K, V> implementations not covered by the provided TableSource and Source factory methods.
formattedFile(String, Class<? extends FileInputFormat<?, ?>>, PType<K>, PType<V>) - Static method in class org.apache.crunch.io.From
Creates a TableSource<K, V> for reading data from files that have custom FileInputFormat implementations not covered by the provided TableSource and Source factory methods.
formattedFile(Path, Class<? extends FileInputFormat<?, ?>>, PType<K>, PType<V>) - Static method in class org.apache.crunch.io.From
Creates a TableSource<K, V> for reading data from files that have custom FileInputFormat implementations not covered by the provided TableSource and Source factory methods.
formattedFile(String, Class<? extends FileOutputFormat<K, V>>) - Static method in class org.apache.crunch.io.To
Creates a Target at the given path name that writes data to a custom FileOutputFormat.
formattedFile(Path, Class<? extends FileOutputFormat<K, V>>) - Static method in class org.apache.crunch.io.To
Creates a Target at the given Path that writes data to a custom FileOutputFormat.
forOutput(Class<T>) - Static method in class org.apache.crunch.io.FormatBundle
 
fourth() - Method in class org.apache.crunch.Tuple4
 
From - Class in org.apache.crunch.io
Static factory methods for creating common Source types.
From() - Constructor for class org.apache.crunch.io.From
 
fromSerialized(String, Class<T>) - Static method in class org.apache.crunch.io.FormatBundle
 
fullJoin(PTable<K, U>, PTable<K, V>) - Static method in class org.apache.crunch.lib.Join
Performs a full outer join on the specified PTables.
FullOuterJoinFn<K,U,V> - Class in org.apache.crunch.lib.join
Used to perform the last step of a full outer join.
FullOuterJoinFn(PType<K>, PType<U>) - Constructor for class org.apache.crunch.lib.join.FullOuterJoinFn
 

G

generateKeys(S) - Method in class org.apache.crunch.contrib.bloomfilter.BloomFilterFn
 
generics(Schema) - Static method in class org.apache.crunch.types.avro.Avros
 
generics(Schema) - Method in class org.apache.crunch.types.avro.AvroTypeFamily
 
get(int) - Method in class org.apache.crunch.Pair
 
get(int) - Method in interface org.apache.crunch.Tuple
Returns the Object at the given index.
get(int) - Method in class org.apache.crunch.Tuple3
 
get(int) - Method in class org.apache.crunch.Tuple4
 
get(int) - Method in class org.apache.crunch.TupleN
 
get(int) - Method in class org.apache.crunch.types.writable.TupleWritable
Get the ith Writable from the tuple.
getByFn() - Method in class org.apache.crunch.lib.sort.SortFns.KeyExtraction
 
getConf() - Method in class org.apache.crunch.io.FormatBundle
 
getConf() - Method in class org.apache.crunch.lib.sort.TotalOrderPartitioner
 
getConf() - Method in class org.apache.crunch.lib.sort.TupleWritableComparator
 
getConf() - Method in class org.apache.crunch.util.CrunchTool
 
getConfiguration() - Method in class org.apache.crunch.impl.mem.MemPipeline
 
getConfiguration() - Method in class org.apache.crunch.impl.mr.MRPipeline
 
getConfiguration() - Method in interface org.apache.crunch.Pipeline
Returns the Configuration instance associated with this pipeline.
getConverter() - Method in interface org.apache.crunch.Source
Returns the Converter used for mapping the inputs from this instance into PCollection or PTable values.
getConverter(PType<?>) - Method in interface org.apache.crunch.Target
Returns the Converter to use for mapping from the output PCollection into the output values expected by this instance.
getConverter() - Method in class org.apache.crunch.types.avro.AvroType
 
getConverter() - Method in class org.apache.crunch.types.PGroupedTableType
 
getConverter() - Method in interface org.apache.crunch.types.PType
 
getConverter() - Method in class org.apache.crunch.types.writable.WritableType
 
getCounter(Enum<?>) - Static method in class org.apache.crunch.test.TestCounters
 
getCounter(String, String) - Static method in class org.apache.crunch.test.TestCounters
 
getCounterDisplayName(String, String) - Method in class org.apache.crunch.PipelineResult.StageResult
 
getCounterDisplayName(Enum<?>) - Method in class org.apache.crunch.PipelineResult.StageResult
 
getCounterNames() - Method in class org.apache.crunch.PipelineResult.StageResult
 
getCounters() - Static method in class org.apache.crunch.impl.mem.MemPipeline
 
getCounters() - Method in class org.apache.crunch.PipelineResult.StageResult
Deprecated. The Counter class changed incompatibly between Hadoop 1 and 2 (from a class to an interface) so user programs should avoid this method and use PipelineResult.StageResult.getCounterNames().
getCounterValue(String, String) - Method in class org.apache.crunch.PipelineResult.StageResult
 
getCounterValue(Enum<?>) - Method in class org.apache.crunch.PipelineResult.StageResult
 
getDefaultConfiguration() - Method in class org.apache.crunch.test.TemporaryPath
 
getDefaultFileSource(Path) - Method in class org.apache.crunch.types.avro.AvroType
 
getDefaultFileSource(Path) - Method in class org.apache.crunch.types.PGroupedTableType
 
getDefaultFileSource(Path) - Method in interface org.apache.crunch.types.PType
Returns a SourceTarget that is able to read/write data using the serialization format specified by this PType.
getDefaultFileSource(Path) - Method in class org.apache.crunch.types.writable.WritableType
 
getDefaultInstance() - Static method in class org.apache.crunch.contrib.text.TokenizerFactory
Returns a default TokenizerFactory that uses whitespace as a delimiter and does not skip any input fields.
getDefaultInstance(Class<M>) - Static method in class org.apache.crunch.types.Protos
Utility function for creating a default PB Message from a Class object that works with both protoc 2.3.0 and 2.4.x.
getDefaultValue() - Method in class org.apache.crunch.contrib.text.AbstractSimpleExtractor
 
getDefaultValue() - Method in interface org.apache.crunch.contrib.text.Extractor
Returns the default value for this Extractor in case of an error.
getDependentJobs() - Method in interface org.apache.crunch.impl.mr.MRJob
 
getDetachedValue(PTableType<K, V>, Pair<K, V>) - Static method in class org.apache.crunch.lib.PTables
Create a detached value for a table Pair.
getDetachedValue(T) - Method in class org.apache.crunch.types.avro.AvroType
 
getDetachedValue(T) - Method in interface org.apache.crunch.types.PType
Returns a copy of a value (or the value itself) that can safely be retained.
getDetachedValue(T) - Method in class org.apache.crunch.types.writable.WritableType
 
getErrorCount() - Method in class org.apache.crunch.contrib.text.ExtractorStats
The overall number of records that had some kind of parsing error.
getFamily() - Method in class org.apache.crunch.types.avro.AvroType
 
getFamily() - Method in class org.apache.crunch.types.PGroupedTableType
 
getFamily() - Method in interface org.apache.crunch.types.PType
Returns the PTypeFamily that this PType belongs to.
getFamily() - Method in class org.apache.crunch.types.writable.WritableType
 
getFieldErrors() - Method in class org.apache.crunch.contrib.text.ExtractorStats
Returns the number of errors that occurred when parsing the individual fields of a composite record type, like a Pair or TupleN.
getFile(String) - Method in class org.apache.crunch.test.TemporaryPath
Get a File below the temporary directory.
getFileName(String) - Method in class org.apache.crunch.test.TemporaryPath
Get an absolute file name below the temporary directory.
getFileNamingScheme() - Method in interface org.apache.crunch.io.PathTarget
Get the naming scheme to be used for outputs being written to an output path.
getFirst() - Method in class org.apache.crunch.fn.CompositeMapFn
 
getFormatClass() - Method in class org.apache.crunch.io.FormatBundle
 
getFormatNodeMap(JobContext) - Static method in class org.apache.crunch.io.CrunchInputs
 
getGroupedDetachedValue(PGroupedTableType<K, V>, Pair<K, Iterable<V>>) - Static method in class org.apache.crunch.lib.PTables
Create a detached value for a PGroupedTable value.
getGroupedTableType() - Method in interface org.apache.crunch.PGroupedTable
Return the PGroupedTableType containing serialization information for this PGroupedTable.
getGroupedTableType() - Method in interface org.apache.crunch.types.PTableType
Returns the grouped table version of this type.
getGroupingComparator(PTypeFamily) - Static method in class org.apache.crunch.lib.join.JoinUtils
 
getGroupingComparatorClass() - Method in class org.apache.crunch.GroupingOptions
 
getGroupingConverter() - Method in class org.apache.crunch.types.PGroupedTableType
 
getInputMapFn() - Method in class org.apache.crunch.types.avro.AvroType
 
getInputMapFn() - Method in interface org.apache.crunch.types.PType
 
getInputMapFn() - Method in class org.apache.crunch.types.writable.WritableType
 
getInstance() - Static method in class org.apache.crunch.fn.IdentityFn
 
getInstance() - Static method in class org.apache.crunch.impl.mem.MemPipeline
 
getInstance() - Static method in class org.apache.crunch.io.SequentialFileNamingScheme
 
getInstance() - Static method in class org.apache.crunch.types.avro.AvroTypeFamily
 
getInstance() - Static method in class org.apache.crunch.types.writable.WritableTypeFamily
 
getJob() - Method in interface org.apache.crunch.impl.mr.MRJob
 
getJobID() - Method in interface org.apache.crunch.impl.mr.MRJob
 
getJobs() - Method in interface org.apache.crunch.impl.mr.MRPipelineExecution
 
getJobState() - Method in interface org.apache.crunch.impl.mr.MRJob
 
getJoinType() - Method in class org.apache.crunch.lib.join.FullOuterJoinFn
 
getJoinType() - Method in class org.apache.crunch.lib.join.InnerJoinFn
 
getJoinType() - Method in class org.apache.crunch.lib.join.JoinFn
 
getJoinType() - Method in class org.apache.crunch.lib.join.LeftOuterJoinFn
 
getJoinType() - Method in class org.apache.crunch.lib.join.RightOuterJoinFn
 
getKeyClass() - Method in interface org.apache.crunch.types.Converter
 
getKeyType() - Method in class org.apache.crunch.lib.sort.SortFns.KeyExtraction
 
getKeyType() - Method in interface org.apache.crunch.PTable
Returns the PType of the key.
getKeyType() - Method in interface org.apache.crunch.types.PTableType
Returns the key type for the table.
getLastModifiedAt(Configuration) - Method in class org.apache.crunch.contrib.io.jdbc.DataBaseSource
 
getLastModifiedAt(FileSystem, Path) - Static method in class org.apache.crunch.io.SourceTargetHelper
 
getLastModifiedAt(Configuration) - Method in interface org.apache.crunch.Source
Returns the time (in milliseconds) that this Source was most recently modified (e.g., because an input file was edited or new files were added to a directory).
getMapOutputName(Configuration, Path) - Method in interface org.apache.crunch.io.FileNamingScheme
Get the output file name for a map task.
getMapOutputName(Configuration, Path) - Method in class org.apache.crunch.io.SequentialFileNamingScheme
 
getMaterializeSourceTarget(PCollection<T>) - Method in class org.apache.crunch.impl.mr.MRPipeline
Retrieve a ReadableSourceTarget that provides access to the contents of a PCollection.
getName() - Method in class org.apache.crunch.impl.mem.MemPipeline
 
getName() - Method in class org.apache.crunch.impl.mr.MRPipeline
 
getName() - Method in class org.apache.crunch.io.FormatBundle
 
getName() - Method in interface org.apache.crunch.PCollection
Returns a shorthand name for this PCollection.
getName() - Method in interface org.apache.crunch.Pipeline
Returns the name of this pipeline.
getNextAnonymousStageId() - Method in class org.apache.crunch.impl.mr.MRPipeline
 
getNumReducers() - Method in class org.apache.crunch.GroupingOptions
 
getNumShards(K) - Method in interface org.apache.crunch.lib.join.ShardedJoinStrategy.ShardingStrategy
Retrieve the number of shards over which the given key should be split.
getOutputMapFn() - Method in class org.apache.crunch.types.avro.AvroType
 
getOutputMapFn() - Method in interface org.apache.crunch.types.PType
 
getOutputMapFn() - Method in class org.apache.crunch.types.writable.WritableType
 
getPartition(AvroKey<K>, AvroValue<V>, int) - Method in class org.apache.crunch.lib.join.JoinUtils.AvroIndexedRecordPartitioner
 
getPartition(TupleWritable, Writable, int) - Method in class org.apache.crunch.lib.join.JoinUtils.TupleWritablePartitioner
 
getPartition(K, V, int) - Method in class org.apache.crunch.lib.sort.TotalOrderPartitioner
 
getPartitionerClass() - Method in class org.apache.crunch.GroupingOptions
 
getPartitionerClass(PTypeFamily) - Static method in class org.apache.crunch.lib.join.JoinUtils
 
getPartitionFile(Configuration) - Static method in class org.apache.crunch.lib.sort.TotalOrderPartitioner
 
getPath() - Method in interface org.apache.crunch.io.PathTarget
 
getPath(String) - Method in class org.apache.crunch.test.TemporaryPath
Get a Path below the temporary directory.
getPathSize(Configuration, Path) - Static method in class org.apache.crunch.io.SourceTargetHelper
 
getPathSize(FileSystem, Path) - Static method in class org.apache.crunch.io.SourceTargetHelper
 
getPathToCacheFile(Path, Configuration) - Static method in class org.apache.crunch.util.DistCache
 
getPipeline() - Method in interface org.apache.crunch.PCollection
Returns the Pipeline associated with this PCollection.
getPlanDotFile() - Method in interface org.apache.crunch.PipelineExecution
Returns the .dot file that allows a client to graph the Crunch execution plan for this pipeline.
getPrimitiveType(Class<T>) - Static method in class org.apache.crunch.types.avro.Avros
 
getPrimitiveType(Class<T>) - Static method in class org.apache.crunch.types.writable.Writables
 
getPTableType() - Method in interface org.apache.crunch.PTable
Returns the PTableType of this PTable.
getPType(PTypeFamily) - Method in interface org.apache.crunch.contrib.text.Extractor
Returns the PType associated with this data type for the given PTypeFamily.
getPType() - Method in interface org.apache.crunch.PCollection
Returns the PType of this PCollection.
getReader(Schema) - Method in class org.apache.crunch.types.avro.ReflectDataFactory
 
getRecommendedPartitions(PCollection<T>) - Static method in class org.apache.crunch.util.PartitionUtils
 
getRecommendedPartitions(PCollection<T>, Configuration) - Static method in class org.apache.crunch.util.PartitionUtils
 
getRecordWriter(TaskAttemptContext) - Method in class org.apache.crunch.types.avro.AvroOutputFormat
 
getRecordWriter(TaskAttemptContext) - Method in class org.apache.crunch.types.avro.AvroTextOutputFormat
 
getReduceOutputName(Configuration, Path, int) - Method in interface org.apache.crunch.io.FileNamingScheme
Get the output file name for a reduce task.
getReduceOutputName(Configuration, Path, int) - Method in class org.apache.crunch.io.SequentialFileNamingScheme
 
getReflectData() - Method in class org.apache.crunch.types.avro.ReflectDataFactory
 
getReflectDataFactory(Configuration) - Static method in class org.apache.crunch.types.avro.Avros
 
getResult() - Method in interface org.apache.crunch.PipelineExecution
Retrieve the result of a pipeline if it has been completed, otherwise null.
getRootFile() - Method in class org.apache.crunch.test.TemporaryPath
Get the root directory which will be deleted automatically.
getRootFileName() - Method in class org.apache.crunch.test.TemporaryPath
Get the root directory as an absolute file name.
getRootPath() - Method in class org.apache.crunch.test.TemporaryPath
Get the root directory as a Path.
getSchema() - Method in class org.apache.crunch.types.avro.AvroType
 
getSecond() - Method in class org.apache.crunch.fn.CompositeMapFn
 
getSerializationClass() - Method in class org.apache.crunch.types.writable.WritableType
 
getSize(Configuration) - Method in class org.apache.crunch.contrib.io.jdbc.DataBaseSource
 
getSize() - Method in interface org.apache.crunch.PCollection
Returns the size of the data represented by this PCollection in bytes.
getSize(Configuration) - Method in interface org.apache.crunch.Source
Returns the number of bytes in this Source.
getSortComparatorClass() - Method in class org.apache.crunch.GroupingOptions
 
getSourceTargets() - Method in class org.apache.crunch.GroupingOptions
 
getSourceTargets() - Method in class org.apache.crunch.ParallelDoOptions
 
getSourceTargets() - Method in interface org.apache.crunch.ReadableData
 
getStageId() - Method in class org.apache.crunch.PipelineResult.StageResult
 
getStageName() - Method in class org.apache.crunch.PipelineResult.StageResult
 
getStageResults() - Method in class org.apache.crunch.PipelineResult
 
getStats() - Method in class org.apache.crunch.contrib.text.AbstractCompositeExtractor
 
getStats() - Method in class org.apache.crunch.contrib.text.AbstractSimpleExtractor
 
getStats() - Method in interface org.apache.crunch.contrib.text.Extractor
Return statistics about how many errors this Extractor instance encountered while parsing input data.
getStatus() - Method in interface org.apache.crunch.PipelineExecution
 
getSubTypes() - Method in class org.apache.crunch.types.avro.AvroType
 
getSubTypes() - Method in class org.apache.crunch.types.PGroupedTableType
 
getSubTypes() - Method in interface org.apache.crunch.types.PType
Returns the sub-types that make up this PType if it is a composite instance, such as a tuple.
getSubTypes() - Method in class org.apache.crunch.types.writable.WritableType
 
getTableType() - Method in interface org.apache.crunch.TableSource
 
getTableType() - Method in class org.apache.crunch.types.PGroupedTableType
 
getTestContext(Configuration) - Static method in class org.apache.crunch.test.CrunchTestSupport
Creates a TaskInputOutputContext that can be used in unit tests.
getTupleFactory(Class<T>) - Static method in class org.apache.crunch.types.TupleFactory
Get the TupleFactory for a given Tuple implementation.
getType() - Method in interface org.apache.crunch.Source
Returns the PType for this source.
getTypeClass() - Method in class org.apache.crunch.types.avro.AvroType
 
getTypeClass() - Method in interface org.apache.crunch.types.PType
Returns the Java type represented by this PType.
getTypeClass() - Method in class org.apache.crunch.types.writable.WritableType
 
getTypeFamily() - Method in interface org.apache.crunch.PCollection
Returns the PTypeFamily of this PCollection.
getValue() - Method in interface org.apache.crunch.PObject
Gets the value associated with this PObject.
getValueClass() - Method in interface org.apache.crunch.types.Converter
 
getValueType() - Method in interface org.apache.crunch.PTable
Returns the PType of the value.
getValueType() - Method in interface org.apache.crunch.types.PTableType
Returns the value type for the table.
getWriter(Schema) - Method in class org.apache.crunch.types.avro.ReflectDataFactory
 
groupByKey() - Method in interface org.apache.crunch.PTable
Performs a grouping operation on the keys of this table.
groupByKey(int) - Method in interface org.apache.crunch.PTable
Performs a grouping operation on the keys of this table, using the given number of partitions.
groupByKey(GroupingOptions) - Method in interface org.apache.crunch.PTable
Performs a grouping operation on the keys of this table, using the additional GroupingOptions to control how the grouping is executed.
groupedWeightedReservoirSample(PTable<Integer, Pair<T, N>>, int[]) - Static method in class org.apache.crunch.lib.Sample
The most general-purpose of the weighted reservoir sampling patterns, which allows us to choose a random sample of elements for each of N input groups.
groupedWeightedReservoirSample(PTable<Integer, Pair<T, N>>, int[], Long) - Static method in class org.apache.crunch.lib.Sample
Same as the other groupedWeightedReservoirSample method, but include a seed for testing purposes.
groupingComparatorClass(Class<? extends RawComparator>) - Method in class org.apache.crunch.GroupingOptions.Builder
 
GroupingOptions - Class in org.apache.crunch
Options that can be passed to a groupByKey operation in order to exercise finer control over how the partitioning, grouping, and sorting of keys is performed.
GroupingOptions.Builder - Class in org.apache.crunch
Builder class for creating GroupingOptions instances.
GroupingOptions.Builder() - Constructor for class org.apache.crunch.GroupingOptions.Builder
 

H

handleExisting(Target.WriteMode, long, Configuration) - Method in interface org.apache.crunch.Target
Apply the given WriteMode to this Target instance.
handleOutputs(Configuration, Path, int) - Method in interface org.apache.crunch.io.PathTarget
Handles moving the output data for this target from a temporary location on the filesystem to its target path at the end of a MapReduce job.
has(int) - Method in class org.apache.crunch.types.writable.TupleWritable
Return true if tuple has an element at the position provided.
hashCode() - Method in class org.apache.crunch.io.FormatBundle
 
hashCode() - Method in class org.apache.crunch.Pair
 
hashCode() - Method in class org.apache.crunch.Tuple3
 
hashCode() - Method in class org.apache.crunch.Tuple4
 
hashCode() - Method in class org.apache.crunch.TupleN
 
hashCode() - Method in class org.apache.crunch.types.avro.AvroType
 
hashCode() - Method in class org.apache.crunch.types.writable.TupleWritable
 
hashCode() - Method in class org.apache.crunch.types.writable.WritableType
 
hasNext() - Method in class org.apache.crunch.contrib.text.Tokenizer
Returns true if the underlying Scanner has any tokens remaining.
hasReflect() - Method in class org.apache.crunch.types.avro.AvroType
Determine if the wrapped type is a reflection-based Avro type or wraps one.
hasSpecific() - Method in class org.apache.crunch.types.avro.AvroType
Determine if the wrapped type is a specific data Avro type or wraps one.

I

id - Variable in class org.apache.crunch.contrib.io.jdbc.IdentifiableName
 
IdentifiableName - Class in org.apache.crunch.contrib.io.jdbc
 
IdentifiableName() - Constructor for class org.apache.crunch.contrib.io.jdbc.IdentifiableName
 
IdentityFn<T> - Class in org.apache.crunch.fn
 
initialize(Configuration) - Method in interface org.apache.crunch.Aggregator
Perform any setup of this instance that is required prior to processing inputs.
initialize() - Method in class org.apache.crunch.contrib.bloomfilter.BloomFilterFn
 
initialize() - Method in class org.apache.crunch.contrib.text.AbstractCompositeExtractor
 
initialize() - Method in class org.apache.crunch.contrib.text.AbstractSimpleExtractor
 
initialize() - Method in interface org.apache.crunch.contrib.text.Extractor
Perform any initialization required by this Extractor during the start of a map or reduce task.
initialize() - Method in class org.apache.crunch.DoFn
Initialize this DoFn.
initialize(Configuration) - Method in class org.apache.crunch.fn.Aggregators.SimpleAggregator
 
initialize() - Method in class org.apache.crunch.fn.CompositeMapFn
 
initialize() - Method in class org.apache.crunch.fn.ExtractKeyFn
 
initialize() - Method in class org.apache.crunch.fn.PairMapFn
 
initialize() - Method in class org.apache.crunch.lib.Aggregate.TopKFn
 
initialize() - Method in class org.apache.crunch.lib.join.FullOuterJoinFn
Initialize this DoFn.
initialize() - Method in class org.apache.crunch.lib.join.InnerJoinFn
 
initialize() - Method in class org.apache.crunch.lib.join.JoinFn
 
initialize() - Method in class org.apache.crunch.lib.join.LeftOuterJoinFn
Initialize this DoFn.
initialize() - Method in class org.apache.crunch.lib.join.RightOuterJoinFn
Initialize this DoFn.
initialize() - Method in class org.apache.crunch.lib.sort.SortFns.AvroGenericFn
 
initialize(Configuration) - Method in class org.apache.crunch.types.avro.AvroType
 
initialize(Configuration) - Method in class org.apache.crunch.types.CollectionDeepCopier
 
initialize(Configuration) - Method in interface org.apache.crunch.types.DeepCopier
Initialize the deep copier with a job-specific configuration.
initialize(Configuration) - Method in class org.apache.crunch.types.DeepCopier.NoOpDeepCopier
 
initialize(Configuration) - Method in class org.apache.crunch.types.MapDeepCopier
 
initialize() - Method in class org.apache.crunch.types.PGroupedTableType.PairIterableMapFn
 
initialize(Configuration) - Method in interface org.apache.crunch.types.PType
Initialize this PType for use within a DoFn.
initialize(Configuration) - Method in class org.apache.crunch.types.TupleDeepCopier
 
initialize() - Method in class org.apache.crunch.types.TupleFactory
 
initialize(Configuration) - Method in class org.apache.crunch.types.writable.WritableDeepCopier
 
initialize(Configuration) - Method in class org.apache.crunch.types.writable.WritableType
 
innerJoin(PTable<K, U>, PTable<K, V>) - Static method in class org.apache.crunch.lib.Join
Performs an inner join on the specified PTables.
InnerJoinFn<K,U,V> - Class in org.apache.crunch.lib.join
Used to perform the last step of an inner join.
InnerJoinFn(PType<K>, PType<U>) - Constructor for class org.apache.crunch.lib.join.InnerJoinFn
 
inputConf(String, String) - Method in interface org.apache.crunch.Source
Adds the given key-value pair to the Configuration instance that is used to read this Source<T>.
intersection(PCollection<T>, PCollection<T>) - Static method in class org.apache.crunch.lib.Set
Compute the intersection of two sets of elements.
ints() - Static method in class org.apache.crunch.types.avro.Avros
 
ints() - Method in class org.apache.crunch.types.avro.AvroTypeFamily
 
ints() - Method in interface org.apache.crunch.types.PTypeFamily
 
ints() - Static method in class org.apache.crunch.types.writable.Writables
 
ints() - Method in class org.apache.crunch.types.writable.WritableTypeFamily
 
isCompatibleWith(GroupingOptions) - Method in class org.apache.crunch.GroupingOptions
 
isGeneric() - Method in class org.apache.crunch.types.avro.AvroType
Determine if the wrapped type is a generic data avro type.
iterator() - Method in class org.apache.crunch.impl.SingleUseIterable
 
iterator() - Method in class org.apache.crunch.io.CompositePathIterable
 
iterator() - Method in class org.apache.crunch.util.Tuples.PairIterable
 
iterator() - Method in class org.apache.crunch.util.Tuples.QuadIterable
 
iterator() - Method in class org.apache.crunch.util.Tuples.TripIterable
 
iterator() - Method in class org.apache.crunch.util.Tuples.TupleNIterable
 

J

Join - Class in org.apache.crunch.lib
Utilities for joining multiple PTable instances based on a common key.
Join() - Constructor for class org.apache.crunch.lib.Join
 
join(PTable<K, U>, PTable<K, V>, JoinType) - Method in class org.apache.crunch.lib.join.BloomFilterJoinStrategy
 
join(PTable<K, U>, PTable<K, V>, JoinType) - Method in class org.apache.crunch.lib.join.DefaultJoinStrategy
 
join(PTable<K, U>, PTable<K, V>, JoinFn<K, U, V>) - Method in class org.apache.crunch.lib.join.DefaultJoinStrategy
Perform a default join on the given PTable instances using a user-specified JoinFn.
join(K, int, Iterable<Pair<U, V>>, Emitter<Pair<K, Pair<U, V>>>) - Method in class org.apache.crunch.lib.join.FullOuterJoinFn
Performs the actual joining.
join(K, int, Iterable<Pair<U, V>>, Emitter<Pair<K, Pair<U, V>>>) - Method in class org.apache.crunch.lib.join.InnerJoinFn
 
join(PTable<K, U>, PTable<K, V>) - Static method in class org.apache.crunch.lib.Join
Performs an inner join on the specified PTables.
join(K, int, Iterable<Pair<U, V>>, Emitter<Pair<K, Pair<U, V>>>) - Method in class org.apache.crunch.lib.join.JoinFn
Performs the actual joining.
join(PTable<K, U>, PTable<K, V>, JoinType) - Method in interface org.apache.crunch.lib.join.JoinStrategy
Join two tables with the given join type.
join(K, int, Iterable<Pair<U, V>>, Emitter<Pair<K, Pair<U, V>>>) - Method in class org.apache.crunch.lib.join.LeftOuterJoinFn
Performs the actual joining.
join(PTable<K, U>, PTable<K, V>, JoinType) - Method in class org.apache.crunch.lib.join.MapsideJoinStrategy
 
join(K, int, Iterable<Pair<U, V>>, Emitter<Pair<K, Pair<U, V>>>) - Method in class org.apache.crunch.lib.join.RightOuterJoinFn
Performs the actual joining.
join(PTable<K, U>, PTable<K, V>, JoinType) - Method in class org.apache.crunch.lib.join.ShardedJoinStrategy
 
join(PTable<K, U>) - Method in interface org.apache.crunch.PTable
Perform an inner join on this table and the one passed in as an argument on their common keys.
JoinFn<K,U,V> - Class in org.apache.crunch.lib.join
Represents a DoFn for performing joins.
JoinFn(PType<K>, PType<U>) - Constructor for class org.apache.crunch.lib.join.JoinFn
Instantiate with the PType of the value of the left side of the join (used for creating deep copies of values).
JoinStrategy<K,U,V> - Interface in org.apache.crunch.lib.join
Defines a strategy for joining two PTables together on a common key.
JoinType - Enum in org.apache.crunch.lib.join
Specifies how a join should be performed in terms of requiring matching keys on both sides of the join.
JoinUtils - Class in org.apache.crunch.lib.join
Utilities that are useful in joining multiple data sets via a MapReduce.
JoinUtils() - Constructor for class org.apache.crunch.lib.join.JoinUtils
 
JoinUtils.AvroIndexedRecordPartitioner<K,V> - Class in org.apache.crunch.lib.join
 
JoinUtils.AvroIndexedRecordPartitioner() - Constructor for class org.apache.crunch.lib.join.JoinUtils.AvroIndexedRecordPartitioner
 
JoinUtils.AvroPairGroupingComparator<T> - Class in org.apache.crunch.lib.join
 
JoinUtils.AvroPairGroupingComparator() - Constructor for class org.apache.crunch.lib.join.JoinUtils.AvroPairGroupingComparator
 
JoinUtils.TupleWritableComparator - Class in org.apache.crunch.lib.join
 
JoinUtils.TupleWritableComparator() - Constructor for class org.apache.crunch.lib.join.JoinUtils.TupleWritableComparator
 
JoinUtils.TupleWritablePartitioner - Class in org.apache.crunch.lib.join
 
JoinUtils.TupleWritablePartitioner() - Constructor for class org.apache.crunch.lib.join.JoinUtils.TupleWritablePartitioner
 
jsons(Class<T>) - Static method in class org.apache.crunch.types.avro.Avros
 
jsons(Class<T>) - Static method in class org.apache.crunch.types.writable.Writables
 
jsonString(Class<T>, PTypeFamily) - Static method in class org.apache.crunch.types.PTypes
 
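The join entries above describe inner and left outer joins of two PTables on a common key. As a rough illustration of those semantics only, here is a minimal sketch that uses plain in-memory maps instead of the Crunch API. The class JoinSketch, its join method, and the sample tables are hypothetical names introduced for this example; they are not part of Crunch and do not reflect how JoinFn is implemented.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration class; not part of the Crunch API.
class JoinSketch {
    // Joins two multi-maps on their keys. Every left value is paired with every
    // right value that shares its key. When keepUnmatchedLeft is true, left values
    // without a matching key are kept and paired with null (left outer join);
    // otherwise they are dropped (inner join).
    static <K, U, V> List<String> join(Map<K, List<U>> left,
                                       Map<K, List<V>> right,
                                       boolean keepUnmatchedLeft) {
        List<String> out = new ArrayList<>();
        for (Map.Entry<K, List<U>> entry : left.entrySet()) {
            List<V> matches = right.get(entry.getKey());
            if (matches == null) {
                if (!keepUnmatchedLeft) {
                    continue; // inner join: no match on the right, so skip this key
                }
                matches = Collections.<V>singletonList(null);
            }
            for (U u : entry.getValue()) {
                for (V v : matches) {
                    out.add(entry.getKey() + "=(" + u + "," + v + ")");
                }
            }
        }
        return out;
    }

    // Sample left table: a -> [1], b -> [2]
    static Map<String, List<Integer>> sampleLeft() {
        Map<String, List<Integer>> m = new LinkedHashMap<>();
        m.put("a", Arrays.asList(1));
        m.put("b", Arrays.asList(2));
        return m;
    }

    // Sample right table: a -> [x, y], c -> [z]
    static Map<String, List<String>> sampleRight() {
        Map<String, List<String>> m = new LinkedHashMap<>();
        m.put("a", Arrays.asList("x", "y"));
        m.put("c", Arrays.asList("z"));
        return m;
    }

    public static void main(String[] args) {
        System.out.println(join(sampleLeft(), sampleRight(), false));
        System.out.println(join(sampleLeft(), sampleRight(), true));
    }
}
```

With the sample tables, the inner join keeps only key a, while the left outer join also keeps key b with a null right-hand value. Key c appears only on the right and is dropped in both cases.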

K

keep(Integer...) - Method in class org.apache.crunch.contrib.text.TokenizerFactory.Builder
Keep only the specified fields found by the input scanner, counting from zero.
keys(PTable<K, V>) - Static method in class org.apache.crunch.lib.PTables
Extract the keys from the given PTable<K, V> as a PCollection<K>.
keys() - Method in interface org.apache.crunch.PTable
Returns a PCollection made up of the keys in this PTable.
kill() - Method in interface org.apache.crunch.PipelineExecution
Kills the pipeline if it is running, no-op otherwise.
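The keep(Integer...) entry above describes retaining only selected fields, counted from zero. The following is a minimal sketch of that idea using java.util.Scanner directly. KeepFieldsSketch and its keep method are hypothetical names for this illustration and are not the TokenizerFactory.Builder implementation.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Scanner;
import java.util.Set;

// Hypothetical illustration class; not part of the Crunch API.
class KeepFieldsSketch {
    // Splits the line on the delimiter (a regular expression, as Scanner expects)
    // and returns only the tokens whose zero-based position is listed in indices.
    static List<String> keep(String line, String delimiter, Integer... indices) {
        Set<Integer> wanted = new HashSet<>(Arrays.asList(indices));
        List<String> kept = new ArrayList<>();
        try (Scanner scanner = new Scanner(line).useDelimiter(delimiter)) {
            for (int i = 0; scanner.hasNext(); i++) {
                String token = scanner.next();
                if (wanted.contains(i)) {
                    kept.add(token);
                }
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        System.out.println(keep("a,b,c,d", ",", 0, 2));
    }
}
```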

L

LAST_N(int) - Static method in class org.apache.crunch.fn.Aggregators
Return the last n values (or fewer if there are fewer values than n).
leftJoin(PTable<K, U>, PTable<K, V>) - Static method in class org.apache.crunch.lib.Join
Performs a left outer join on the specified PTables.
LeftOuterJoinFn<K,U,V> - Class in org.apache.crunch.lib.join
Used to perform the last step of a left outer join.
LeftOuterJoinFn(PType<K>, PType<U>) - Constructor for class org.apache.crunch.lib.join.LeftOuterJoinFn
 
length(PCollection<S>) - Static method in class org.apache.crunch.lib.Aggregate
Returns the number of elements in the provided PCollection.
length() - Method in interface org.apache.crunch.PCollection
Returns the number of elements represented by this PCollection.
lineParser(String, Class<M>) - Static method in class org.apache.crunch.types.Protos
 
locale(Locale) - Method in class org.apache.crunch.contrib.text.TokenizerFactory.Builder
Sets the Locale to use with the TokenizerFactory returned by this Builder instance.
longs() - Static method in class org.apache.crunch.types.avro.Avros
 
longs() - Method in class org.apache.crunch.types.avro.AvroTypeFamily
 
longs() - Method in interface org.apache.crunch.types.PTypeFamily
 
longs() - Static method in class org.apache.crunch.types.writable.Writables
 
longs() - Method in class org.apache.crunch.types.writable.WritableTypeFamily
 

M

main(String[]) - Static method in class org.apache.crunch.examples.AverageBytesByIP
 
main(String[]) - Static method in class org.apache.crunch.examples.SecondarySortExample
 
main(String[]) - Static method in class org.apache.crunch.examples.SortExample
 
main(String[]) - Static method in class org.apache.crunch.examples.TotalBytesByIP
 
main(String[]) - Static method in class org.apache.crunch.examples.WordAggregationHBase
 
main(String[]) - Static method in class org.apache.crunch.examples.WordCount
 
makeTuple(Object...) - Method in class org.apache.crunch.types.TupleFactory
 
map(R) - Method in class org.apache.crunch.fn.CompositeMapFn
 
map(V) - Method in class org.apache.crunch.fn.ExtractKeyFn
 
map(T) - Method in class org.apache.crunch.fn.IdentityFn
 
map(Pair<K, V>) - Method in class org.apache.crunch.fn.PairMapFn
 
map(PTable<K1, V1>, Class<? extends Mapper<K1, V1, K2, V2>>, Class<K2>, Class<V2>) - Static method in class org.apache.crunch.lib.Mapred
 
map(PTable<K1, V1>, Class<? extends Mapper<K1, V1, K2, V2>>, Class<K2>, Class<V2>) - Static method in class org.apache.crunch.lib.Mapreduce
 
map(V) - Method in class org.apache.crunch.lib.sort.SortFns.AvroGenericFn
 
map(V) - Method in class org.apache.crunch.lib.sort.SortFns.SingleKeyFn
 
map(V) - Method in class org.apache.crunch.lib.sort.SortFns.TupleKeyFn
 
map(S) - Method in class org.apache.crunch.MapFn
Maps the given input into an instance of the output type.
map(Pair<Object, Iterable<Object>>) - Method in class org.apache.crunch.types.PGroupedTableType.PairIterableMapFn
 
MapDeepCopier<T> - Class in org.apache.crunch.types
 
MapDeepCopier(PType<T>) - Constructor for class org.apache.crunch.types.MapDeepCopier
 
MapFn<S,T> - Class in org.apache.crunch
A DoFn for the common case of emitting exactly one value for each input record.
MapFn() - Constructor for class org.apache.crunch.MapFn
 
mapKeys(PTable<K1, V>, MapFn<K1, K2>, PType<K2>) - Static method in class org.apache.crunch.lib.PTables
Maps a PTable<K1, V> to a PTable<K2, V> using the given MapFn<K1, K2> on the keys of the PTable.
mapKeys(String, PTable<K1, V>, MapFn<K1, K2>, PType<K2>) - Static method in class org.apache.crunch.lib.PTables
Maps a PTable<K1, V> to a PTable<K2, V> using the given MapFn<K1, K2> on the keys of the PTable.
mapKeys(MapFn<K, K2>, PType<K2>) - Method in interface org.apache.crunch.PTable
Returns a PTable that has the same values as this instance, but uses the given function to map the keys.
mapKeys(String, MapFn<K, K2>, PType<K2>) - Method in interface org.apache.crunch.PTable
Returns a PTable that has the same values as this instance, but uses the given function to map the keys.
Mapred - Class in org.apache.crunch.lib
Static functions for working with legacy Mappers and Reducers that live under the org.apache.hadoop.mapred.* package as part of Crunch pipelines.
Mapred() - Constructor for class org.apache.crunch.lib.Mapred
 
Mapreduce - Class in org.apache.crunch.lib
Static functions for working with legacy Mappers and Reducers that live under the org.apache.hadoop.mapreduce.* package as part of Crunch pipelines.
Mapreduce() - Constructor for class org.apache.crunch.lib.Mapreduce
 
MapReduceTarget - Interface in org.apache.crunch.io
 
maps(PType<T>) - Static method in class org.apache.crunch.types.avro.Avros
 
maps(PType<T>) - Method in class org.apache.crunch.types.avro.AvroTypeFamily
 
maps(PType<T>) - Method in interface org.apache.crunch.types.PTypeFamily
 
maps(PType<T>) - Static method in class org.apache.crunch.types.writable.Writables
 
maps(PType<T>) - Method in class org.apache.crunch.types.writable.WritableTypeFamily
 
MapsideJoinStrategy<K,U,V> - Class in org.apache.crunch.lib.join
Utility for doing map side joins on a common key between two PTables.
MapsideJoinStrategy() - Constructor for class org.apache.crunch.lib.join.MapsideJoinStrategy
Constructs a new instance of the MapsideJoinStrategy, materializing the right-side join table to disk before the join is performed.
MapsideJoinStrategy(boolean) - Constructor for class org.apache.crunch.lib.join.MapsideJoinStrategy
Constructs a new instance of the MapsideJoinStrategy.
mapValues(PTable<K, U>, MapFn<U, V>, PType<V>) - Static method in class org.apache.crunch.lib.PTables
Maps a PTable<K, U> to a PTable<K, V> using the given MapFn<U, V> on the values of the PTable.
mapValues(String, PTable<K, U>, MapFn<U, V>, PType<V>) - Static method in class org.apache.crunch.lib.PTables
Maps a PTable<K, U> to a PTable<K, V> using the given MapFn<U, V> on the values of the PTable.
mapValues(PGroupedTable<K, U>, MapFn<Iterable<U>, V>, PType<V>) - Static method in class org.apache.crunch.lib.PTables
An analogue of the mapValues function for PGroupedTable<K, U> collections.
mapValues(String, PGroupedTable<K, U>, MapFn<Iterable<U>, V>, PType<V>) - Static method in class org.apache.crunch.lib.PTables
An analogue of the mapValues function for PGroupedTable<K, U> collections.
mapValues(MapFn<Iterable<V>, U>, PType<U>) - Method in interface org.apache.crunch.PGroupedTable
Maps the Iterable<V> elements of each record to a new type.
mapValues(String, MapFn<Iterable<V>, U>, PType<U>) - Method in interface org.apache.crunch.PGroupedTable
Maps the Iterable<V> elements of each record to a new type.
mapValues(MapFn<V, U>, PType<U>) - Method in interface org.apache.crunch.PTable
Returns a PTable that has the same keys as this instance, but uses the given function to map the values.
mapValues(String, MapFn<V, U>, PType<U>) - Method in interface org.apache.crunch.PTable
Returns a PTable that has the same keys as this instance, but uses the given function to map the values.
markLogged() - Method in exception org.apache.crunch.CrunchRuntimeException
Indicate that this exception has been written to the debug logs.
materialize(PCollection<T>) - Method in class org.apache.crunch.impl.mem.MemPipeline
 
materialize(PCollection<T>) - Method in class org.apache.crunch.impl.mr.MRPipeline
 
materialize() - Method in interface org.apache.crunch.PCollection
Returns a reference to the data set represented by this PCollection that may be used by the client to read the data locally.
materialize(PCollection<T>) - Method in interface org.apache.crunch.Pipeline
Create the given PCollection and read the data it contains into the returned Collection instance for client use.
materialize(PCollection<T>) - Method in class org.apache.crunch.util.CrunchTool
 
materializeToMap() - Method in interface org.apache.crunch.PTable
Returns a Map made up of the keys and values in this PTable.
max(PCollection<S>) - Static method in class org.apache.crunch.lib.Aggregate
Returns the largest numerical element from the input collection.
max() - Method in interface org.apache.crunch.PCollection
Returns a PObject of the maximum element of this instance.
MAX_BIGINTS() - Static method in class org.apache.crunch.fn.Aggregators
Return the maximum of all given BigInteger values.
MAX_BIGINTS(int) - Static method in class org.apache.crunch.fn.Aggregators
Return the n largest BigInteger values (or fewer if there are fewer values than n).
MAX_DOUBLES() - Static method in class org.apache.crunch.fn.Aggregators
Return the maximum of all given double values.
MAX_DOUBLES(int) - Static method in class org.apache.crunch.fn.Aggregators
Return the n largest double values (or fewer if there are fewer values than n).
MAX_FLOATS() - Static method in class org.apache.crunch.fn.Aggregators
Return the maximum of all given float values.
MAX_FLOATS(int) - Static method in class org.apache.crunch.fn.Aggregators
Return the n largest float values (or fewer if there are fewer values than n).
MAX_INTS() - Static method in class org.apache.crunch.fn.Aggregators
Return the maximum of all given int values.
MAX_INTS(int) - Static method in class org.apache.crunch.fn.Aggregators
Return the n largest int values (or fewer if there are fewer values than n).
MAX_LONGS() - Static method in class org.apache.crunch.fn.Aggregators
Return the maximum of all given long values.
MAX_LONGS(int) - Static method in class org.apache.crunch.fn.Aggregators
Return the n largest long values (or fewer if there are fewer values than n).
MAX_N(int, Class<V>) - Static method in class org.apache.crunch.fn.Aggregators
Return the n largest values (or fewer if there are fewer values than n).
MAX_REDUCERS - Static variable in class org.apache.crunch.util.PartitionUtils
Set an upper limit on the number of reducers the Crunch planner will set for an MR job when it tries to determine how many reducers to use based on the input size.
MemPipeline - Class in org.apache.crunch.impl.mem
 
min(PCollection<S>) - Static method in class org.apache.crunch.lib.Aggregate
Returns the smallest numerical element from the input collection.
min() - Method in interface org.apache.crunch.PCollection
Returns a PObject of the minimum element of this instance.
MIN_BIGINTS() - Static method in class org.apache.crunch.fn.Aggregators
Return the minimum of all given BigInteger values.
MIN_BIGINTS(int) - Static method in class org.apache.crunch.fn.Aggregators
Return the n smallest BigInteger values (or fewer if there are fewer values than n).
MIN_DOUBLES() - Static method in class org.apache.crunch.fn.Aggregators
Return the minimum of all given double values.
MIN_DOUBLES(int) - Static method in class org.apache.crunch.fn.Aggregators
Return the n smallest double values (or fewer if there are fewer values than n).
MIN_FLOATS() - Static method in class org.apache.crunch.fn.Aggregators
Return the minimum of all given float values.
MIN_FLOATS(int) - Static method in class org.apache.crunch.fn.Aggregators
Return the n smallest float values (or fewer if there are fewer values than n).
MIN_INTS() - Static method in class org.apache.crunch.fn.Aggregators
Return the minimum of all given int values.
MIN_INTS(int) - Static method in class org.apache.crunch.fn.Aggregators
Return the n smallest int values (or fewer if there are fewer values than n).
MIN_LONGS() - Static method in class org.apache.crunch.fn.Aggregators
Return the minimum of all given long values.
MIN_LONGS(int) - Static method in class org.apache.crunch.fn.Aggregators
Return the n smallest long values (or fewer if there are fewer values than n).
MIN_N(int, Class<V>) - Static method in class org.apache.crunch.fn.Aggregators
Return the n smallest values (or fewer if there are fewer values than n).
MRJob - Interface in org.apache.crunch.impl.mr
A Hadoop MapReduce job managed by Crunch.
MRJob.State - Enum in org.apache.crunch.impl.mr
A job will be in one of the following states.
MRPipeline - Class in org.apache.crunch.impl.mr
Pipeline implementation that is executed within Hadoop MapReduce.
MRPipeline(Class<?>) - Constructor for class org.apache.crunch.impl.mr.MRPipeline
Instantiate with a default Configuration and name.
MRPipeline(Class<?>, String) - Constructor for class org.apache.crunch.impl.mr.MRPipeline
Instantiate with a custom pipeline name.
MRPipeline(Class<?>, Configuration) - Constructor for class org.apache.crunch.impl.mr.MRPipeline
Instantiate with a custom configuration and default naming.
MRPipeline(Class<?>, String, Configuration) - Constructor for class org.apache.crunch.impl.mr.MRPipeline
Instantiate with a custom name and configuration.
MRPipelineExecution - Interface in org.apache.crunch.impl.mr
 
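Several Aggregators entries above, such as MAX_INTS(int), are documented as returning the n largest values, or fewer if fewer than n values exist. A minimal sketch of that contract with a bounded min-heap follows. TopNSketch and maxN are hypothetical names, and returning the result in descending order is an assumption of this example rather than documented Crunch behavior.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical illustration class; not part of the Crunch API.
class TopNSketch {
    // Returns the n largest values in descending order, or all of the values
    // if there are fewer than n of them.
    static List<Integer> maxN(Iterable<Integer> values, int n) {
        PriorityQueue<Integer> heap = new PriorityQueue<>(); // min-heap
        for (int value : values) {
            heap.add(value);
            if (heap.size() > n) {
                heap.poll(); // evict the smallest so at most n values remain
            }
        }
        List<Integer> result = new ArrayList<>(heap);
        result.sort(Collections.reverseOrder());
        return result;
    }

    public static void main(String[] args) {
        System.out.println(maxN(Arrays.asList(5, 1, 9, 3), 2));
        System.out.println(maxN(Arrays.asList(4), 3));
    }
}
```

Keeping only n values in the heap bounds memory by n rather than by the input size, which is the usual reason to prefer this shape over sorting the whole input.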

N

name - Variable in class org.apache.crunch.contrib.io.jdbc.IdentifiableName
 
next() - Method in class org.apache.crunch.contrib.text.Tokenizer
Advance this Tokenizer and return the next String from the Scanner.
nextBoolean() - Method in class org.apache.crunch.contrib.text.Tokenizer
Advance this Tokenizer and return the next Boolean from the Scanner.
nextDouble() - Method in class org.apache.crunch.contrib.text.Tokenizer
Advance this Tokenizer and return the next Double from the Scanner.
nextFloat() - Method in class org.apache.crunch.contrib.text.Tokenizer
Advance this Tokenizer and return the next Float from the Scanner.
nextInt() - Method in class org.apache.crunch.contrib.text.Tokenizer
Advance this Tokenizer and return the next Integer from the Scanner.
nextLong() - Method in class org.apache.crunch.contrib.text.Tokenizer
Advance this Tokenizer and return the next Long from the Scanner.
not(FilterFn<S>) - Static method in class org.apache.crunch.fn.FilterFns
Accept an entry if the given filter does not accept it.
nulls() - Static method in class org.apache.crunch.types.avro.Avros
 
nulls() - Method in class org.apache.crunch.types.avro.AvroTypeFamily
 
nulls() - Method in interface org.apache.crunch.types.PTypeFamily
 
nulls() - Static method in class org.apache.crunch.types.writable.Writables
 
nulls() - Method in class org.apache.crunch.types.writable.WritableTypeFamily
 
numReducers(int) - Method in class org.apache.crunch.GroupingOptions.Builder
 

O

of(T, U) - Static method in class org.apache.crunch.Pair
 
of(A, B, C) - Static method in class org.apache.crunch.Tuple3
 
of(A, B, C, D) - Static method in class org.apache.crunch.Tuple4
 
of(Object...) - Static method in class org.apache.crunch.TupleN
 
OneToManyJoin - Class in org.apache.crunch.lib.join
Optimized join for situations where exactly one value is being joined with any other number of values based on a common key.
OneToManyJoin() - Constructor for class org.apache.crunch.lib.join.OneToManyJoin
 
oneToManyJoin(PTable<K, U>, PTable<K, V>, DoFn<Pair<U, Iterable<V>>, T>, PType<T>) - Static method in class org.apache.crunch.lib.join.OneToManyJoin
Performs a join on two tables, where the left table only contains a single value per key.
oneToManyJoin(PTable<K, U>, PTable<K, V>, DoFn<Pair<U, Iterable<V>>, T>, PType<T>, int) - Static method in class org.apache.crunch.lib.join.OneToManyJoin
Supports a user-specified number of reducers for the one-to-many join.
or(FilterFn<S>, FilterFn<S>) - Static method in class org.apache.crunch.fn.FilterFns
Accept an entry if at least one of the given filters accepts it, using short-circuit evaluation.
or(FilterFn<S>...) - Static method in class org.apache.crunch.fn.FilterFns
Accept an entry if at least one of the given filters accepts it, using short-circuit evaluation.
order() - Method in class org.apache.crunch.lib.Sort.ColumnOrder
 
org.apache.crunch - package org.apache.crunch
Client-facing API and core abstractions.
org.apache.crunch.contrib - package org.apache.crunch.contrib
User contributions that may be interesting for special applications.
org.apache.crunch.contrib.bloomfilter - package org.apache.crunch.contrib.bloomfilter
Support for creating Bloom Filters.
org.apache.crunch.contrib.io.jdbc - package org.apache.crunch.contrib.io.jdbc
Support for reading data from RDBMS using JDBC.
org.apache.crunch.contrib.text - package org.apache.crunch.contrib.text
 
org.apache.crunch.examples - package org.apache.crunch.examples
Example applications demonstrating various aspects of Crunch.
org.apache.crunch.fn - package org.apache.crunch.fn
Commonly used functions for manipulating collections.
org.apache.crunch.impl - package org.apache.crunch.impl
 
org.apache.crunch.impl.mem - package org.apache.crunch.impl.mem
In-memory Pipeline implementation for rapid prototyping and testing.
org.apache.crunch.impl.mr - package org.apache.crunch.impl.mr
A Pipeline implementation that runs on Hadoop MapReduce.
org.apache.crunch.io - package org.apache.crunch.io
Data input and output for Pipelines.
org.apache.crunch.lib - package org.apache.crunch.lib
Joining, sorting, aggregating, and other commonly used functionality.
org.apache.crunch.lib.join - package org.apache.crunch.lib.join
Inner and outer joins on collections.
org.apache.crunch.lib.sort - package org.apache.crunch.lib.sort
 
org.apache.crunch.test - package org.apache.crunch.test
Utilities for testing Crunch-based applications.
org.apache.crunch.types - package org.apache.crunch.types
Common functionality for business object serialization.
org.apache.crunch.types.avro - package org.apache.crunch.types.avro
Business object serialization using Apache Avro.
org.apache.crunch.types.writable - package org.apache.crunch.types.writable
Business object serialization using Hadoop's Writables framework.
org.apache.crunch.util - package org.apache.crunch.util
An assorted set of utilities.
outputConf(String, String) - Method in interface org.apache.crunch.Target
Adds the given key-value pair to the Configuration instance that is used to write this Target.
OutputHandler - Interface in org.apache.crunch.io
 
outputKey(S) - Method in interface org.apache.crunch.types.Converter
 
outputValue(S) - Method in interface org.apache.crunch.types.Converter
 
overridePathProperties(Configuration) - Method in class org.apache.crunch.test.TemporaryPath
Set all keys specified in the constructor to temporary directories.
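The FilterFns entries not(FilterFn<S>) and or(FilterFn<S>...) in this index describe negation and a short-circuiting disjunction of filters. The sketch below illustrates that behavior with java.util.function.Predicate standing in for FilterFn. FilterSketch is a hypothetical class for this example and is not how FilterFns is implemented.

```java
import java.util.function.Predicate;

// Hypothetical illustration class; Predicate stands in for Crunch's FilterFn.
class FilterSketch {
    // Accepts a value if at least one filter accepts it. Filters are tried in
    // order and evaluation stops at the first one that accepts (short-circuit).
    @SafeVarargs
    static <S> Predicate<S> or(Predicate<S>... filters) {
        return value -> {
            for (Predicate<S> filter : filters) {
                if (filter.test(value)) {
                    return true;
                }
            }
            return false;
        };
    }

    // Accepts a value exactly when the given filter rejects it.
    static <S> Predicate<S> not(Predicate<S> filter) {
        return value -> !filter.test(value);
    }

    public static void main(String[] args) {
        Predicate<Integer> even = x -> x % 2 == 0;
        Predicate<Integer> negative = x -> x < 0;
        System.out.println(or(even, negative).test(4));
        System.out.println(not(even).test(4));
    }
}
```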

P

Pair<K,V> - Class in org.apache.crunch
A convenience class for two-element Tuples.
Pair(K, V) - Constructor for class org.apache.crunch.Pair
 
PAIR - Static variable in class org.apache.crunch.types.TupleFactory
 
pairAggregator(Aggregator<V1>, Aggregator<V2>) - Static method in class org.apache.crunch.fn.Aggregators
Apply separate aggregators to each component of a Pair.
PairMapFn<K,V,S,T> - Class in org.apache.crunch.fn
 
PairMapFn(MapFn<K, S>, MapFn<V, T>) - Constructor for class org.apache.crunch.fn.PairMapFn
 
pairs(PType<V1>, PType<V2>) - Static method in class org.apache.crunch.types.avro.Avros
 
pairs(PType<V1>, PType<V2>) - Method in class org.apache.crunch.types.avro.AvroTypeFamily
 
pairs(PType<V1>, PType<V2>) - Method in interface org.apache.crunch.types.PTypeFamily
 
pairs(PType<V1>, PType<V2>) - Static method in class org.apache.crunch.types.writable.Writables
 
pairs(PType<V1>, PType<V2>) - Method in class org.apache.crunch.types.writable.WritableTypeFamily
 
parallelDo(DoFn<S, T>, PType<T>) - Method in interface org.apache.crunch.PCollection
Applies the given doFn to the elements of this PCollection and returns a new PCollection that is the output of this processing.
parallelDo(String, DoFn<S, T>, PType<T>) - Method in interface org.apache.crunch.PCollection
Applies the given doFn to the elements of this PCollection and returns a new PCollection that is the output of this processing.
parallelDo(String, DoFn<S, T>, PType<T>, ParallelDoOptions) - Method in interface org.apache.crunch.PCollection
Applies the given doFn to the elements of this PCollection and returns a new PCollection that is the output of this processing.
parallelDo(DoFn<S, Pair<K, V>>, PTableType<K, V>) - Method in interface org.apache.crunch.PCollection
Similar to the other parallelDo instance, but returns a PTable instance instead of a PCollection.
parallelDo(String, DoFn<S, Pair<K, V>>, PTableType<K, V>) - Method in interface org.apache.crunch.PCollection
Similar to the other parallelDo instance, but returns a PTable instance instead of a PCollection.
parallelDo(String, DoFn<S, Pair<K, V>>, PTableType<K, V>, ParallelDoOptions) - Method in interface org.apache.crunch.PCollection
Similar to the other parallelDo instance, but returns a PTable instance instead of a PCollection.
ParallelDoOptions - Class in org.apache.crunch
Container class that includes optional information about a parallelDo operation applied to a PCollection.
ParallelDoOptions.Builder - Class in org.apache.crunch
 
ParallelDoOptions.Builder() - Constructor for class org.apache.crunch.ParallelDoOptions.Builder
 
Parse - Class in org.apache.crunch.contrib.text
Methods for parsing instances of PCollection<String> into PCollections of strongly typed tuples.
parse(String, PCollection<String>, Extractor<T>) - Static method in class org.apache.crunch.contrib.text.Parse
Parses the lines of the input PCollection<String> and returns a PCollection<T> using the given Extractor<T>.
parse(String, PCollection<String>, PTypeFamily, Extractor<T>) - Static method in class org.apache.crunch.contrib.text.Parse
Parses the lines of the input PCollection<String> and returns a PCollection<T> using the given Extractor<T> that uses the given PTypeFamily.
parseTable(String, PCollection<String>, Extractor<Pair<K, V>>) - Static method in class org.apache.crunch.contrib.text.Parse
Parses the lines of the input PCollection<String> and returns a PTable<K, V> using the given Extractor<Pair<K, V>>.
parseTable(String, PCollection<String>, PTypeFamily, Extractor<Pair<K, V>>) - Static method in class org.apache.crunch.contrib.text.Parse
Parses the lines of the input PCollection<String> and returns a PTable<K, V> using the given Extractor<Pair<K, V>> that uses the given PTypeFamily.
PARTITIONER_PATH - Static variable in class org.apache.crunch.lib.sort.TotalOrderPartitioner
 
partitionerClass(Class<? extends Partitioner>) - Method in class org.apache.crunch.GroupingOptions.Builder
 
PartitionUtils - Class in org.apache.crunch.util
Helper functions and settings for determining the number of reducers to use in a pipeline job created by the Crunch planner.
PartitionUtils() - Constructor for class org.apache.crunch.util.PartitionUtils
 
PathTarget - Interface in org.apache.crunch.io
A target whose output goes to a given path on a file system.
PCollection<S> - Interface in org.apache.crunch
A representation of an immutable, distributed collection of elements that is the fundamental target of computations in Crunch.
PGroupedTable<K,V> - Interface in org.apache.crunch
The Crunch representation of a grouped PTable, which corresponds to the output of the shuffle phase of a MapReduce job.
PGroupedTableType<K,V> - Class in org.apache.crunch.types
The PType instance for PGroupedTable instances.
PGroupedTableType(PTableType<K, V>) - Constructor for class org.apache.crunch.types.PGroupedTableType
 
PGroupedTableType.PairIterableMapFn<K,V> - Class in org.apache.crunch.types
 
PGroupedTableType.PairIterableMapFn(MapFn<Object, K>, MapFn<Object, V>) - Constructor for class org.apache.crunch.types.PGroupedTableType.PairIterableMapFn
 
Pipeline - Interface in org.apache.crunch
Manages the state of a pipeline execution.
PipelineExecution - Interface in org.apache.crunch
A handle to allow clients to control a Crunch pipeline as it runs.
PipelineExecution.Status - Enum in org.apache.crunch
 
PipelineResult - Class in org.apache.crunch
Container for the results of a call to run or done on the Pipeline interface that includes details and statistics about the component stages of the data pipeline.
PipelineResult(List<PipelineResult.StageResult>, PipelineExecution.Status) - Constructor for class org.apache.crunch.PipelineResult
 
PipelineResult.StageResult - Class in org.apache.crunch
 
PipelineResult.StageResult(String, Counters) - Constructor for class org.apache.crunch.PipelineResult.StageResult
 
PipelineResult.StageResult(String, String, Counters) - Constructor for class org.apache.crunch.PipelineResult.StageResult
 
plan() - Method in class org.apache.crunch.impl.mr.MRPipeline
 
PObject<T> - Interface in org.apache.crunch
A PObject represents a singleton object value that results from a distributed computation.
process(S, Emitter<Pair<String, BloomFilter>>) - Method in class org.apache.crunch.contrib.bloomfilter.BloomFilterFn
 
process(S, Emitter<T>) - Method in class org.apache.crunch.DoFn
Processes the records from a PCollection.
process(T, Emitter<T>) - Method in class org.apache.crunch.FilterFn
 
process(Pair<Integer, Iterable<Pair<K, V>>>, Emitter<Pair<Integer, Pair<K, V>>>) - Method in class org.apache.crunch.lib.Aggregate.TopKCombineFn
 
process(Pair<K, V>, Emitter<Pair<Integer, Pair<K, V>>>) - Method in class org.apache.crunch.lib.Aggregate.TopKFn
 
process(Pair<Pair<K, Integer>, Iterable<Pair<U, V>>>, Emitter<Pair<K, Pair<U, V>>>) - Method in class org.apache.crunch.lib.join.JoinFn
Split up the input record to make coding a bit more manageable.
process(S, Emitter<T>) - Method in class org.apache.crunch.MapFn
 
Protos - Class in org.apache.crunch.types
Utility functions for working with protocol buffers in Crunch.
Protos() - Constructor for class org.apache.crunch.types.Protos
 
protos(Class<T>, PTypeFamily) - Static method in class org.apache.crunch.types.PTypes
 
PTable<K,V> - Interface in org.apache.crunch
A sub-interface of PCollection that represents an immutable, distributed multi-map of keys and values.
PTables - Class in org.apache.crunch.lib
Methods for performing common operations on PTables.
PTables() - Constructor for class org.apache.crunch.lib.PTables
 
PTableType<K,V> - Interface in org.apache.crunch.types
An extension of PType specifically for PTable objects.
PType<T> - Interface in org.apache.crunch.types
A PType defines a mapping between a data type that is used in a Crunch pipeline and a serialization and storage format that is used to read/write data from/to HDFS.
PTypeFamily - Interface in org.apache.crunch.types
An abstract factory for creating PType instances that have the same serialization/storage backing format.
PTypes - Class in org.apache.crunch.types
Utility functions for creating common types of derived PTypes, e.g., for JSON data, protocol buffers, and Thrift records.
PTypes() - Constructor for class org.apache.crunch.types.PTypes
 
PTypeUtils - Class in org.apache.crunch.types
Utilities for converting between PTypes from different PTypeFamily implementations.

Q

quadAggregator(Aggregator<V1>, Aggregator<V2>, Aggregator<V3>, Aggregator<V4>) - Static method in class org.apache.crunch.fn.Aggregators
Apply separate aggregators to each component of a Tuple4.
quads(PType<V1>, PType<V2>, PType<V3>, PType<V4>) - Static method in class org.apache.crunch.types.avro.Avros
 
quads(PType<V1>, PType<V2>, PType<V3>, PType<V4>) - Method in class org.apache.crunch.types.avro.AvroTypeFamily
 
quads(PType<V1>, PType<V2>, PType<V3>, PType<V4>) - Method in interface org.apache.crunch.types.PTypeFamily
 
quads(PType<V1>, PType<V2>, PType<V3>, PType<V4>) - Static method in class org.apache.crunch.types.writable.Writables
 
quads(PType<V1>, PType<V2>, PType<V3>, PType<V4>) - Method in class org.apache.crunch.types.writable.WritableTypeFamily
 

R

read(Source<T>) - Method in class org.apache.crunch.impl.mem.MemPipeline
 
read(TableSource<K, V>) - Method in class org.apache.crunch.impl.mem.MemPipeline
 
read(Source<S>) - Method in class org.apache.crunch.impl.mr.MRPipeline
 
read(TableSource<K, V>) - Method in class org.apache.crunch.impl.mr.MRPipeline
 
read(FileSystem, Path) - Method in interface org.apache.crunch.io.FileReaderFactory
 
read(Configuration) - Method in interface org.apache.crunch.io.ReadableSource
Returns an Iterable that contains the contents of this source.
read(Source<T>) - Method in interface org.apache.crunch.Pipeline
Converts the given Source into a PCollection that is available to jobs run using this Pipeline instance.
read(TableSource<K, V>) - Method in interface org.apache.crunch.Pipeline
A version of the read method for TableSource instances that map to PTables.
read(TaskInputOutputContext<?, ?, ?, ?>) - Method in interface org.apache.crunch.ReadableData
Read the data referenced by this instance within the given context.
read(Source<T>) - Method in class org.apache.crunch.util.CrunchTool
 
read(TableSource<K, V>) - Method in class org.apache.crunch.util.CrunchTool
 
read(Configuration, Path) - Static method in class org.apache.crunch.util.DistCache
 
ReadableData<T> - Interface in org.apache.crunch
Represents the contents of a data source that can be read on the cluster from within one of the tasks running as part of a Crunch pipeline.
ReadableSource<T> - Interface in org.apache.crunch.io
An extension of the Source interface that indicates that a Source instance may be read as a series of records by the client code.
ReadableSourceTarget<T> - Interface in org.apache.crunch.io
An interface that indicates that a SourceTarget instance can be read into the local client.
readFields(DataInput) - Method in class org.apache.crunch.contrib.io.jdbc.IdentifiableName
 
readFields(ResultSet) - Method in class org.apache.crunch.contrib.io.jdbc.IdentifiableName
 
readFields(DataInput) - Method in class org.apache.crunch.io.FormatBundle
 
readFields(DataInput) - Method in class org.apache.crunch.types.writable.TupleWritable
 
readTextFile(String) - Method in class org.apache.crunch.impl.mem.MemPipeline
 
readTextFile(String) - Method in class org.apache.crunch.impl.mr.MRPipeline
 
readTextFile(String) - Method in interface org.apache.crunch.Pipeline
A convenience method for reading a text file.
readTextFile(String) - Method in class org.apache.crunch.util.CrunchTool
 
records(Class<T>) - Static method in class org.apache.crunch.types.avro.Avros
 
records(Class<T>) - Method in class org.apache.crunch.types.avro.AvroTypeFamily
 
records(Class<T>) - Method in interface org.apache.crunch.types.PTypeFamily
 
records(Class<T>) - Static method in class org.apache.crunch.types.writable.Writables
 
records(Class<T>) - Method in class org.apache.crunch.types.writable.WritableTypeFamily
 
reduce(PGroupedTable<K1, V1>, Class<? extends Reducer<K1, V1, K2, V2>>, Class<K2>, Class<V2>) - Static method in class org.apache.crunch.lib.Mapred
 
reduce(PGroupedTable<K1, V1>, Class<? extends Reducer<K1, V1, K2, V2>>, Class<K2>, Class<V2>) - Static method in class org.apache.crunch.lib.Mapreduce
 
REFLECT_DATA_FACTORY - Static variable in class org.apache.crunch.types.avro.Avros
The instance we use for generating reflected schemas.
REFLECT_DATA_FACTORY_CLASS - Static variable in class org.apache.crunch.types.avro.Avros
The name of the configuration parameter that tracks which reflection factory to use.
ReflectDataFactory - Class in org.apache.crunch.types.avro
A Factory class for constructing Avro reflection-related objects.
ReflectDataFactory() - Constructor for class org.apache.crunch.types.avro.ReflectDataFactory
 
reflects(Class<T>) - Static method in class org.apache.crunch.types.avro.Avros
 
register(Class<T>, AvroType<T>) - Static method in class org.apache.crunch.types.avro.Avros
 
register(Class<T>, WritableType<T, ? extends Writable>) - Static method in class org.apache.crunch.types.writable.Writables
 
REJECT_ALL() - Static method in class org.apache.crunch.fn.FilterFns
Reject everything.
reservoirSample(PCollection<T>, int) - Static method in class org.apache.crunch.lib.Sample
Select a fixed number of elements from the given PCollection with each element equally likely to be included in the sample.
reservorSample(PCollection<T>, int, Long) - Static method in class org.apache.crunch.lib.Sample
A version of the reservoir sampling algorithm that uses a given seed, primarily for testing purposes.
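The reservoir sampling idea behind these two entries can be sketched in plain Java. This is a minimal illustration of the classic single-pass algorithm, not the Crunch implementation; the class name ReservoirSketch and its sample method are hypothetical, and it runs on an in-memory Iterable rather than a PCollection.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

/** Plain-Java sketch of single-pass reservoir sampling; hypothetical, not Crunch code. */
final class ReservoirSketch {
    /** Returns up to k items, with every input item equally likely to be kept. */
    static <T> List<T> sample(Iterable<T> input, int k, long seed) {
        Random rng = new Random(seed);          // a fixed seed makes runs repeatable
        List<T> reservoir = new ArrayList<>(k);
        long seen = 0;
        for (T item : input) {
            seen++;
            if (reservoir.size() < k) {
                reservoir.add(item);            // fill the reservoir first
            } else {
                // Keep the new item with probability k / seen by drawing a slot in [0, seen).
                long slot = (long) (rng.nextDouble() * seen);
                if (slot < k) {
                    reservoir.set((int) slot, item);
                }
            }
        }
        return reservoir;
    }

    public static void main(String[] args) {
        System.out.println(sample(List.of(1, 2, 3, 4, 5, 6, 7, 8), 3, 42L));
    }
}
```

Passing the same seed twice yields the same sample, which is why a seeded variant is convenient in tests.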
reset() - Method in interface org.apache.crunch.Aggregator
Clears the internal state of this Aggregator and prepares it for the values associated with the next key.
results() - Method in interface org.apache.crunch.Aggregator
Returns the current aggregated state of this instance.
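The reset/results life cycle described above can be illustrated with a small plain-Java sketch. MiniAggregator, SumLongs and AggregatorSketch are hypothetical names invented for this example; they only mirror the idea of clearing state between keys and are not the Crunch Aggregator interface itself.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for an aggregator; the names are illustrative only. */
interface MiniAggregator<T> {
    void reset();            // clear state before the next key's values
    void update(T value);    // fold one value into the running state
    Iterable<T> results();   // current aggregated state
}

final class SumLongs implements MiniAggregator<Long> {
    private long sum;
    @Override public void reset() { sum = 0L; }
    @Override public void update(Long value) { sum += value; }
    @Override public Iterable<Long> results() { return List.of(sum); }
}

final class AggregatorSketch {
    /** Reuses one aggregator instance across keys, calling reset() between them. */
    static <K> Map<K, Long> sumPerKey(Map<K, List<Long>> grouped) {
        MiniAggregator<Long> agg = new SumLongs();
        Map<K, Long> out = new LinkedHashMap<>();
        for (Map.Entry<K, List<Long>> entry : grouped.entrySet()) {
            agg.reset();
            for (Long value : entry.getValue()) {
                agg.update(value);
            }
            out.put(entry.getKey(), agg.results().iterator().next());
        }
        return out;
    }
}
```

Without the reset() call, the sum for each key would include the values of every earlier key.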
ReverseAvroComparator<T> - Class in org.apache.crunch.lib.sort
 
ReverseAvroComparator() - Constructor for class org.apache.crunch.lib.sort.ReverseAvroComparator
 
ReverseWritableComparator<T> - Class in org.apache.crunch.lib.sort
 
ReverseWritableComparator() - Constructor for class org.apache.crunch.lib.sort.ReverseWritableComparator
 
rightJoin(PTable<K, U>, PTable<K, V>) - Static method in class org.apache.crunch.lib.Join
Performs a right outer join on the specified PTables.
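As a reminder of right outer join semantics, here is a plain-Java sketch over in-memory maps. It is an illustration under stated assumptions, not the Crunch code path: RightJoinSketch is a hypothetical class, each table is modeled as a Map from key to list of values, and an unmatched right-side row is paired with null.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;

/** Plain-Java sketch of a right outer join; hypothetical, not Crunch code. */
final class RightJoinSketch {
    /** Emits [key, leftValue, rightValue]; leftValue is null when no left row matches. */
    static <K, U, V> List<List<Object>> rightJoin(Map<K, List<U>> left, Map<K, List<V>> right) {
        List<List<Object>> out = new ArrayList<>();
        for (Map.Entry<K, List<V>> entry : right.entrySet()) {
            List<U> matches = left.get(entry.getKey());
            for (V rightValue : entry.getValue()) {
                if (matches == null || matches.isEmpty()) {
                    // Every right-side row survives, even without a partner.
                    out.add(Arrays.asList(entry.getKey(), null, rightValue));
                } else {
                    for (U leftValue : matches) {
                        out.add(Arrays.asList(entry.getKey(), leftValue, rightValue));
                    }
                }
            }
        }
        return out;
    }
}
```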
RightOuterJoinFn<K,U,V> - Class in org.apache.crunch.lib.join
Used to perform the last step of a right outer join.
RightOuterJoinFn(PType<K>, PType<U>) - Constructor for class org.apache.crunch.lib.join.RightOuterJoinFn
 
run(String[]) - Method in class org.apache.crunch.examples.AverageBytesByIP
 
run(String[]) - Method in class org.apache.crunch.examples.SecondarySortExample
 
run(String[]) - Method in class org.apache.crunch.examples.SortExample
 
run(String[]) - Method in class org.apache.crunch.examples.TotalBytesByIP
 
run(String[]) - Method in class org.apache.crunch.examples.WordAggregationHBase
 
run(String[]) - Method in class org.apache.crunch.examples.WordCount
 
run() - Method in class org.apache.crunch.impl.mem.MemPipeline
 
run() - Method in class org.apache.crunch.impl.mr.MRPipeline
 
run() - Method in interface org.apache.crunch.Pipeline
Constructs and executes a series of MapReduce jobs in order to write data to the output targets.
run() - Method in class org.apache.crunch.util.CrunchTool
 
runAsync() - Method in class org.apache.crunch.impl.mem.MemPipeline
 
runAsync() - Method in class org.apache.crunch.impl.mr.MRPipeline
 
runAsync() - Method in interface org.apache.crunch.Pipeline
Constructs and starts a series of MapReduce jobs in order to write data to the output targets, but returns a ListenableFuture to allow clients to control job execution.
runAsync() - Method in class org.apache.crunch.util.CrunchTool
 

S

Sample - Class in org.apache.crunch.lib
Methods for performing random sampling in a distributed fashion, either by accepting each record in a PCollection with an independent probability in order to sample some fraction of the overall data set, or by using reservoir sampling in order to pull a uniform or weighted sample of fixed size from a PCollection of an unknown size.
Sample() - Constructor for class org.apache.crunch.lib.Sample
 
sample(PCollection<S>, double) - Static method in class org.apache.crunch.lib.Sample
Output records from the given PCollection with the given probability.
sample(PCollection<S>, Long, double) - Static method in class org.apache.crunch.lib.Sample
Output records from the given PCollection using a given seed.
sample(PTable<K, V>, double) - Static method in class org.apache.crunch.lib.Sample
A PTable<K, V> analogue of the sample function.
sample(PTable<K, V>, Long, double) - Static method in class org.apache.crunch.lib.Sample
A PTable<K, V> analogue of the sample function, with the seed argument exposed for testing purposes.
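Probability-based sampling, as described in the sample entries above, amounts to an independent coin flip per record. The sketch below shows that idea in plain Java on an in-memory Iterable; BernoulliSampleSketch is a hypothetical name, and the argument order simply mirrors the seeded signature listed here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

/** Plain-Java sketch of independent per-record sampling; hypothetical, not Crunch code. */
final class BernoulliSampleSketch {
    /** Keeps each record independently with the given probability. */
    static <T> List<T> sample(Iterable<T> input, long seed, double probability) {
        Random rng = new Random(seed);      // seeded so that tests are repeatable
        List<T> out = new ArrayList<>();
        for (T item : input) {
            // nextDouble() is uniform in [0, 1), so this accepts with the given probability.
            if (rng.nextDouble() < probability) {
                out.add(item);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(sample(List.of("a", "b", "c", "d", "e"), 7L, 0.5));
    }
}
```

Unlike reservoir sampling, the output size here is random; only its expected fraction of the input is fixed.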
SAMPLE_UNIQUE_ELEMENTS(int) - Static method in class org.apache.crunch.fn.Aggregators
Collect a sample of unique elements from the input, where 'unique' is defined by the equals method for the input objects.
scaleFactor() - Method in class org.apache.crunch.DoFn
Returns an estimate of how applying this function to a PCollection will cause it to change in size.
scaleFactor() - Method in class org.apache.crunch.FilterFn
 
scaleFactor() - Method in class org.apache.crunch.MapFn
 
second() - Method in class org.apache.crunch.Pair
 
second() - Method in class org.apache.crunch.Tuple3
 
second() - Method in class org.apache.crunch.Tuple4
 
SecondarySort - Class in org.apache.crunch.lib
Utilities for performing a secondary sort on a PTable<K, Pair<V1, V2>> collection.
SecondarySort() - Constructor for class org.apache.crunch.lib.SecondarySort
 
SecondarySortExample - Class in org.apache.crunch.examples
 
SecondarySortExample() - Constructor for class org.apache.crunch.examples.SecondarySortExample
 
sequenceFile(String, Class<T>) - Static method in class org.apache.crunch.io.At
Creates a SourceTarget<T> instance from the SequenceFile(s) at the given path name from the value field of each key-value pair in the SequenceFile(s).
sequenceFile(Path, Class<T>) - Static method in class org.apache.crunch.io.At
Creates a SourceTarget<T> instance from the SequenceFile(s) at the given Path from the value field of each key-value pair in the SequenceFile(s).
sequenceFile(String, PType<T>) - Static method in class org.apache.crunch.io.At
Creates a SourceTarget<T> instance from the SequenceFile(s) at the given path name from the value field of each key-value pair in the SequenceFile(s).
sequenceFile(Path, PType<T>) - Static method in class org.apache.crunch.io.At
Creates a SourceTarget<T> instance from the SequenceFile(s) at the given Path from the value field of each key-value pair in the SequenceFile(s).
sequenceFile(String, Class<K>, Class<V>) - Static method in class org.apache.crunch.io.At
Creates a TableSourceTarget<K, V> instance from the SequenceFile(s) at the given path name from the key-value pairs in the SequenceFile(s).
sequenceFile(Path, Class<K>, Class<V>) - Static method in class org.apache.crunch.io.At
Creates a TableSourceTarget<K, V> instance from the SequenceFile(s) at the given Path from the key-value pairs in the SequenceFile(s).
sequenceFile(String, PType<K>, PType<V>) - Static method in class org.apache.crunch.io.At
Creates a TableSourceTarget<K, V> instance from the SequenceFile(s) at the given path name from the key-value pairs in the SequenceFile(s).
sequenceFile(Path, PType<K>, PType<V>) - Static method in class org.apache.crunch.io.At
Creates a TableSourceTarget<K, V> instance from the SequenceFile(s) at the given Path from the key-value pairs in the SequenceFile(s).
sequenceFile(String, Class<T>) - Static method in class org.apache.crunch.io.From
Creates a Source<T> instance from the SequenceFile(s) at the given path name from the value field of each key-value pair in the SequenceFile(s).
sequenceFile(Path, Class<T>) - Static method in class org.apache.crunch.io.From
Creates a Source<T> instance from the SequenceFile(s) at the given Path from the value field of each key-value pair in the SequenceFile(s).
sequenceFile(String, PType<T>) - Static method in class org.apache.crunch.io.From
Creates a Source<T> instance from the SequenceFile(s) at the given path name from the value field of each key-value pair in the SequenceFile(s).
sequenceFile(Path, PType<T>) - Static method in class org.apache.crunch.io.From
Creates a Source<T> instance from the SequenceFile(s) at the given Path from the value field of each key-value pair in the SequenceFile(s).
sequenceFile(String, Class<K>, Class<V>) - Static method in class org.apache.crunch.io.From
Creates a TableSource<K, V> instance for the SequenceFile(s) at the given path name.
sequenceFile(Path, Class<K>, Class<V>) - Static method in class org.apache.crunch.io.From
Creates a TableSource<K, V> instance for the SequenceFile(s) at the given Path.
sequenceFile(String, PType<K>, PType<V>) - Static method in class org.apache.crunch.io.From
Creates a TableSource<K, V> instance for the SequenceFile(s) at the given path name.
sequenceFile(Path, PType<K>, PType<V>) - Static method in class org.apache.crunch.io.From
Creates a TableSource<K, V> instance for the SequenceFile(s) at the given Path.
sequenceFile(String) - Static method in class org.apache.crunch.io.To
Creates a Target at the given path name that writes data to SequenceFiles.
sequenceFile(Path) - Static method in class org.apache.crunch.io.To
Creates a Target at the given Path that writes data to SequenceFiles.
SequentialFileNamingScheme - Class in org.apache.crunch.io
Default FileNamingScheme that uses an incrementing sequence number in order to generate unique file names.
serialize() - Method in class org.apache.crunch.io.FormatBundle
 
set(String, String) - Method in class org.apache.crunch.io.FormatBundle
 
Set - Class in org.apache.crunch.lib
Utilities for performing set operations (difference, intersection, etc.) on PCollection instances.
Set() - Constructor for class org.apache.crunch.lib.Set
 
setConf(Configuration) - Method in class org.apache.crunch.io.FormatBundle
 
setConf(Configuration) - Method in class org.apache.crunch.lib.join.JoinUtils.AvroPairGroupingComparator
 
setConf(Configuration) - Method in class org.apache.crunch.lib.sort.ReverseAvroComparator
 
setConf(Configuration) - Method in class org.apache.crunch.lib.sort.ReverseWritableComparator
 
setConf(Configuration) - Method in class org.apache.crunch.lib.sort.TotalOrderPartitioner
 
setConf(Configuration) - Method in class org.apache.crunch.lib.sort.TupleWritableComparator
 
setConf(Configuration) - Method in class org.apache.crunch.util.CrunchTool
 
setConfiguration(Configuration) - Method in class org.apache.crunch.impl.mem.MemPipeline
 
setConfiguration(Configuration) - Method in class org.apache.crunch.impl.mr.MRPipeline
 
setConfiguration(Configuration) - Method in interface org.apache.crunch.Pipeline
Set the Configuration to use with this pipeline.
setContext(TaskInputOutputContext<?, ?, ?, ?>) - Method in class org.apache.crunch.DoFn
Called during setup to pass the TaskInputOutputContext to this DoFn instance.
setContext(TaskInputOutputContext<?, ?, ?, ?>) - Method in class org.apache.crunch.fn.CompositeMapFn
 
setContext(TaskInputOutputContext<?, ?, ?, ?>) - Method in class org.apache.crunch.fn.ExtractKeyFn
 
setContext(TaskInputOutputContext<?, ?, ?, ?>) - Method in class org.apache.crunch.fn.PairMapFn
 
setContext(TaskInputOutputContext<?, ?, ?, ?>) - Method in class org.apache.crunch.types.PGroupedTableType.PairIterableMapFn
 
setPartitionFile(Configuration, Path) - Static method in class org.apache.crunch.lib.sort.TotalOrderPartitioner
 
setWritten(int) - Method in class org.apache.crunch.types.writable.TupleWritable
Record that the tuple contains an element at the position provided.
Shard - Class in org.apache.crunch.lib
Utilities for controlling how the data in a PCollection is balanced across reducers and output files.
Shard() - Constructor for class org.apache.crunch.lib.Shard
 
shard(PCollection<T>, int) - Static method in class org.apache.crunch.lib.Shard
Creates a PCollection<T> that has the same contents as its input argument but will be written to a fixed number of output files.
ShardedJoinStrategy<K,U,V> - Class in org.apache.crunch.lib.join
JoinStrategy that splits the key space up into shards.
ShardedJoinStrategy(int) - Constructor for class org.apache.crunch.lib.join.ShardedJoinStrategy
Instantiate with a constant number of shards to use for all keys.
ShardedJoinStrategy(ShardedJoinStrategy.ShardingStrategy<K>) - Constructor for class org.apache.crunch.lib.join.ShardedJoinStrategy
Instantiate with a custom sharding strategy.
ShardedJoinStrategy.ShardingStrategy<K> - Interface in org.apache.crunch.lib.join
Determines over how many shards a key will be split in a sharded join.
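One common way to shard a join is sketched below in plain Java: each left-side record is sent to one randomly chosen shard of its key, while each right-side record is replicated to every shard, so the work for a hot key is spread over several groups. This is an assumption about the general technique, not a description of how ShardedJoinStrategy is implemented; ShardedJoinSketch and its parameters are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Random;

/** Plain-Java sketch of a sharded inner join; hypothetical, not Crunch code. */
final class ShardedJoinSketch {
    /** Emits [key, leftValue, rightValue] for every matching pair. */
    static <K, U, V> List<List<Object>> join(List<Map.Entry<K, U>> left,
                                             List<Map.Entry<K, V>> right,
                                             int numShards, long seed) {
        Random rng = new Random(seed);
        // Left side: each record lands in exactly one (key, shard) group.
        Map<List<Object>, List<U>> leftByShard = new HashMap<>();
        for (Map.Entry<K, U> entry : left) {
            List<Object> shardKey = Arrays.asList(entry.getKey(), rng.nextInt(numShards));
            leftByShard.computeIfAbsent(shardKey, k -> new ArrayList<>()).add(entry.getValue());
        }
        // Right side: each record is replicated to every shard of its key.
        List<List<Object>> out = new ArrayList<>();
        for (Map.Entry<K, V> entry : right) {
            for (int shard = 0; shard < numShards; shard++) {
                List<U> matches = leftByShard.get(Arrays.asList(entry.getKey(), shard));
                if (matches == null) {
                    continue;
                }
                for (U leftValue : matches) {
                    out.add(Arrays.asList(entry.getKey(), leftValue, entry.getValue()));
                }
            }
        }
        return out;
    }
}
```

The joined result is the same as for an unsharded inner join; only the grouping of intermediate work changes, at the cost of replicating one side.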
SingleUseIterable<T> - Class in org.apache.crunch.impl
Wrapper around a Reducer's input Iterable.
SingleUseIterable(Iterable<T>) - Constructor for class org.apache.crunch.impl.SingleUseIterable
Instantiate around an Iterable that may only be used once.
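The single-use constraint can be made explicit with a small wrapper. The following plain-Java sketch uses a hypothetical class, OnceIterable, and assumes that a second call to iterator() should fail fast with an IllegalStateException; the actual behavior of SingleUseIterable is not stated in this index.

```java
import java.util.Iterator;
import java.util.List;

/** Plain-Java sketch of an Iterable that may be traversed only once; hypothetical. */
final class OnceIterable<T> implements Iterable<T> {
    private final Iterable<T> wrapped;
    private boolean used = false;

    OnceIterable(Iterable<T> wrapped) {
        this.wrapped = wrapped;
    }

    @Override
    public Iterator<T> iterator() {
        if (used) {
            // Assumption of this sketch: fail loudly instead of silently returning nothing.
            throw new IllegalStateException("iterator() may only be called once");
        }
        used = true;
        return wrapped.iterator();
    }

    public static void main(String[] args) {
        OnceIterable<String> values = new OnceIterable<>(List.of("a", "b"));
        for (String value : values) {
            System.out.println(value);
        }
    }
}
```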
size() - Method in class org.apache.crunch.Pair
 
size() - Method in interface org.apache.crunch.Tuple
Returns the number of elements in this Tuple.
size() - Method in class org.apache.crunch.Tuple3
 
size() - Method in class org.apache.crunch.Tuple4
 
size() - Method in class org.apache.crunch.TupleN
 
size() - Method in class org.apache.crunch.types.writable.TupleWritable
The number of children in this Tuple.
skip(String) - Method in class org.apache.crunch.contrib.text.TokenizerFactory.Builder
Sets the regular expression that determines which input characters should be ignored by the Scanner that is returned by the constructed TokenizerFactory.
Sort - Class in org.apache.crunch.lib
Utilities for sorting PCollection instances.
Sort() - Constructor for class org.apache.crunch.lib.Sort
 
sort(PCollection<T>) - Static method in class org.apache.crunch.lib.Sort
Sorts the PCollection using the natural ordering of its elements in ascending order.
sort(PCollection<T>, Sort.Order) - Static method in class org.apache.crunch.lib.Sort
Sorts the PCollection using the natural order of its elements with the given Order.
sort(PCollection<T>, int, Sort.Order) - Static method in class org.apache.crunch.lib.Sort
Sorts the PCollection using the natural ordering of its elements in the order specified using the given number of reducers.
sort(PTable<K, V>) - Static method in class org.apache.crunch.lib.Sort
Sorts the PTable using the natural ordering of its keys in ascending order.
sort(PTable<K, V>, Sort.Order) - Static method in class org.apache.crunch.lib.Sort
Sorts the PTable using the natural ordering of its keys with the given Order.
sort(PTable<K, V>, int, Sort.Order) - Static method in class org.apache.crunch.lib.Sort
Sorts the PTable using the natural ordering of its keys in the order specified with a client-specified number of reducers.
Sort.ColumnOrder - Class in org.apache.crunch.lib
To sort by column 2 ascending then column 1 descending, you would use: sortPairs(coll, by(2, ASCENDING), by(1, DESCENDING)). Column numbering is 1-based.
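The ordering in that example (column 2 ascending, then column 1 descending) can be expressed with standard Java comparators. The sketch below models each pair as a Map.Entry, with the key as column 1 and the value as column 2; ColumnOrderSketch is a hypothetical class and does not use the Crunch Sort API.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;

/** Plain-Java sketch of a two-column ordering; hypothetical, not Crunch code. */
final class ColumnOrderSketch {
    /** Sorts by column 2 (entry value) ascending, then column 1 (entry key) descending. */
    static <U extends Comparable<U>, V extends Comparable<V>> List<Map.Entry<U, V>>
            sortByCol2AscThenCol1Desc(List<Map.Entry<U, V>> rows) {
        List<Map.Entry<U, V>> sorted = new ArrayList<>(rows);   // leave the input untouched
        Comparator<Map.Entry<U, V>> byCol2Asc = Map.Entry.comparingByValue();
        Comparator<Map.Entry<U, V>> byCol1Desc = Map.Entry.<U, V>comparingByKey().reversed();
        sorted.sort(byCol2Asc.thenComparing(byCol1Desc));
        return sorted;
    }

    public static void main(String[] args) {
        System.out.println(sortByCol2AscThenCol1Desc(
                List.of(Map.entry("a", 2), Map.entry("b", 1), Map.entry("c", 1))));
    }
}
```

The second comparator only breaks ties left by the first, which is the usual meaning of listing several column orders in sequence.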
Sort.ColumnOrder(int, Sort.Order) - Constructor for class org.apache.crunch.lib.Sort.ColumnOrder
 
Sort.Order - Enum in org.apache.crunch.lib
For signaling the order in which a sort should be done.
sortAndApply(PTable<K, Pair<V1, V2>>, DoFn<Pair<K, Iterable<Pair<V1, V2>>>, T>, PType<T>) - Static method in class org.apache.crunch.lib.SecondarySort
Perform a secondary sort on the given PTable instance and then apply a DoFn to the resulting sorted data to yield an output PCollection<T>.
sortAndApply(PTable<K, Pair<V1, V2>>, DoFn<Pair<K, Iterable<Pair<V1, V2>>>, T>, PType<T>, int) - Static method in class org.apache.crunch.lib.SecondarySort
Perform a secondary sort on the given PTable instance and then apply a DoFn to the resulting sorted data to yield an output PCollection<T>, using the given number of reducers.
sortAndApply(PTable<K, Pair<V1, V2>>, DoFn<Pair<K, Iterable<Pair<V1, V2>>>, Pair<U, V>>, PTableType<U, V>) - Static method in class org.apache.crunch.lib.SecondarySort
Perform a secondary sort on the given PTable instance and then apply a DoFn to the resulting sorted data to yield an output PTable<U, V>.
sortAndApply(PTable<K, Pair<V1, V2>>, DoFn<Pair<K, Iterable<Pair<V1, V2>>>, Pair<U, V>>, PTableType<U, V>, int) - Static method in class org.apache.crunch.lib.SecondarySort
Perform a secondary sort on the given PTable instance and then apply a DoFn to the resulting sorted data to yield an output PTable<U, V>, using the given number of reducers.
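The effect of a secondary sort can be sketched in plain Java: group rows by key, then order each group's pairs before handing them to a function. This sketch assumes the pairs are ordered by their first element, which this index does not state explicitly; SecondarySortSketch is a hypothetical class that works on in-memory lists rather than PTables.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Plain-Java sketch of group-then-sort-values; hypothetical, not Crunch code. */
final class SecondarySortSketch {
    /** Groups (key, (v1, v2)) rows by key and sorts each group by v1 (assumed ordering). */
    static <K, V1 extends Comparable<V1>, V2> Map<K, List<Map.Entry<V1, V2>>> groupAndSort(
            List<Map.Entry<K, Map.Entry<V1, V2>>> table) {
        Map<K, List<Map.Entry<V1, V2>>> grouped = new LinkedHashMap<>();
        for (Map.Entry<K, Map.Entry<V1, V2>> row : table) {
            grouped.computeIfAbsent(row.getKey(), k -> new ArrayList<>()).add(row.getValue());
        }
        for (List<Map.Entry<V1, V2>> values : grouped.values()) {
            values.sort(Map.Entry.comparingByKey());    // order pairs by their first element
        }
        return grouped;
    }
}
```

A caller would then iterate over each group's sorted list, which corresponds to the Iterable of pairs that the DoFn in the signatures above receives.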
sortComparatorClass(Class<? extends RawComparator>) - Method in class org.apache.crunch.GroupingOptions.Builder
 
SortExample - Class in org.apache.crunch.examples
Simple Crunch tool for running sorting examples from the command line.
SortExample() - Constructor for class org.apache.crunch.examples.SortExample
 
SortFns - Class in org.apache.crunch.lib.sort
A set of DoFns that are used by Crunch's Sort library.
SortFns() - Constructor for class org.apache.crunch.lib.sort.SortFns
 
SortFns.AvroGenericFn<V extends Tuple> - Class in org.apache.crunch.lib.sort
Pulls a composite set of keys from an Avro GenericRecord instance.
SortFns.AvroGenericFn(int[], Schema) - Constructor for class org.apache.crunch.lib.sort.SortFns.AvroGenericFn
 
SortFns.KeyExtraction<V extends Tuple> - Class in org.apache.crunch.lib.sort
Utility class for encapsulating key extraction logic and serialization information about key extraction.
SortFns.KeyExtraction(PType<V>, Sort.ColumnOrder[]) - Constructor for class org.apache.crunch.lib.sort.SortFns.KeyExtraction
 
SortFns.SingleKeyFn<V extends Tuple,K> - Class in org.apache.crunch.lib.sort
Extracts a single indexed key from a Tuple instance.
SortFns.SingleKeyFn(int) - Constructor for class org.apache.crunch.lib.sort.SortFns.SingleKeyFn
 
SortFns.TupleKeyFn<V extends Tuple,K extends Tuple> - Class in org.apache.crunch.lib.sort
Extracts a composite key from a Tuple instance.
SortFns.TupleKeyFn(int[], TupleFactory) - Constructor for class org.apache.crunch.lib.sort.SortFns.TupleKeyFn
 
sortPairs(PCollection<Pair<U, V>>, Sort.ColumnOrder...) - Static method in class org.apache.crunch.lib.Sort
Sorts the PCollection of Pairs using the specified column ordering.
sortQuads(PCollection<Tuple4<V1, V2, V3, V4>>, Sort.ColumnOrder...) - Static method in class org.apache.crunch.lib.Sort
Sorts the PCollection of Tuple4s using the specified column ordering.
sortTriples(PCollection<Tuple3<V1, V2, V3>>, Sort.ColumnOrder...) - Static method in class org.apache.crunch.lib.Sort
Sorts the PCollection of Tuple3s using the specified column ordering.
sortTuples(PCollection<T>, Sort.ColumnOrder...) - Static method in class org.apache.crunch.lib.Sort
Sorts the PCollection of tuples using the specified column ordering.
sortTuples(PCollection<T>, int, Sort.ColumnOrder...) - Static method in class org.apache.crunch.lib.Sort
Sorts the PCollection of TupleNs using the specified column ordering and a client-specified number of reducers.
Source<T> - Interface in org.apache.crunch
A Source represents an input data set that is an input to one or more MapReduce jobs.
sources(Source<?>...) - Method in class org.apache.crunch.ParallelDoOptions.Builder
 
sources(Collection<Source<?>>) - Method in class org.apache.crunch.ParallelDoOptions.Builder
 
sourceTarget(SourceTarget<?>) - Method in class org.apache.crunch.GroupingOptions.Builder
Deprecated. 
SourceTarget<T> - Interface in org.apache.crunch
An interface for classes that implement both the Source and the Target interfaces.
SourceTargetHelper - Class in org.apache.crunch.io
Functions for configuring the inputs/outputs of MapReduce jobs.
SourceTargetHelper() - Constructor for class org.apache.crunch.io.SourceTargetHelper
 
sourceTargets(SourceTarget<?>...) - Method in class org.apache.crunch.GroupingOptions.Builder
 
sourceTargets(Collection<SourceTarget<?>>) - Method in class org.apache.crunch.GroupingOptions.Builder
 
sourceTargets(SourceTarget<?>...) - Method in class org.apache.crunch.ParallelDoOptions.Builder
 
sourceTargets(Collection<SourceTarget<?>>) - Method in class org.apache.crunch.ParallelDoOptions.Builder
 
specifics(Class<T>) - Static method in class org.apache.crunch.types.avro.Avros
 
split(PCollection<Pair<T, U>>) - Static method in class org.apache.crunch.lib.Channels
Splits a PCollection of any Pair of objects into a Pair of PCollections, to allow for the output of a DoFn to be handled using separate channels.
split(PCollection<Pair<T, U>>, PType<T>, PType<U>) - Static method in class org.apache.crunch.lib.Channels
Splits a PCollection of any Pair of objects into a Pair of PCollections, to allow for the output of a DoFn to be handled using separate channels.
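The channel-splitting idea can be illustrated on in-memory lists. In this plain-Java sketch, ChannelsSketch is a hypothetical class, pairs are modeled as Map.Entry values, and a null element is assumed to mean "nothing for this channel"; that null convention is an assumption of the sketch, not something this index confirms.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Plain-Java sketch of splitting pairs into two channels; hypothetical, not Crunch code. */
final class ChannelsSketch {
    /** Returns (firsts, seconds); null elements are skipped (assumed convention). */
    static <T, U> Map.Entry<List<T>, List<U>> split(List<Map.Entry<T, U>> pairs) {
        List<T> firsts = new ArrayList<>();
        List<U> seconds = new ArrayList<>();
        for (Map.Entry<T, U> pair : pairs) {
            if (pair.getKey() != null) {
                firsts.add(pair.getKey());      // first channel
            }
            if (pair.getValue() != null) {
                seconds.add(pair.getValue());   // second channel
            }
        }
        return new AbstractMap.SimpleImmutableEntry<>(firsts, seconds);
    }
}
```

A function can therefore emit a record on one channel only, for example valid rows on the first and error codes on the second.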
status - Variable in class org.apache.crunch.PipelineResult
 
STRING_CONCAT(String, boolean) - Static method in class org.apache.crunch.fn.Aggregators
Concatenate strings, with a separator between strings.
STRING_CONCAT(String, boolean, long, long) - Static method in class org.apache.crunch.fn.Aggregators
Concatenate strings, with a separator between strings.
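Separator-based concatenation can be sketched with the JDK's StringJoiner. In this plain-Java illustration, ConcatSketch is a hypothetical class, and the boolean parameter is assumed to control whether null inputs are skipped; when it is false, this sketch renders a null as the text "null", which is a choice made here and not documented behavior of STRING_CONCAT.

```java
import java.util.Arrays;
import java.util.StringJoiner;

/** Plain-Java sketch of separator-based concatenation; hypothetical, not Crunch code. */
final class ConcatSketch {
    /** Joins values with the separator; skipNull is an assumed meaning for the flag. */
    static String concat(Iterable<String> values, String separator, boolean skipNull) {
        StringJoiner joiner = new StringJoiner(separator);
        for (String value : values) {
            if (value == null && skipNull) {
                continue;                       // drop nulls entirely
            }
            joiner.add(String.valueOf(value));  // a kept null becomes the text "null"
        }
        return joiner.toString();
    }

    public static void main(String[] args) {
        System.out.println(concat(Arrays.asList("a", null, "b"), ",", true));
    }
}
```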
STRING_TO_UTF8 - Static variable in class org.apache.crunch.types.avro.Avros
 
strings() - Static method in class org.apache.crunch.types.avro.Avros
 
strings() - Method in class org.apache.crunch.types.avro.AvroTypeFamily
 
strings() - Method in interface org.apache.crunch.types.PTypeFamily
 
strings() - Static method in class org.apache.crunch.types.writable.Writables
 
strings() - Method in class org.apache.crunch.types.writable.WritableTypeFamily
 
succeeded() - Method in class org.apache.crunch.PipelineResult
 
SUM_BIGINTS() - Static method in class org.apache.crunch.fn.Aggregators
Sum up all BigInteger values.
SUM_DOUBLES() - Static method in class org.apache.crunch.fn.Aggregators
Sum up all double values.
SUM_FLOATS() - Static method in class org.apache.crunch.fn.Aggregators
Sum up all float values.
SUM_INTS() - Static method in class org.apache.crunch.fn.Aggregators
Sum up all int values.
SUM_LONGS() - Static method in class org.apache.crunch.fn.Aggregators
Sum up all long values.

T

tableOf(S, T, Object...) - Static method in class org.apache.crunch.impl.mem.MemPipeline
 
tableOf(Iterable<Pair<S, T>>) - Static method in class org.apache.crunch.impl.mem.MemPipeline
 
tableOf(PType<K>, PType<V>) - Static method in class org.apache.crunch.types.avro.Avros
 
tableOf(PType<K>, PType<V>) - Method in class org.apache.crunch.types.avro.AvroTypeFamily
 
tableOf(PType<K>, PType<V>) - Method in interface org.apache.crunch.types.PTypeFamily
 
tableOf(PType<K>, PType<V>) - Static method in class org.apache.crunch.types.writable.Writables
 
tableOf(PType<K>, PType<V>) - Method in class org.apache.crunch.types.writable.WritableTypeFamily
 
TableSource<K,V> - Interface in org.apache.crunch
The interface for Source implementations that return a PTable.
TableSourceTarget<K,V> - Interface in org.apache.crunch
An interface for classes that implement both the TableSource and the Target interfaces.
Target - Interface in org.apache.crunch
A Target represents the output destination of a Crunch PCollection in the context of a Crunch job.
Target.WriteMode - Enum in org.apache.crunch
An enum to represent different options the client may specify for handling the case where the output path, table, etc. already exists.
tempDir - Variable in class org.apache.crunch.test.CrunchTestSupport
 
TemporaryPath - Class in org.apache.crunch.test
Creates a temporary directory for a test case and destroys it afterwards.
TemporaryPath(String...) - Constructor for class org.apache.crunch.test.TemporaryPath
Construct TemporaryPath.
TestCounters - Class in org.apache.crunch.test
A utility class used during unit testing to update and read counters.
TestCounters() - Constructor for class org.apache.crunch.test.TestCounters
 
textFile(String) - Static method in class org.apache.crunch.io.At
Creates a SourceTarget<String> instance for the text file(s) at the given path name.
textFile(Path) - Static method in class org.apache.crunch.io.At
Creates a SourceTarget<String> instance for the text file(s) at the given Path.
textFile(String, PType<T>) - Static method in class org.apache.crunch.io.At
Creates a SourceTarget<T> instance for the text file(s) at the given path name using the provided PType<T> to convert the input text.
textFile(Path, PType<T>) - Static method in class org.apache.crunch.io.At
Creates a SourceTarget<T> instance for the text file(s) at the given Path using the provided PType<T> to convert the input text.
textFile(String) - Static method in class org.apache.crunch.io.From
Creates a Source<String> instance for the text file(s) at the given path name.
textFile(Path) - Static method in class org.apache.crunch.io.From
Creates a Source<String> instance for the text file(s) at the given Path.
textFile(String, PType<T>) - Static method in class org.apache.crunch.io.From
Creates a Source<T> instance for the text file(s) at the given path name using the provided PType<T> to convert the input text.
textFile(Path, PType<T>) - Static method in class org.apache.crunch.io.From
Creates a Source<T> instance for the text file(s) at the given Path using the provided PType<T> to convert the input text.
textFile(String) - Static method in class org.apache.crunch.io.To
Creates a Target at the given path name that writes data to text files.
textFile(Path) - Static method in class org.apache.crunch.io.To
Creates a Target at the given Path that writes data to text files.
third() - Method in class org.apache.crunch.Tuple3
 
third() - Method in class org.apache.crunch.Tuple4
 
thrifts(Class<T>, PTypeFamily) - Static method in class org.apache.crunch.types.PTypes
 
To - Class in org.apache.crunch.io
Static factory methods for creating common Target types.
To() - Constructor for class org.apache.crunch.io.To
 
toCombineFn(Aggregator<V>) - Static method in class org.apache.crunch.fn.Aggregators
Wrap a CombineFn adapter around the given aggregator.
Tokenizer - Class in org.apache.crunch.contrib.text
Manages a Scanner instance and provides support for returning only a subset of the fields returned by the underlying Scanner.
Tokenizer(Scanner, Set<Integer>, boolean) - Constructor for class org.apache.crunch.contrib.text.Tokenizer
Create a new Tokenizer instance.
TokenizerFactory - Class in org.apache.crunch.contrib.text
Factory class that constructs Tokenizer instances for input strings that use a fixed set of delimiters, skip patterns, locales, and sets of indices to keep or drop.
TokenizerFactory.Builder - Class in org.apache.crunch.contrib.text
A class for constructing new TokenizerFactory instances using the Builder pattern.
TokenizerFactory.Builder() - Constructor for class org.apache.crunch.contrib.text.TokenizerFactory.Builder
 
top(PTable<K, V>, int, boolean) - Static method in class org.apache.crunch.lib.Aggregate
 
top(int) - Method in interface org.apache.crunch.PTable
Returns a PTable made up of the given number of pairs from this PTable that have the largest values.
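As an illustration of what "the pairs with the largest value" means, the following is a minimal, library-free Java sketch that selects the n entries with the largest values from an in-memory list using a bounded min-heap. It does not call Crunch and is not Crunch's implementation; the names TopByValueSketch and top are hypothetical and exist only for this example.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.PriorityQueue;

// Hypothetical helper, not part of Crunch: an in-memory analogue of "top n by value".
class TopByValueSketch {

  // Returns the n entries with the largest values, ordered from largest to smallest.
  static <K, V extends Comparable<V>> List<Map.Entry<K, V>> top(
      List<Map.Entry<K, V>> pairs, int n) {
    Comparator<Map.Entry<K, V>> byValue = (a, b) -> a.getValue().compareTo(b.getValue());
    List<Map.Entry<K, V>> result = new ArrayList<>();
    if (n <= 0) {
      return result;
    }
    // Min-heap holding at most n entries: the smallest retained value is at the head.
    PriorityQueue<Map.Entry<K, V>> heap = new PriorityQueue<>(byValue);
    for (Map.Entry<K, V> pair : pairs) {
      heap.add(pair);
      if (heap.size() > n) {
        heap.poll(); // discard the smallest entry seen so far
      }
    }
    result.addAll(heap);
    result.sort(byValue.reversed());
    return result;
  }

  public static void main(String[] args) {
    List<Map.Entry<String, Long>> counts = new ArrayList<>();
    counts.add(new SimpleEntry<>("a", 3L));
    counts.add(new SimpleEntry<>("b", 10L));
    counts.add(new SimpleEntry<>("c", 7L));
    System.out.println(top(counts, 2));
  }
}
```

The bounded heap keeps memory proportional to n rather than to the input size, which is the usual reason to prefer it over sorting the whole input.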
toString() - Method in class org.apache.crunch.lib.Sort.ColumnOrder
 
toString() - Method in class org.apache.crunch.Pair
 
toString() - Method in class org.apache.crunch.Tuple3
 
toString() - Method in class org.apache.crunch.Tuple4
 
toString() - Method in class org.apache.crunch.TupleN
 
toString() - Method in class org.apache.crunch.types.writable.TupleWritable
Converts this tuple to its String representation.
TotalBytesByIP - Class in org.apache.crunch.examples
 
TotalBytesByIP() - Constructor for class org.apache.crunch.examples.TotalBytesByIP
 
TotalOrderPartitioner<K,V> - Class in org.apache.crunch.lib.sort
A partition-aware Partitioner instance that can work with either Avro or Writable-formatted keys.
TotalOrderPartitioner() - Constructor for class org.apache.crunch.lib.sort.TotalOrderPartitioner
 
tripAggregator(Aggregator<V1>, Aggregator<V2>, Aggregator<V3>) - Static method in class org.apache.crunch.fn.Aggregators
Apply separate aggregators to each component of a Tuple3.
triples(PType<V1>, PType<V2>, PType<V3>) - Static method in class org.apache.crunch.types.avro.Avros
 
triples(PType<V1>, PType<V2>, PType<V3>) - Method in class org.apache.crunch.types.avro.AvroTypeFamily
 
triples(PType<V1>, PType<V2>, PType<V3>) - Method in interface org.apache.crunch.types.PTypeFamily
 
triples(PType<V1>, PType<V2>, PType<V3>) - Static method in class org.apache.crunch.types.writable.Writables
 
triples(PType<V1>, PType<V2>, PType<V3>) - Method in class org.apache.crunch.types.writable.WritableTypeFamily
 
Tuple - Interface in org.apache.crunch
A fixed-size collection of Objects, used in Crunch for representing joins between PCollections.
Tuple3<V1,V2,V3> - Class in org.apache.crunch
A convenience class for three-element Tuples.
Tuple3(V1, V2, V3) - Constructor for class org.apache.crunch.Tuple3
 
TUPLE3 - Static variable in class org.apache.crunch.types.TupleFactory
 
Tuple3.Collect<V1,V2,V3> - Class in org.apache.crunch
 
Tuple3.Collect(Collection<V1>, Collection<V2>, Collection<V3>) - Constructor for class org.apache.crunch.Tuple3.Collect
 
Tuple4<V1,V2,V3,V4> - Class in org.apache.crunch
A convenience class for four-element Tuples.
Tuple4(V1, V2, V3, V4) - Constructor for class org.apache.crunch.Tuple4
 
TUPLE4 - Static variable in class org.apache.crunch.types.TupleFactory
 
Tuple4.Collect<V1,V2,V3,V4> - Class in org.apache.crunch
 
Tuple4.Collect(Collection<V1>, Collection<V2>, Collection<V3>, Collection<V4>) - Constructor for class org.apache.crunch.Tuple4.Collect
 
tupleAggregator(Aggregator<?>...) - Static method in class org.apache.crunch.fn.Aggregators
Apply separate aggregators to each component of a Tuple.
TupleDeepCopier<T extends Tuple> - Class in org.apache.crunch.types
Performs deep copies (based on underlying PType deep copying) of Tuple-based objects.
TupleDeepCopier(Class<T>, PType...) - Constructor for class org.apache.crunch.types.TupleDeepCopier
 
TupleFactory<T extends Tuple> - Class in org.apache.crunch.types
 
TupleFactory() - Constructor for class org.apache.crunch.types.TupleFactory
 
TupleN - Class in org.apache.crunch
A Tuple instance for an arbitrary number of values.
TupleN(Object...) - Constructor for class org.apache.crunch.TupleN
 
TUPLEN - Static variable in class org.apache.crunch.types.TupleFactory
 
tuples(PType...) - Static method in class org.apache.crunch.types.avro.Avros
 
tuples(Class<T>, PType...) - Static method in class org.apache.crunch.types.avro.Avros
 
tuples(PType<?>...) - Method in class org.apache.crunch.types.avro.AvroTypeFamily
 
tuples(Class<T>, PType<?>...) - Method in class org.apache.crunch.types.avro.AvroTypeFamily
 
tuples(PType<?>...) - Method in interface org.apache.crunch.types.PTypeFamily
 
tuples(Class<T>, PType<?>...) - Method in interface org.apache.crunch.types.PTypeFamily
 
tuples(PType...) - Static method in class org.apache.crunch.types.writable.Writables
 
tuples(Class<T>, PType...) - Static method in class org.apache.crunch.types.writable.Writables
 
tuples(PType<?>...) - Method in class org.apache.crunch.types.writable.WritableTypeFamily
 
tuples(Class<T>, PType<?>...) - Method in class org.apache.crunch.types.writable.WritableTypeFamily
 
Tuples - Class in org.apache.crunch.util
Utilities for working with subclasses of the Tuple interface.
Tuples() - Constructor for class org.apache.crunch.util.Tuples
 
Tuples.PairIterable<S,T> - Class in org.apache.crunch.util
 
Tuples.PairIterable(Iterable<S>, Iterable<T>) - Constructor for class org.apache.crunch.util.Tuples.PairIterable
 
Tuples.QuadIterable<A,B,C,D> - Class in org.apache.crunch.util
 
Tuples.QuadIterable(Iterable<A>, Iterable<B>, Iterable<C>, Iterable<D>) - Constructor for class org.apache.crunch.util.Tuples.QuadIterable
 
Tuples.TripIterable<A,B,C> - Class in org.apache.crunch.util
 
Tuples.TripIterable(Iterable<A>, Iterable<B>, Iterable<C>) - Constructor for class org.apache.crunch.util.Tuples.TripIterable
 
Tuples.TupleNIterable - Class in org.apache.crunch.util
 
Tuples.TupleNIterable(Iterable<?>...) - Constructor for class org.apache.crunch.util.Tuples.TupleNIterable
 
TupleWritable - Class in org.apache.crunch.types.writable
A straight copy of the TupleWritable implementation in the join package, added here because of its package visibility restrictions.
TupleWritable() - Constructor for class org.apache.crunch.types.writable.TupleWritable
Create an empty tuple with no allocated storage for writables.
TupleWritable(Writable[]) - Constructor for class org.apache.crunch.types.writable.TupleWritable
Initializes the tuple with the given storage; it is unknown whether any of the supplied Writables contain "written" values.
TupleWritableComparator - Class in org.apache.crunch.lib.sort
 
TupleWritableComparator() - Constructor for class org.apache.crunch.lib.sort.TupleWritableComparator
 
typedCollectionOf(PType<T>, T...) - Static method in class org.apache.crunch.impl.mem.MemPipeline
 
typedCollectionOf(PType<T>, Iterable<T>) - Static method in class org.apache.crunch.impl.mem.MemPipeline
 
typedTableOf(PTableType<S, T>, S, T, Object...) - Static method in class org.apache.crunch.impl.mem.MemPipeline
 
typedTableOf(PTableType<S, T>, Iterable<Pair<S, T>>) - Static method in class org.apache.crunch.impl.mem.MemPipeline
 

U

ungroup() - Method in interface org.apache.crunch.PGroupedTable
Convert this grouping back into a multimap.
union(PCollection<S>) - Method in interface org.apache.crunch.PCollection
Returns a PCollection instance that acts as the union of this PCollection and the given PCollection.
union(PCollection<S>...) - Method in interface org.apache.crunch.PCollection
Returns a PCollection instance that acts as the union of this PCollection and the input PCollections.
union(PTable<K, V>) - Method in interface org.apache.crunch.PTable
Returns a PTable instance that acts as the union of this PTable and the given PTable.
union(PTable<K, V>...) - Method in interface org.apache.crunch.PTable
Returns a PTable instance that acts as the union of this PTable and the input PTables.
UNIQUE_ELEMENTS() - Static method in class org.apache.crunch.fn.Aggregators
Collect the unique elements of the input, as defined by the equals method for the input objects.
update(T) - Method in interface org.apache.crunch.Aggregator
Incorporate the given value into the aggregate state maintained by this instance.
UTF8_TO_STRING - Static variable in class org.apache.crunch.types.avro.Avros
 
uuid(PTypeFamily) - Static method in class org.apache.crunch.types.PTypes
 

V

valueOf(String) - Static method in enum org.apache.crunch.impl.mr.MRJob.State
Returns the enum constant of this type with the specified name.
valueOf(String) - Static method in enum org.apache.crunch.lib.join.JoinType
Returns the enum constant of this type with the specified name.
valueOf(String) - Static method in enum org.apache.crunch.lib.Sort.Order
Returns the enum constant of this type with the specified name.
valueOf(String) - Static method in enum org.apache.crunch.PipelineExecution.Status
Returns the enum constant of this type with the specified name.
valueOf(String) - Static method in enum org.apache.crunch.Target.WriteMode
Returns the enum constant of this type with the specified name.
values() - Static method in enum org.apache.crunch.impl.mr.MRJob.State
Returns an array containing the constants of this enum type, in the order they are declared.
values() - Static method in enum org.apache.crunch.lib.join.JoinType
Returns an array containing the constants of this enum type, in the order they are declared.
values(PTable<K, V>) - Static method in class org.apache.crunch.lib.PTables
Extract the values from the given PTable<K, V> as a PCollection<V>.
values() - Static method in enum org.apache.crunch.lib.Sort.Order
Returns an array containing the constants of this enum type, in the order they are declared.
values() - Static method in enum org.apache.crunch.PipelineExecution.Status
Returns an array containing the constants of this enum type, in the order they are declared.
values() - Method in interface org.apache.crunch.PTable
Returns a PCollection made up of the values in this PTable.
values() - Static method in enum org.apache.crunch.Target.WriteMode
Returns an array containing the constants of this enum type, in the order they are declared.

W

waitFor(long, TimeUnit) - Method in interface org.apache.crunch.PipelineExecution
Blocks until the pipeline completes or the specified waiting time has elapsed.
waitUntilDone() - Method in interface org.apache.crunch.PipelineExecution
Blocks until the pipeline completes.
wasLogged() - Method in exception org.apache.crunch.CrunchRuntimeException
Returns true if this exception was written to the debug logs.
weightedReservoirSample(PCollection<Pair<T, N>>, int) - Static method in class org.apache.crunch.lib.Sample
Selects a weighted sample of the elements of the given PCollection, where the second term in the input Pair is a numerical weight.
weightedReservoirSample(PCollection<Pair<T, N>>, int, Long) - Static method in class org.apache.crunch.lib.Sample
The weighted reservoir sampling function with the seed term exposed for testing purposes.
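One well-known way to draw a fixed-size weighted sample in a single pass is the Efraimidis-Spirakis scheme: assign each element the key u^(1/w) for a uniform random u and its weight w, then keep the k elements with the largest keys. The sketch below is library-free Java under that assumption. This index does not state which scheme Crunch uses, so treat the code as an illustration of the technique only; the names WeightedReservoirSketch and sample are hypothetical. The seed parameter mirrors the idea of exposing the seed for testing.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.PriorityQueue;
import java.util.Random;

// Hypothetical helper, not part of Crunch: single-pass weighted sampling without replacement.
class WeightedReservoirSketch {

  // An element together with the random key that decides whether it is kept.
  private static final class Keyed<T> {
    final T item;
    final double key;

    Keyed(T item, double key) {
      this.item = item;
      this.key = key;
    }
  }

  // Returns up to k items; items with larger weights are more likely to be kept.
  static <T> List<T> sample(List<T> items, List<Double> weights, int k, long seed) {
    if (items.size() != weights.size()) {
      throw new IllegalArgumentException("items and weights must have the same size");
    }
    Random rnd = new Random(seed);
    // Min-heap on the key, so the weakest retained candidate is evicted first.
    PriorityQueue<Keyed<T>> heap =
        new PriorityQueue<>((a, b) -> Double.compare(a.key, b.key));
    for (int i = 0; i < items.size(); i++) {
      double w = weights.get(i);
      if (w <= 0.0) {
        continue; // in this sketch, non-positive weights are never selected
      }
      // log(u) / w orders elements the same way as u^(1/w) and avoids underflow.
      double key = Math.log(rnd.nextDouble()) / w;
      heap.add(new Keyed<>(items.get(i), key));
      if (heap.size() > k) {
        heap.poll();
      }
    }
    List<T> out = new ArrayList<>();
    for (Keyed<T> e : heap) {
      out.add(e.item);
    }
    return out;
  }

  public static void main(String[] args) {
    List<String> items = Arrays.asList("a", "b", "c", "d");
    List<Double> weights = Arrays.asList(1.0, 0.0, 5.0, 2.0);
    System.out.println(sample(items, weights, 2, 42L));
  }
}
```

Passing the same seed twice yields the same sample, which is what makes a seeded overload convenient in unit tests.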
WordAggregationHBase - Class in org.apache.crunch.examples
You need to have an HBase instance running.
WordAggregationHBase() - Constructor for class org.apache.crunch.examples.WordAggregationHBase
 
WordCount - Class in org.apache.crunch.examples
 
WordCount() - Constructor for class org.apache.crunch.examples.WordCount
 
WritableDeepCopier<T extends org.apache.hadoop.io.Writable> - Class in org.apache.crunch.types.writable
Performs deep copies of Writable values.
WritableDeepCopier(Class<T>) - Constructor for class org.apache.crunch.types.writable.WritableDeepCopier
 
writables(Class<T>) - Static method in class org.apache.crunch.types.avro.Avros
 
Writables - Class in org.apache.crunch.types.writable
Defines static methods that are analogous to the methods defined in WritableTypeFamily for convenient static importing.
writables(Class<W>) - Static method in class org.apache.crunch.types.writable.Writables
 
writables(Class<W>) - Method in class org.apache.crunch.types.writable.WritableTypeFamily
 
WritableType<T,W extends org.apache.hadoop.io.Writable> - Class in org.apache.crunch.types.writable
 
WritableType(Class<T>, Class<W>, MapFn<W, T>, MapFn<T, W>, PType...) - Constructor for class org.apache.crunch.types.writable.WritableType
 
WritableTypeFamily - Class in org.apache.crunch.types.writable
The Writable-based implementation of the PTypeFamily interface.
write(DataOutput) - Method in class org.apache.crunch.contrib.io.jdbc.IdentifiableName
 
write(PreparedStatement) - Method in class org.apache.crunch.contrib.io.jdbc.IdentifiableName
 
write(PCollection<?>, Target) - Method in class org.apache.crunch.impl.mem.MemPipeline
 
write(PCollection<?>, Target, Target.WriteMode) - Method in class org.apache.crunch.impl.mem.MemPipeline
 
write(PCollection<?>, Target) - Method in class org.apache.crunch.impl.mr.MRPipeline
 
write(PCollection<?>, Target, Target.WriteMode) - Method in class org.apache.crunch.impl.mr.MRPipeline
 
write(String, K, V) - Method in class org.apache.crunch.io.CrunchOutputs
 
write(DataOutput) - Method in class org.apache.crunch.io.FormatBundle
 
write(Target) - Method in interface org.apache.crunch.PCollection
Write the contents of this PCollection to the given Target, using the storage format specified by the target.
write(Target, Target.WriteMode) - Method in interface org.apache.crunch.PCollection
Write the contents of this PCollection to the given Target, using the given Target.WriteMode to handle existing targets.
write(PCollection<?>, Target) - Method in interface org.apache.crunch.Pipeline
Write the given collection to the given target on the next pipeline run.
write(PCollection<?>, Target, Target.WriteMode) - Method in interface org.apache.crunch.Pipeline
Write the contents of the PCollection to the given Target, using the storage format specified by the target and the given WriteMode for cases where the referenced Target already exists.
write(Target) - Method in interface org.apache.crunch.PTable
Writes this PTable to the given Target.
write(Target, Target.WriteMode) - Method in interface org.apache.crunch.PTable
Writes this PTable to the given Target, using the given Target.WriteMode to handle existing targets.
write(DataOutput) - Method in class org.apache.crunch.types.writable.TupleWritable
Writes each Writable to out.
write(PCollection<?>, Target) - Method in class org.apache.crunch.util.CrunchTool
 
write(Configuration, Path, Object) - Static method in class org.apache.crunch.util.DistCache
 
writeTextFile(PCollection<T>, String) - Method in class org.apache.crunch.impl.mem.MemPipeline
 
writeTextFile(PCollection<T>, String) - Method in class org.apache.crunch.impl.mr.MRPipeline
 
writeTextFile(PCollection<T>, String) - Method in interface org.apache.crunch.Pipeline
A convenience method for writing a text file.
writeTextFile(PCollection<?>, String) - Method in class org.apache.crunch.util.CrunchTool
 

X

xboolean() - Static method in class org.apache.crunch.contrib.text.Extractors
Returns an Extractor for booleans.
xboolean(Boolean) - Static method in class org.apache.crunch.contrib.text.Extractors
 
xcollect(TokenizerFactory, Extractor<T>) - Static method in class org.apache.crunch.contrib.text.Extractors
 
xcustom(Class<T>, TokenizerFactory, Extractor...) - Static method in class org.apache.crunch.contrib.text.Extractors
Returns an Extractor for a subclass of Tuple with a constructor that has the given extractor types that uses the given TokenizerFactory for parsing the sub-fields.
xdouble() - Static method in class org.apache.crunch.contrib.text.Extractors
Returns an Extractor for doubles.
xdouble(Double) - Static method in class org.apache.crunch.contrib.text.Extractors
 
xfloat() - Static method in class org.apache.crunch.contrib.text.Extractors
Returns an Extractor for floats.
xfloat(Float) - Static method in class org.apache.crunch.contrib.text.Extractors
 
xint() - Static method in class org.apache.crunch.contrib.text.Extractors
Returns an Extractor for integers.
xint(Integer) - Static method in class org.apache.crunch.contrib.text.Extractors
Returns an Extractor for integers.
xlong() - Static method in class org.apache.crunch.contrib.text.Extractors
Returns an Extractor for longs.
xlong(Long) - Static method in class org.apache.crunch.contrib.text.Extractors
Returns an Extractor for longs.
xpair(TokenizerFactory, Extractor<K>, Extractor<V>) - Static method in class org.apache.crunch.contrib.text.Extractors
Returns an Extractor for pairs of the given types that uses the given TokenizerFactory for parsing the sub-fields.
xquad(TokenizerFactory, Extractor<A>, Extractor<B>, Extractor<C>, Extractor<D>) - Static method in class org.apache.crunch.contrib.text.Extractors
Returns an Extractor for quads of the given types that uses the given TokenizerFactory for parsing the sub-fields.
xstring() - Static method in class org.apache.crunch.contrib.text.Extractors
Returns an Extractor for strings.
xstring(String) - Static method in class org.apache.crunch.contrib.text.Extractors
 
xtriple(TokenizerFactory, Extractor<A>, Extractor<B>, Extractor<C>) - Static method in class org.apache.crunch.contrib.text.Extractors
Returns an Extractor for triples of the given types that uses the given TokenizerFactory for parsing the sub-fields.
xtupleN(TokenizerFactory, Extractor...) - Static method in class org.apache.crunch.contrib.text.Extractors
Returns an Extractor for an arbitrary number of types that uses the given TokenizerFactory for parsing the sub-fields.
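To make the idea of combining field extractors with a tokenizer more concrete, here is a small library-free Java sketch that splits a line on a delimiter and parses two fields, falling back to a default when the second field is missing or is not a valid integer. It does not use the Crunch Extractors, Tokenizer, or TokenizerFactory classes. The names PairExtractSketch, parseInt, and extractPair are hypothetical, and the fallback-on-error behavior is only an assumption about the purpose of default-value overloads such as xint(Integer).

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.Locale;
import java.util.Map;
import java.util.Scanner;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Crunch: parses "key<delimiter>number" lines.
class PairExtractSketch {

  // Reads the next token as an int, or returns defaultValue if it is missing or malformed.
  static int parseInt(Scanner fields, int defaultValue) {
    if (fields.hasNextInt()) {
      return fields.nextInt();
    }
    if (fields.hasNext()) {
      fields.next(); // skip the malformed token
    }
    return defaultValue;
  }

  // Splits the line on the literal delimiter and extracts a (String, int) pair.
  static Map.Entry<String, Integer> extractPair(
      String line, String delimiter, int defaultValue) {
    try (Scanner fields = new Scanner(line)) {
      fields.useLocale(Locale.ROOT);
      fields.useDelimiter(Pattern.quote(delimiter));
      String key = fields.hasNext() ? fields.next() : "";
      int value = parseInt(fields, defaultValue);
      return new SimpleEntry<>(key, value);
    }
  }

  public static void main(String[] args) {
    System.out.println(extractPair("alice,42", ",", -1));
    System.out.println(extractPair("bob,oops", ",", -1));
  }
}
```

Keeping the per-field parser separate from the splitting step is what lets the same parser be reused for pairs, triples, or wider records.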

Copyright © 2013 The Apache Software Foundation. All Rights Reserved.