- AbstractCompositeExtractor<T> - Class in org.apache.crunch.contrib.text
-
Base class for Extractor instances that delegates the parsing of fields to other
Extractor instances, primarily used for constructing composite records that implement
the Tuple interface.
- AbstractCompositeExtractor(TokenizerFactory, List<Extractor<?>>) - Constructor for class org.apache.crunch.contrib.text.AbstractCompositeExtractor
-
- AbstractSimpleExtractor<T> - Class in org.apache.crunch.contrib.text
-
Base class for the common case Extractor instances that construct a single
object from a block of text stored in a String, with support for error handling
and reporting.
- accept(T) - Method in class org.apache.crunch.FilterFn
-
If true, emit the given record.
- accept(PCollectionImpl.Visitor) - Method in class org.apache.crunch.impl.dist.collect.PCollectionImpl
-
- accept(OutputHandler, PType<?>) - Method in interface org.apache.crunch.Target
-
Checks to see if this Target instance is compatible with the
given PType.
- ACCEPT_ALL() - Static method in class org.apache.crunch.fn.FilterFns
-
Accept everything.
- addAccumulator(Map<String, Map<String, Long>>, Map<String, Map<String, Long>>) - Method in class org.apache.crunch.impl.spark.CounterAccumulatorParam
-
- addCacheFile(Path, Configuration) - Static method in class org.apache.crunch.util.DistCache
-
- addInPlace(Map<String, Map<String, Long>>, Map<String, Map<String, Long>>) - Method in class org.apache.crunch.impl.spark.CounterAccumulatorParam
-
- addInputPath(Job, Path, FormatBundle, int) - Static method in class org.apache.crunch.io.CrunchInputs
-
- addInputPaths(Job, Collection<Path>, FormatBundle, int) - Static method in class org.apache.crunch.io.CrunchInputs
-
- addJarDirToDistributedCache(Configuration, File) - Static method in class org.apache.crunch.util.DistCache
-
Adds all jars under the specified directory to the distributed cache of
jobs using the provided configuration.
- addJarDirToDistributedCache(Configuration, String) - Static method in class org.apache.crunch.util.DistCache
-
Adds all jars under the directory at the specified path to the distributed
cache of jobs using the provided configuration.
- addJarToDistributedCache(Configuration, File) - Static method in class org.apache.crunch.util.DistCache
-
Adds the specified jar to the distributed cache of jobs using the provided
configuration.
- addJarToDistributedCache(Configuration, String) - Static method in class org.apache.crunch.util.DistCache
-
Adds the jar at the specified path to the distributed cache of jobs using
the provided configuration.
- addNamedOutput(Job, String, Class<? extends OutputFormat>, Class, Class) - Static method in class org.apache.crunch.io.CrunchOutputs
-
- addNamedOutput(Job, String, FormatBundle<? extends OutputFormat>, Class, Class) - Static method in class org.apache.crunch.io.CrunchOutputs
-
- aggregate(Aggregator<S>) - Method in class org.apache.crunch.impl.dist.collect.PCollectionImpl
-
- Aggregate - Class in org.apache.crunch.lib
-
Methods for performing various types of aggregations over
PCollection instances.
- Aggregate() - Constructor for class org.apache.crunch.lib.Aggregate
-
- aggregate(PCollection<S>, Aggregator<S>) - Static method in class org.apache.crunch.lib.Aggregate
-
- aggregate(Aggregator<S>) - Method in interface org.apache.crunch.PCollection
-
Returns a PCollection that contains the result of aggregating all values in this instance.
- Aggregate.PairValueComparator<K,V> - Class in org.apache.crunch.lib
-
- Aggregate.PairValueComparator(boolean) - Constructor for class org.apache.crunch.lib.Aggregate.PairValueComparator
-
- Aggregate.TopKCombineFn<K,V> - Class in org.apache.crunch.lib
-
- Aggregate.TopKCombineFn(int, boolean, PType<Pair<K, V>>) - Constructor for class org.apache.crunch.lib.Aggregate.TopKCombineFn
-
- Aggregate.TopKFn<K,V> - Class in org.apache.crunch.lib
-
- Aggregate.TopKFn(int, boolean, PType<Pair<K, V>>) - Constructor for class org.apache.crunch.lib.Aggregate.TopKFn
-
- Aggregator<T> - Interface in org.apache.crunch
-
Aggregate a sequence of values into a possibly smaller sequence of the same type.
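The idea of folding many values into a smaller sequence of the same type can be sketched in plain Java. This is a simplified model, not the Crunch interface itself: `MiniAggregator` and `SumLongs` are hypothetical names introduced for illustration, and the real `Aggregator` may carry additional lifecycle methods.

```java
import java.util.Collections;

// Simplified, hypothetical model of an aggregator; not the Crunch interface.
interface MiniAggregator<T> {
    void reset();           // clear state before a new group of values
    void update(T value);   // fold one value into the running state
    Iterable<T> results();  // emit the (possibly smaller) sequence of results
}

// Sums long values, reducing any number of inputs to a single output value.
final class SumLongs implements MiniAggregator<Long> {
    private long sum;

    @Override
    public void reset() {
        sum = 0L;
    }

    @Override
    public void update(Long value) {
        sum += value;
    }

    @Override
    public Iterable<Long> results() {
        return Collections.singletonList(sum);
    }
}
```

A caller would invoke `reset()` once per group, `update` once per value, and then read `results()`.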
- Aggregators - Class in org.apache.crunch.fn
-
- Aggregators.SimpleAggregator<T> - Class in org.apache.crunch.fn
-
Base class for aggregators that do not require any initialization.
- Aggregators.SimpleAggregator() - Constructor for class org.apache.crunch.fn.Aggregators.SimpleAggregator
-
- and(FilterFn<S>, FilterFn<S>) - Static method in class org.apache.crunch.fn.FilterFns
-
Accept an entry if all of the given filters accept it, using short-circuit evaluation.
- and(FilterFn<S>...) - Static method in class org.apache.crunch.fn.FilterFns
-
Accept an entry if all of the given filters accept it, using short-circuit evaluation.
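The short-circuit behaviour described for `and` can be modelled in plain Java. The sketch below uses `java.util.function.Predicate` in place of Crunch's `FilterFn`; `ShortCircuitAnd` and `allOf` are hypothetical names chosen for illustration and are not part of the Crunch API.

```java
import java.util.function.Predicate;

// Hypothetical stand-in for a short-circuit "and" filter; not Crunch code.
final class ShortCircuitAnd {
    // Returns a predicate that accepts a value only if every filter accepts it.
    // Evaluation stops at the first filter that rejects the value.
    @SafeVarargs
    static <S> Predicate<S> allOf(Predicate<S>... filters) {
        return value -> {
            for (Predicate<S> filter : filters) {
                if (!filter.test(value)) {
                    return false; // later filters are never called
                }
            }
            return true;
        };
    }
}
```

Ordering matters under short-circuit evaluation: placing the cheapest or most selective filter first avoids running the remaining filters on rejected records.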
- apply(Statement, Description) - Method in class org.apache.crunch.test.TemporaryPath
-
- applyPTypeTransforms() - Method in interface org.apache.crunch.types.Converter
-
If true, convert the inputs or outputs from this Converter instance
before (for outputs) or after (for inputs) using the associated PType#getInputMapFn
and PType#getOutputMapFn calls.
- as(PType<T>) - Method in class org.apache.crunch.types.avro.AvroTypeFamily
-
- as(PType<T>) - Method in interface org.apache.crunch.types.PTypeFamily
-
Returns the equivalent of the given ptype for this family, if it exists.
- as(PType<T>) - Method in class org.apache.crunch.types.writable.WritableTypeFamily
-
- asCollection() - Method in class org.apache.crunch.impl.dist.collect.PCollectionImpl
- asCollection() - Method in interface org.apache.crunch.PCollection
-
- asMap() - Method in class org.apache.crunch.impl.dist.collect.PTableBase
-
Returns a PObject encapsulating a Map made up of the keys and values in this PTable.
- asMap() - Method in interface org.apache.crunch.PTable
-
Returns a PObject encapsulating a Map made up of the keys and values in this PTable.
- asPTable(PCollection<Pair<K, V>>) - Static method in class org.apache.crunch.lib.PTables
-
Convert the given PCollection<Pair<K, V>> to a PTable<K, V>.
- asReadable(boolean) - Method in class org.apache.crunch.impl.dist.collect.PCollectionImpl
-
- asReadable() - Method in interface org.apache.crunch.io.ReadableSource
-
- asReadable(boolean) - Method in interface org.apache.crunch.PCollection
-
- asSourceTarget(PType<T>) - Method in interface org.apache.crunch.Target
-
Attempt to create the SourceTarget type that corresponds to this Target
for the given PType, if possible.
- At - Class in org.apache.crunch.io
-
Static factory methods for creating common SourceTarget types, which may be treated as both a Source and a Target.
- At() - Constructor for class org.apache.crunch.io.At
-
- Average - Class in org.apache.crunch.lib
-
- Average() - Constructor for class org.apache.crunch.lib.Average
-
- AverageBytesByIP - Class in org.apache.crunch.examples
-
- AverageBytesByIP() - Constructor for class org.apache.crunch.examples.AverageBytesByIP
-
- AVRO_MODE_PROPERTY - Static variable in class org.apache.crunch.types.avro.AvroMode
-
- AVRO_SHUFFLE_MODE_PROPERTY - Static variable in class org.apache.crunch.types.avro.AvroMode
-
- AvroDerivedValueDeepCopier<T,S> - Class in org.apache.crunch.types.avro
-
A DeepCopier specific to Avro derived types.
- AvroDerivedValueDeepCopier(MapFn<T, S>, MapFn<S, T>, AvroType<S>) - Constructor for class org.apache.crunch.types.avro.AvroDerivedValueDeepCopier
-
- avroFile(String, Class<T>) - Static method in class org.apache.crunch.io.At
-
Creates a SourceTarget<T> instance from the Avro file(s) at the given path name.
- avroFile(Path, Class<T>) - Static method in class org.apache.crunch.io.At
-
Creates a SourceTarget<T> instance from the Avro file(s) at the given Path.
- avroFile(String) - Static method in class org.apache.crunch.io.At
-
Creates a SourceTarget<GenericData.Record> by reading the schema of the Avro file
at the given path.
- avroFile(Path) - Static method in class org.apache.crunch.io.At
-
Creates a SourceTarget<GenericData.Record> by reading the schema of the Avro file
at the given path.
- avroFile(Path, Configuration) - Static method in class org.apache.crunch.io.At
-
Creates a SourceTarget<GenericData.Record> by reading the schema of the Avro file
at the given path using the FileSystem information contained in the given
Configuration instance.
- avroFile(String, PType<T>) - Static method in class org.apache.crunch.io.At
-
Creates a SourceTarget<T> instance from the Avro file(s) at the given path name.
- avroFile(Path, PType<T>) - Static method in class org.apache.crunch.io.At
-
Creates a SourceTarget<T> instance from the Avro file(s) at the given Path.
- avroFile(String, Class<T>) - Static method in class org.apache.crunch.io.From
-
Creates a Source<T> instance from the Avro file(s) at the given path name.
- avroFile(Path, Class<T>) - Static method in class org.apache.crunch.io.From
-
Creates a Source<T> instance from the Avro file(s) at the given Path.
- avroFile(List<Path>, Class<T>) - Static method in class org.apache.crunch.io.From
-
Creates a Source<T> instance from the Avro file(s) at the given Paths.
- avroFile(String, PType<T>) - Static method in class org.apache.crunch.io.From
-
Creates a Source<T> instance from the Avro file(s) at the given path name.
- avroFile(Path, PType<T>) - Static method in class org.apache.crunch.io.From
-
Creates a Source<T> instance from the Avro file(s) at the given Path.
- avroFile(List<Path>, PType<T>) - Static method in class org.apache.crunch.io.From
-
Creates a Source<T> instance from the Avro file(s) at the given Paths.
- avroFile(String) - Static method in class org.apache.crunch.io.From
-
Creates a Source<GenericData.Record> by reading the schema of the Avro file
at the given path.
- avroFile(Path) - Static method in class org.apache.crunch.io.From
-
Creates a Source<GenericData.Record> by reading the schema of the Avro file
at the given path.
- avroFile(List<Path>) - Static method in class org.apache.crunch.io.From
-
Creates a Source<GenericData.Record> by reading the schema of the Avro file
at the given paths.
- avroFile(Path, Configuration) - Static method in class org.apache.crunch.io.From
-
Creates a Source<GenericData.Record> by reading the schema of the Avro file
at the given path using the FileSystem information contained in the given
Configuration instance.
- avroFile(List<Path>, Configuration) - Static method in class org.apache.crunch.io.From
-
Creates a Source<GenericData.Record> by reading the schema of the Avro file
at the given paths using the FileSystem information contained in the given
Configuration instance.
- avroFile(String) - Static method in class org.apache.crunch.io.To
-
Creates a Target at the given path name that writes data to
Avro files.
- avroFile(Path) - Static method in class org.apache.crunch.io.To
-
Creates a Target at the given Path that writes data to
Avro files.
- AvroInputFormat<T> - Class in org.apache.crunch.types.avro
-
An InputFormat for Avro data files.
- AvroInputFormat() - Constructor for class org.apache.crunch.types.avro.AvroInputFormat
-
- AvroMode - Class in org.apache.crunch.types.avro
-
AvroMode is an immutable object used for configuring the reading and writing of Avro types.
- AvroMode.ModeType - Enum in org.apache.crunch.types.avro
-
Internal enum which represents the various Avro data types.
- AvroOutputFormat<T> - Class in org.apache.crunch.types.avro
-
An OutputFormat for Avro data files.
- AvroOutputFormat() - Constructor for class org.apache.crunch.types.avro.AvroOutputFormat
-
- AvroPathPerKeyOutputFormat<T> - Class in org.apache.crunch.types.avro
-
A FileOutputFormat that takes in a Utf8 and an Avro record and writes the Avro records to
a sub-directory of the output path whose name is equal to the string-form of the Utf8.
- AvroPathPerKeyOutputFormat() - Constructor for class org.apache.crunch.types.avro.AvroPathPerKeyOutputFormat
-
- Avros - Class in org.apache.crunch.types.avro
-
Defines static methods that are analogous to the methods defined in
AvroTypeFamily for convenient static importing.
- AvroSerDe<T> - Class in org.apache.crunch.impl.spark.serde
-
- AvroSerDe(AvroType<T>, Map<String, String>) - Constructor for class org.apache.crunch.impl.spark.serde.AvroSerDe
-
- avroTableFile(Path, PTableType<K, V>) - Static method in class org.apache.crunch.io.From
-
Creates a TableSource<K,V> for reading an Avro key/value file at the given path.
- avroTableFile(List<Path>, PTableType<K, V>) - Static method in class org.apache.crunch.io.From
-
Creates a TableSource<K,V> for reading an Avro key/value file at the given paths.
- AvroTextOutputFormat<K,V> - Class in org.apache.crunch.types.avro
-
- AvroTextOutputFormat() - Constructor for class org.apache.crunch.types.avro.AvroTextOutputFormat
-
- AvroType<T> - Class in org.apache.crunch.types.avro
-
The implementation of the PType interface for Avro-based serialization.
- AvroType(Class<T>, Schema, DeepCopier<T>, PType...) - Constructor for class org.apache.crunch.types.avro.AvroType
-
- AvroType(Class<T>, Schema, MapFn, MapFn, DeepCopier<T>, AvroType.AvroRecordType, PType...) - Constructor for class org.apache.crunch.types.avro.AvroType
-
- AvroType.AvroRecordType - Enum in org.apache.crunch.types.avro
-
- AvroTypeFamily - Class in org.apache.crunch.types.avro
-
- AvroUtf8InputFormat - Class in org.apache.crunch.types.avro
-
An InputFormat for text files.
- AvroUtf8InputFormat() - Constructor for class org.apache.crunch.types.avro.AvroUtf8InputFormat
-
- cache() - Method in class org.apache.crunch.impl.dist.collect.PCollectionImpl
-
- cache(CachingOptions) - Method in class org.apache.crunch.impl.dist.collect.PCollectionImpl
-
- cache() - Method in class org.apache.crunch.impl.dist.collect.PTableBase
-
- cache(CachingOptions) - Method in class org.apache.crunch.impl.dist.collect.PTableBase
-
- cache(PCollection<T>, CachingOptions) - Method in class org.apache.crunch.impl.mem.MemPipeline
-
- cache(PCollection<T>, CachingOptions) - Method in class org.apache.crunch.impl.mr.MRPipeline
-
- cache(PCollection<T>, CachingOptions) - Method in class org.apache.crunch.impl.spark.SparkPipeline
-
- cache() - Method in interface org.apache.crunch.PCollection
-
- cache(CachingOptions) - Method in interface org.apache.crunch.PCollection
-
Marks this data as cached using the given CachingOptions.
- cache(PCollection<T>, CachingOptions) - Method in interface org.apache.crunch.Pipeline
-
Caches the given PCollection so that it will be processed at most once
during pipeline execution.
- cache() - Method in interface org.apache.crunch.PTable
-
- cache(CachingOptions) - Method in interface org.apache.crunch.PTable
-
- CachingOptions - Class in org.apache.crunch
-
Options for controlling how a PCollection<T> is cached for subsequent processing.
- CachingOptions.Builder - Class in org.apache.crunch
-
A Builder class to use for setting the CachingOptions for a PCollection.
- CachingOptions.Builder() - Constructor for class org.apache.crunch.CachingOptions.Builder
-
- call(Tuple2<IntByteArray, List<byte[]>>) - Method in class org.apache.crunch.impl.spark.collect.ToByteArrayFunction
-
- call(Iterator<Tuple2<K, V>>) - Method in class org.apache.crunch.impl.spark.fn.CombineMapsideFunction
-
- call(Iterator<Pair<K, V>>) - Method in class org.apache.crunch.impl.spark.fn.CrunchPairTuple2
-
- call(Integer, Iterator) - Method in class org.apache.crunch.impl.spark.fn.FlatMapIndexFn
-
- call(Iterator<Tuple2<K, V>>) - Method in class org.apache.crunch.impl.spark.fn.FlatMapPairDoFn
-
- call(Tuple2<K, V>) - Method in class org.apache.crunch.impl.spark.fn.InputConverterFunction
-
- call(Object) - Method in class org.apache.crunch.impl.spark.fn.MapFunction
-
- call(Pair<K, V>) - Method in class org.apache.crunch.impl.spark.fn.MapOutputFunction
-
- call(S) - Method in class org.apache.crunch.impl.spark.fn.OutputConverterFunction
-
- call(Iterator<T>) - Method in class org.apache.crunch.impl.spark.fn.PairFlatMapDoFn
-
- call(Tuple2<K, V>) - Method in class org.apache.crunch.impl.spark.fn.PairMapFunction
-
- call(Pair<K, List<V>>) - Method in class org.apache.crunch.impl.spark.fn.PairMapIterableFunction
-
- call(Pair<K, V>) - Method in class org.apache.crunch.impl.spark.fn.PartitionedMapOutputFunction
-
- call(Iterator<Tuple2<ByteArray, List<byte[]>>>) - Method in class org.apache.crunch.impl.spark.fn.ReduceGroupingFunction
-
- call(Tuple2<ByteArray, Iterable<byte[]>>) - Method in class org.apache.crunch.impl.spark.fn.ReduceInputFunction
-
- call(Pair<K, V>) - Method in class org.apache.crunch.impl.spark.fn.Tuple2MapFunction
-
- CAN_COMBINE_SPECIFIC_AND_REFLECT_SCHEMAS - Static variable in class org.apache.crunch.types.avro.Avros
-
Older versions of Avro (i.e., before 1.7.0) do not support schemas that are
composed of a mix of specific and reflection-based schemas.
- Cartesian - Class in org.apache.crunch.lib
-
Utilities for Cartesian products of two PTable or PCollection
instances.
- Cartesian() - Constructor for class org.apache.crunch.lib.Cartesian
-
- Channels - Class in org.apache.crunch.lib
-
- Channels() - Constructor for class org.apache.crunch.lib.Channels
-
- checkCombiningSpecificAndReflectionSchemas() - Static method in class org.apache.crunch.types.avro.Avros
-
- checkOutputSpecs(JobContext) - Static method in class org.apache.crunch.io.CrunchOutputs
-
- cleanup(Emitter<Pair<String, BloomFilter>>) - Method in class org.apache.crunch.contrib.bloomfilter.BloomFilterFn
-
- cleanup(Emitter<T>) - Method in class org.apache.crunch.DoFn
-
Called during the cleanup of the MapReduce job this DoFn is
associated with.
- cleanup(Emitter<T>) - Method in class org.apache.crunch.FilterFn
-
- cleanup() - Method in class org.apache.crunch.FilterFn
-
Called during the cleanup of the MapReduce job this FilterFn is
associated with.
- cleanup(Emitter<T>) - Method in class org.apache.crunch.fn.CompositeMapFn
-
- cleanup(Emitter<Pair<S, T>>) - Method in class org.apache.crunch.fn.PairMapFn
-
- cleanup(boolean) - Method in class org.apache.crunch.impl.dist.DistributedPipeline
-
- cleanup(boolean) - Method in class org.apache.crunch.impl.mem.MemPipeline
-
- cleanup(Emitter<Pair<Integer, Pair<K, V>>>) - Method in class org.apache.crunch.lib.Aggregate.TopKFn
-
- cleanup(Emitter<Pair<K, Pair<U, V>>>) - Method in class org.apache.crunch.lib.join.FullOuterJoinFn
-
Called during the cleanup of the MapReduce job this DoFn is
associated with.
- cleanup(Emitter<Pair<K, Pair<U, V>>>) - Method in class org.apache.crunch.lib.join.LeftOuterJoinFn
-
Called during the cleanup of the MapReduce job this DoFn is
associated with.
- cleanup(boolean) - Method in interface org.apache.crunch.Pipeline
-
Cleans up any artifacts created as a result of
running the pipeline.
- clear() - Method in class org.apache.crunch.types.writable.TupleWritable
-
- clearCounters() - Static method in class org.apache.crunch.impl.mem.MemPipeline
-
- clearCounters() - Static method in class org.apache.crunch.test.TestCounters
-
- close() - Method in class org.apache.crunch.io.CrunchOutputs
-
- cogroup(PTable<K, U>) - Method in class org.apache.crunch.impl.dist.collect.PTableBase
-
- Cogroup - Class in org.apache.crunch.lib
-
- Cogroup() - Constructor for class org.apache.crunch.lib.Cogroup
-
- cogroup(PTable<K, U>, PTable<K, V>) - Static method in class org.apache.crunch.lib.Cogroup
-
Co-groups the two PTable arguments.
- cogroup(int, PTable<K, U>, PTable<K, V>) - Static method in class org.apache.crunch.lib.Cogroup
-
Co-groups the two PTable arguments with a user-specified degree of parallelism (a.k.a. the number of reducers).
- cogroup(PTable<K, V1>, PTable<K, V2>, PTable<K, V3>) - Static method in class org.apache.crunch.lib.Cogroup
-
Co-groups the three PTable arguments.
- cogroup(int, PTable<K, V1>, PTable<K, V2>, PTable<K, V3>) - Static method in class org.apache.crunch.lib.Cogroup
-
Co-groups the three PTable arguments with a user-specified degree of parallelism (a.k.a. the number of reducers).
- cogroup(PTable<K, V1>, PTable<K, V2>, PTable<K, V3>, PTable<K, V4>) - Static method in class org.apache.crunch.lib.Cogroup
-
Co-groups the four PTable arguments.
- cogroup(int, PTable<K, V1>, PTable<K, V2>, PTable<K, V3>, PTable<K, V4>) - Static method in class org.apache.crunch.lib.Cogroup
-
Co-groups the four PTable arguments with a user-specified degree of parallelism (a.k.a. the number of reducers).
- cogroup(PTable<K, ?>, PTable<K, ?>...) - Static method in class org.apache.crunch.lib.Cogroup
-
Co-groups an arbitrary number of PTable arguments.
- cogroup(int, PTable<K, ?>, PTable<K, ?>...) - Static method in class org.apache.crunch.lib.Cogroup
-
Co-groups an arbitrary number of PTable arguments with a user-specified degree of parallelism (a.k.a. the number of reducers). The largest table should come last in the ordering.
- cogroup(PTable<K, U>) - Method in interface org.apache.crunch.PTable
-
Co-group operation with the given table on common keys.
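What a two-way co-group on common keys produces can be sketched in plain Java. This is an illustrative model only: `CogroupSketch` is a hypothetical class, it works on in-memory lists rather than PTable instances, and it assumes the result pairs each key with two collections of values, one per input.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Plain-Java model of a two-way co-group; illustrative only, not Crunch code.
final class CogroupSketch {
    // Creates an empty (left values, right values) pair for a newly seen key.
    private static <U, V> Map.Entry<List<U>, List<V>> emptyPair() {
        return new SimpleEntry<List<U>, List<V>>(new ArrayList<U>(), new ArrayList<V>());
    }

    // For every key that appears in either input, gathers the values from the
    // left input and the values from the right input into two separate lists.
    static <K extends Comparable<K>, U, V> Map<K, Map.Entry<List<U>, List<V>>> cogroup(
            List<Map.Entry<K, U>> left, List<Map.Entry<K, V>> right) {
        Map<K, Map.Entry<List<U>, List<V>>> grouped = new TreeMap<>();
        for (Map.Entry<K, U> entry : left) {
            grouped.computeIfAbsent(entry.getKey(), k -> CogroupSketch.<U, V>emptyPair())
                    .getKey().add(entry.getValue());
        }
        for (Map.Entry<K, V> entry : right) {
            grouped.computeIfAbsent(entry.getKey(), k -> CogroupSketch.<U, V>emptyPair())
                    .getValue().add(entry.getValue());
        }
        return grouped;
    }
}
```

A key that is present in only one input still appears in the output, paired with an empty list for the other side.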
- CollectionDeepCopier<T> - Class in org.apache.crunch.types
-
Performs deep copies (based on underlying PType deep copying) of Collections.
- CollectionDeepCopier(PType<T>) - Constructor for class org.apache.crunch.types.CollectionDeepCopier
-
- collectionOf(T...) - Static method in class org.apache.crunch.impl.mem.MemPipeline
-
- collectionOf(Iterable<T>) - Static method in class org.apache.crunch.impl.mem.MemPipeline
-
- collections(PType<T>) - Static method in class org.apache.crunch.types.avro.Avros
-
- collections(PType<T>) - Method in class org.apache.crunch.types.avro.AvroTypeFamily
-
- collections(PType<T>) - Method in interface org.apache.crunch.types.PTypeFamily
-
- collections(PType<T>) - Static method in class org.apache.crunch.types.writable.Writables
-
- collections(PType<T>) - Method in class org.apache.crunch.types.writable.WritableTypeFamily
-
- collectValues() - Method in class org.apache.crunch.impl.dist.collect.PTableBase
-
- collectValues(PTable<K, V>) - Static method in class org.apache.crunch.lib.Aggregate
-
- collectValues() - Method in interface org.apache.crunch.PTable
-
Aggregate all of the values with the same key into a single key-value pair
in the returned PTable.
- column() - Method in class org.apache.crunch.lib.Sort.ColumnOrder
-
- CombineFn<S,T> - Class in org.apache.crunch
-
A special DoFn implementation that converts an Iterable of values into a single value.
- CombineFn() - Constructor for class org.apache.crunch.CombineFn
-
- CombineMapsideFunction<K,V> - Class in org.apache.crunch.impl.spark.fn
-
- CombineMapsideFunction(CombineFn<K, V>, SparkRuntimeContext) - Constructor for class org.apache.crunch.impl.spark.fn.CombineMapsideFunction
-
- combineValues(CombineFn<K, V>, CombineFn<K, V>) - Method in class org.apache.crunch.impl.dist.collect.BaseGroupedTable
-
- combineValues(CombineFn<K, V>) - Method in class org.apache.crunch.impl.dist.collect.BaseGroupedTable
-
- combineValues(Aggregator<V>) - Method in class org.apache.crunch.impl.dist.collect.BaseGroupedTable
-
- combineValues(Aggregator<V>, Aggregator<V>) - Method in class org.apache.crunch.impl.dist.collect.BaseGroupedTable
-
- combineValues(CombineFn<K, V>) - Method in interface org.apache.crunch.PGroupedTable
-
Combines the values of this grouping using the given CombineFn.
- combineValues(CombineFn<K, V>, CombineFn<K, V>) - Method in interface org.apache.crunch.PGroupedTable
-
Combines and reduces the values of this grouping using the given CombineFn instances.
- combineValues(Aggregator<V>) - Method in interface org.apache.crunch.PGroupedTable
-
Combine the values in each group using the given Aggregator.
- combineValues(Aggregator<V>, Aggregator<V>) - Method in interface org.apache.crunch.PGroupedTable
-
Combines and reduces the values in each group using the given Aggregator instances.
- comm(PCollection<T>, PCollection<T>) - Static method in class org.apache.crunch.lib.Set
-
Find the elements that are common to two sets, like the Unix
comm utility.
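The Unix `comm` utility reports three columns: items only in the first input, items only in the second, and items in both. A plain-Java sketch of that classification follows; `CommSketch` is a hypothetical class that operates on in-memory collections, and the shape of its return value is an assumption made for illustration rather than the return type of the Crunch method.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.SortedSet;
import java.util.TreeSet;

// Plain-Java model of a comm-style comparison; illustrative only, not Crunch code.
final class CommSketch {
    // Returns three columns in the order used by Unix comm:
    // index 0 = only in first, index 1 = only in second, index 2 = in both.
    static <T extends Comparable<T>> List<SortedSet<T>> comm(
            Collection<T> first, Collection<T> second) {
        SortedSet<T> onlyFirst = new TreeSet<>(first);
        onlyFirst.removeAll(second);

        SortedSet<T> onlySecond = new TreeSet<>(second);
        onlySecond.removeAll(first);

        SortedSet<T> both = new TreeSet<>(first);
        both.retainAll(second);

        List<SortedSet<T>> columns = new ArrayList<>();
        columns.add(onlyFirst);
        columns.add(onlySecond);
        columns.add(both);
        return columns;
    }
}
```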
- compare(ByteArray, ByteArray) - Method in class org.apache.crunch.impl.spark.SparkComparator
-
- compare(Pair<K, V>, Pair<K, V>) - Method in class org.apache.crunch.lib.Aggregate.PairValueComparator
-
- compare(AvroWrapper<T>, AvroWrapper<T>) - Method in class org.apache.crunch.lib.join.JoinUtils.AvroPairGroupingComparator
-
- compare(byte[], int, int, byte[], int, int) - Method in class org.apache.crunch.lib.join.JoinUtils.AvroPairGroupingComparator
-
- compare(TupleWritable, TupleWritable) - Method in class org.apache.crunch.lib.join.JoinUtils.TupleWritableComparator
-
- compare(byte[], int, int, byte[], int, int) - Method in class org.apache.crunch.lib.join.JoinUtils.TupleWritableComparator
-
- compare(AvroKey<T>, AvroKey<T>) - Method in class org.apache.crunch.lib.sort.ReverseAvroComparator
-
- compare(byte[], int, int, byte[], int, int) - Method in class org.apache.crunch.lib.sort.ReverseAvroComparator
-
- compare(byte[], int, int, byte[], int, int) - Method in class org.apache.crunch.lib.sort.ReverseWritableComparator
-
- compare(T, T) - Method in class org.apache.crunch.lib.sort.ReverseWritableComparator
-
- compare(WritableComparable, WritableComparable) - Method in class org.apache.crunch.lib.sort.TupleWritableComparator
-
- compare(byte[], int, int, byte[], int, int) - Method in class org.apache.crunch.types.writable.TupleWritable.Comparator
-
- compare(WritableComparable, WritableComparable) - Method in class org.apache.crunch.types.writable.TupleWritable.Comparator
-
- compareTo(ByteArray) - Method in class org.apache.crunch.impl.spark.ByteArray
-
- compareTo(Pair<K, V>) - Method in class org.apache.crunch.Pair
-
- compareTo(TupleWritable) - Method in class org.apache.crunch.types.writable.TupleWritable
-
- compareTo(UnionWritable) - Method in class org.apache.crunch.types.writable.UnionWritable
-
- CompositeMapFn<R,S,T> - Class in org.apache.crunch.fn
-
- CompositeMapFn(MapFn<R, S>, MapFn<S, T>) - Constructor for class org.apache.crunch.fn.CompositeMapFn
-
- CompositePathIterable<T> - Class in org.apache.crunch.io
-
- Compress - Class in org.apache.crunch.io
-
Helper functions for compressing output data.
- Compress() - Constructor for class org.apache.crunch.io.Compress
-
- compress(T, Class<? extends CompressionCodec>) - Static method in class org.apache.crunch.io.Compress
-
Configure the given output target to be compressed using the given codec.
- conf(String, String) - Method in class org.apache.crunch.GroupingOptions.Builder
-
- conf(String, String) - Method in class org.apache.crunch.ParallelDoOptions.Builder
-
Specifies key-value pairs that should be added to the Configuration object associated with the
Job that includes these options.
- conf(String, String) - Method in interface org.apache.crunch.SourceTarget
-
Adds the given key-value pair to the Configuration instance(s) that are used to
read and write this SourceTarget<T>.
- configure(Configuration) - Method in class org.apache.crunch.DoFn
-
Configure this DoFn.
- configure(Configuration) - Method in class org.apache.crunch.fn.CompositeMapFn
-
- configure(Configuration) - Method in class org.apache.crunch.fn.ExtractKeyFn
-
- configure(Configuration) - Method in class org.apache.crunch.fn.PairMapFn
-
- configure(Job) - Method in class org.apache.crunch.GroupingOptions
-
- configure(Configuration) - Method in class org.apache.crunch.io.FormatBundle
-
- configure(Target, PType<?>) - Method in interface org.apache.crunch.io.OutputHandler
-
- configure(Configuration) - Method in class org.apache.crunch.ParallelDoOptions
-
Applies the key-value pairs that were associated with this instance to the given Configuration
object.
- configure(Configuration) - Method in interface org.apache.crunch.ReadableData
-
Allows this instance to specify any additional configuration settings that may
be needed by the job that it is launched in.
- configure(FormatBundle) - Method in class org.apache.crunch.types.avro.AvroMode
-
Populates the bundle with mode-specific settings for the specific FormatBundle.
- configure(Configuration) - Method in class org.apache.crunch.types.avro.AvroMode
-
Populates the conf with mode specific settings.
- configure(Configuration) - Method in class org.apache.crunch.types.avro.AvroUtf8InputFormat
-
- configure(Configuration) - Method in class org.apache.crunch.types.PGroupedTableType.PairIterableMapFn
-
- configure(Configuration) - Method in class org.apache.crunch.util.DelegatingReadableData
-
- configure(Configuration) - Method in class org.apache.crunch.util.UnionReadableData
-
- configureFactory(Configuration) - Method in class org.apache.crunch.types.avro.AvroMode
-
- configureForMapReduce(Job, PType<?>, Path, String) - Method in interface org.apache.crunch.io.MapReduceTarget
-
- configureOrdering(Configuration, WritableType[], Sort.ColumnOrder[]) - Static method in class org.apache.crunch.lib.sort.TupleWritableComparator
-
- configureReflectDataFactory(Configuration) - Static method in class org.apache.crunch.types.avro.Avros
-
Deprecated as of 0.9.0; use AvroMode.REFLECT.configure(Configuration) instead.
- configureShuffle(Configuration) - Method in class org.apache.crunch.types.avro.AvroMode
-
Populates the conf with mode specific settings for use during the shuffle phase.
- configureShuffle(Job, GroupingOptions) - Method in class org.apache.crunch.types.PGroupedTableType
-
- configureSource(Job, int) - Method in interface org.apache.crunch.Source
-
Configure the given job to use this source as an input.
- containers(Class<T>) - Static method in class org.apache.crunch.types.avro.Avros
-
- containers(Class<T>) - Method in class org.apache.crunch.types.avro.AvroTypeFamily
-
- convert(Object, ObjectInspector, ObjectInspector) - Static method in class org.apache.crunch.types.orc.OrcUtils
-
Convert an object to or from an OrcStruct.
- convert(PType<T>, PTypeFamily) - Static method in class org.apache.crunch.types.PTypeUtils
-
- Converter<K,V,S,T> - Interface in org.apache.crunch.types
-
Converts the input key/value from a MapReduce task into the input to a DoFn, or takes the output of a DoFn and writes it to the output key/values.
- convertInput(K, V) - Method in interface org.apache.crunch.types.Converter
-
- convertIterableInput(K, Iterable<V>) - Method in interface org.apache.crunch.types.Converter
-
- copyResourceFile(String) - Method in class org.apache.crunch.test.TemporaryPath
-
Copy a classpath resource to a File.
- copyResourceFileName(String) - Method in class org.apache.crunch.test.TemporaryPath
-
Copy a classpath resource returning its absolute file name.
- copyResourcePath(String) - Method in class org.apache.crunch.test.TemporaryPath
-
Copy a classpath resource to a Path.
- count() - Method in class org.apache.crunch.impl.dist.collect.PCollectionImpl
-
- count(PCollection<S>) - Static method in class org.apache.crunch.lib.Aggregate
-
Returns a PTable that contains the unique elements of this collection mapped to a count
of their occurrences.
- count(PCollection<S>, int) - Static method in class org.apache.crunch.lib.Aggregate
-
Returns a PTable that contains the unique elements of this collection mapped to a count
of their occurrences.
- count - Variable in class org.apache.crunch.lib.Quantiles.Result
-
- count() - Method in interface org.apache.crunch.PCollection
-
Returns a PTable instance that contains the counts of each unique
element of this PCollection.
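The result of `count()` maps each unique element to the number of times it occurs. A plain-Java sketch of that mapping is shown below; `CountSketch` is a hypothetical class working on an in-memory Iterable, not on a PCollection.

```java
import java.util.Map;
import java.util.TreeMap;

// Plain-Java model of counting unique elements; illustrative only, not Crunch code.
final class CountSketch {
    // Maps each distinct element to the number of times it occurs in the input.
    static <S extends Comparable<S>> Map<S, Long> count(Iterable<S> elements) {
        Map<S, Long> counts = new TreeMap<>();
        for (S element : elements) {
            counts.merge(element, 1L, Long::sum); // start at 1, then add 1 per repeat
        }
        return counts;
    }
}
```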
- countClause - Variable in class org.apache.crunch.contrib.io.jdbc.DataBaseSource.Builder
-
- CounterAccumulatorParam - Class in org.apache.crunch.impl.spark
-
- CounterAccumulatorParam() - Constructor for class org.apache.crunch.impl.spark.CounterAccumulatorParam
-
- create(String) - Method in class org.apache.crunch.contrib.text.TokenizerFactory
-
Return a Scanner instance that wraps the input string and uses the delimiter,
skip, and locale settings for this TokenizerFactory instance.
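How delimiter and locale settings shape a `java.util.Scanner` can be sketched with the JDK alone. `ScannerSketch` and its `doubles` method are hypothetical names; the sketch covers only the delimiter and locale settings and does not model the skip setting or any other TokenizerFactory behaviour.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Scanner;

// JDK-only sketch of configuring a Scanner; illustrative only, not Crunch code.
final class ScannerSketch {
    // Splits the input on the given delimiter pattern and parses each token
    // as a double using the number format of the given locale.
    static List<Double> doubles(String input, String delimiterRegex, Locale locale) {
        List<Double> values = new ArrayList<>();
        try (Scanner scanner = new Scanner(input)
                .useDelimiter(delimiterRegex)
                .useLocale(locale)) {
            while (scanner.hasNextDouble()) {
                values.add(scanner.nextDouble());
            }
        }
        return values;
    }
}
```

The loop stops at the first token that is not a valid number in the chosen locale, so malformed input yields a shorter list rather than an exception.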
- create(Iterable<S>, PType<S>) - Method in class org.apache.crunch.impl.dist.DistributedPipeline
-
- create(Iterable<S>, PType<S>, CreateOptions) - Method in class org.apache.crunch.impl.dist.DistributedPipeline
-
- create(Iterable<Pair<K, V>>, PTableType<K, V>) - Method in class org.apache.crunch.impl.dist.DistributedPipeline
-
- create(Iterable<Pair<K, V>>, PTableType<K, V>, CreateOptions) - Method in class org.apache.crunch.impl.dist.DistributedPipeline
-
- create(Iterable<T>, PType<T>) - Method in class org.apache.crunch.impl.mem.MemPipeline
-
- create(Iterable<T>, PType<T>, CreateOptions) - Method in class org.apache.crunch.impl.mem.MemPipeline
-
- create(Iterable<Pair<K, V>>, PTableType<K, V>) - Method in class org.apache.crunch.impl.mem.MemPipeline
-
- create(Iterable<Pair<K, V>>, PTableType<K, V>, CreateOptions) - Method in class org.apache.crunch.impl.mem.MemPipeline
-
- create(PType<?>, Configuration) - Static method in class org.apache.crunch.impl.spark.serde.SerDeFactory
-
- create(Iterable<S>, PType<S>, CreateOptions) - Method in class org.apache.crunch.impl.spark.SparkPipeline
-
- create(Iterable<Pair<K, V>>, PTableType<K, V>, CreateOptions) - Method in class org.apache.crunch.impl.spark.SparkPipeline
-
- create(FileSystem, Path, FileReaderFactory<S>) - Static method in class org.apache.crunch.io.CompositePathIterable
-
- create() - Static method in class org.apache.crunch.lib.join.MapsideJoinStrategy
-
Create a new MapsideJoinStrategy instance that will load its left-side table into memory,
and will materialize the contents of the left-side table to disk before running the in-memory join.
- create(boolean) - Static method in class org.apache.crunch.lib.join.MapsideJoinStrategy
-
Create a new MapsideJoinStrategy instance that will load its left-side table into memory.
- create(Iterable<T>, PType<T>) - Method in interface org.apache.crunch.Pipeline
-
Creates a PCollection containing the values found in the given Iterable
using an implementation-specific distribution mechanism.
- create(Iterable<T>, PType<T>, CreateOptions) - Method in interface org.apache.crunch.Pipeline
-
Creates a PCollection containing the values found in the given Iterable
using an implementation-specific distribution mechanism.
- create(Iterable<Pair<K, V>>, PTableType<K, V>) - Method in interface org.apache.crunch.Pipeline
-
Creates a PTable containing the values found in the given Iterable
using an implementation-specific distribution mechanism.
- create(Iterable<Pair<K, V>>, PTableType<K, V>, CreateOptions) - Method in interface org.apache.crunch.Pipeline
-
Creates a PTable containing the values found in the given Iterable
using an implementation-specific distribution mechanism.
- create() - Method in class org.apache.crunch.test.TemporaryPath
-
- create() - Static method in class org.apache.crunch.types.NoOpDeepCopier
-
Static factory method.
- create(Object...) - Method in class org.apache.crunch.types.orc.TupleObjectInspector
-
- create(Class<T>, Class...) - Static method in class org.apache.crunch.types.TupleFactory
-
- createBinarySerde(TypeInfo) - Static method in class org.apache.crunch.types.orc.OrcUtils
-
Create a binary serde for OrcStruct serialization/deserialization.
- CreatedCollection<T> - Class in org.apache.crunch.impl.spark.collect
-
Represents a Spark-based PCollection that was created from a Java Iterable of
values.
- CreatedCollection(SparkPipeline, Iterable<T>, PType<T>, CreateOptions) - Constructor for class org.apache.crunch.impl.spark.collect.CreatedCollection
-
- createDoCollection(String, PCollectionImpl<S>, DoFn<S, T>, PType<T>, ParallelDoOptions) - Method in interface org.apache.crunch.impl.dist.collect.PCollectionFactory
-
- createDoCollection(String, PCollectionImpl<S>, DoFn<S, T>, PType<T>, ParallelDoOptions) - Method in class org.apache.crunch.impl.spark.collect.SparkCollectFactory
-
- createDoNode() - Method in interface org.apache.crunch.impl.dist.collect.MRCollection
-
- createDoTable(String, PCollectionImpl<S>, DoFn<S, Pair<K, V>>, PTableType<K, V>, ParallelDoOptions) - Method in interface org.apache.crunch.impl.dist.collect.PCollectionFactory
-
- createDoTable(String, PCollectionImpl<S>, CombineFn<K, V>, DoFn<S, Pair<K, V>>, PTableType<K, V>) - Method in interface org.apache.crunch.impl.dist.collect.PCollectionFactory
-
- createDoTable(String, PCollectionImpl<S>, DoFn<S, Pair<K, V>>, PTableType<K, V>, ParallelDoOptions) - Method in class org.apache.crunch.impl.spark.collect.SparkCollectFactory
-
- createDoTable(String, PCollectionImpl<S>, CombineFn<K, V>, DoFn<S, Pair<K, V>>, PTableType<K, V>) - Method in class org.apache.crunch.impl.spark.collect.SparkCollectFactory
-
- CreatedTable<K,V> - Class in org.apache.crunch.impl.spark.collect
-
Represents a Spark-based PTable that was created from a Java Iterable of
key-value pairs.
- CreatedTable(SparkPipeline, Iterable<Pair<K, V>>, PTableType<K, V>, CreateOptions) - Constructor for class org.apache.crunch.impl.spark.collect.CreatedTable
-
- createFilter(Path, BloomFilterFn<String>) - Static method in class org.apache.crunch.contrib.bloomfilter.BloomFilterFactory
-
Takes an input path and generates BloomFilters for all text files in that
path.
- createFilter(PCollection<T>, BloomFilterFn<T>) - Static method in class org.apache.crunch.contrib.bloomfilter.BloomFilterFactory
-
- createGroupedTable(PTableBase<K, V>, GroupingOptions) - Method in interface org.apache.crunch.impl.dist.collect.PCollectionFactory
-
- createGroupedTable(PTableBase<K, V>, GroupingOptions) - Method in class org.apache.crunch.impl.spark.collect.SparkCollectFactory
-
- createInputCollection(Source<S>, String, DistributedPipeline, ParallelDoOptions) - Method in interface org.apache.crunch.impl.dist.collect.PCollectionFactory
-
- createInputCollection(Source<S>, String, DistributedPipeline, ParallelDoOptions) - Method in class org.apache.crunch.impl.spark.collect.SparkCollectFactory
-
- createInputTable(TableSource<K, V>, String, DistributedPipeline, ParallelDoOptions) - Method in interface org.apache.crunch.impl.dist.collect.PCollectionFactory
-
- createInputTable(TableSource<K, V>, String, DistributedPipeline, ParallelDoOptions) - Method in class org.apache.crunch.impl.spark.collect.SparkCollectFactory
-
- createIntermediateOutput(PType<T>) - Method in class org.apache.crunch.impl.dist.DistributedPipeline
-
- CreateOptions - Class in org.apache.crunch
-
- createOrcStruct(TypeInfo, Object...) - Static method in class org.apache.crunch.types.orc.OrcUtils
-
Create an object of OrcStruct given a type string and a list of objects.
- createOrderedTupleSchema(PType<S>, Sort.ColumnOrder[]) - Static method in class org.apache.crunch.lib.sort.SortFns
-
Constructs an Avro schema for the given PType<S> that respects the given column
orderings.
- createPut(PTable<String, String>) - Method in class org.apache.crunch.examples.WordAggregationHBase
-
Create Puts in order to insert them into HBase.
- createRecordReader(InputSplit, TaskAttemptContext) - Method in class org.apache.crunch.types.avro.AvroInputFormat
-
- createRecordReader(InputSplit, TaskAttemptContext) - Method in class org.apache.crunch.types.avro.AvroUtf8InputFormat
-
- createSourceTarget(Configuration, Path, Iterable<T>, int) - Method in class org.apache.crunch.types.avro.AvroType
-
- createSourceTarget(Configuration, Path, Iterable<T>, int) - Method in interface org.apache.crunch.types.PType
-
Returns a ReadableSource that contains the data in the given Iterable.
- createSourceTarget(Configuration, Path, Iterable<T>, int) - Method in class org.apache.crunch.types.writable.WritableType
-
- createTempPath() - Method in class org.apache.crunch.impl.dist.DistributedPipeline
-
- createUnionCollection(List<? extends PCollectionImpl<S>>) - Method in interface org.apache.crunch.impl.dist.collect.PCollectionFactory
-
- createUnionCollection(List<? extends PCollectionImpl<S>>) - Method in class org.apache.crunch.impl.spark.collect.SparkCollectFactory
-
- createUnionTable(List<PTableBase<K, V>>) - Method in interface org.apache.crunch.impl.dist.collect.PCollectionFactory
-
- createUnionTable(List<PTableBase<K, V>>) - Method in class org.apache.crunch.impl.spark.collect.SparkCollectFactory
-
- cross(PTable<K1, U>, PTable<K2, V>) - Static method in class org.apache.crunch.lib.Cartesian
-
Performs a full cross join on the specified PTables (using the same
strategy as Pig's CROSS operator).
- cross(PTable<K1, U>, PTable<K2, V>, int) - Static method in class org.apache.crunch.lib.Cartesian
-
Performs a full cross join on the specified PTables (using the same
strategy as Pig's CROSS operator).
- cross(PCollection<U>, PCollection<V>) - Static method in class org.apache.crunch.lib.Cartesian
-
Performs a full cross join on the specified PCollections (using the
same strategy as Pig's CROSS operator).
- cross(PCollection<U>, PCollection<V>, int) - Static method in class org.apache.crunch.lib.Cartesian
-
Performs a full cross join on the specified PCollections (using the
same strategy as Pig's CROSS operator).
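For readers unfamiliar with the term, a full cross join pairs every element of one input with every element of the other. The plain-Java sketch below shows only that result shape over in-memory lists; it makes no attempt to reproduce the distributed strategy mentioned above, and the names CrossSketch and cross are hypothetical rather than part of Crunch.

```java
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory Cartesian product; not the Crunch implementation.
class CrossSketch {
    // Pairs every element of left with every element of right, left-major order.
    static <U, V> List<Map.Entry<U, V>> cross(List<U> left, List<V> right) {
        List<Map.Entry<U, V>> pairs = new ArrayList<>(left.size() * right.size());
        for (U u : left) {
            for (V v : right) {
                pairs.add(new SimpleImmutableEntry<>(u, v));
            }
        }
        return pairs;
    }

    public static void main(String[] args) {
        // Two elements on each side yield four pairs.
        System.out.println(cross(Arrays.asList(1, 2), Arrays.asList("x", "y")));
        // [1=x, 1=y, 2=x, 2=y]
    }
}
```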
- CRUNCH_DISABLE_OUTPUT_COUNTERS - Static variable in class org.apache.crunch.io.CrunchOutputs
-
- CRUNCH_FILTER_NAME - Static variable in class org.apache.crunch.contrib.bloomfilter.BloomFilterFn
-
- CRUNCH_FILTER_SIZE - Static variable in class org.apache.crunch.contrib.bloomfilter.BloomFilterFn
-
- CRUNCH_INPUTS - Static variable in class org.apache.crunch.io.CrunchInputs
-
- CRUNCH_OUTPUTS - Static variable in class org.apache.crunch.io.CrunchOutputs
-
- CrunchInputs - Class in org.apache.crunch.io
-
Helper functions for configuring multiple InputFormat instances within a single
Crunch MapReduce job.
- CrunchInputs() - Constructor for class org.apache.crunch.io.CrunchInputs
-
- CrunchIterable<S,T> - Class in org.apache.crunch.impl.spark.fn
-
- CrunchIterable(DoFn<S, T>, Iterator<S>) - Constructor for class org.apache.crunch.impl.spark.fn.CrunchIterable
-
- CrunchOutputs<K,V> - Class in org.apache.crunch.io
-
An analogue of CrunchInputs for handling multiple OutputFormat instances
writing to multiple files within a single MapReduce job.
- CrunchOutputs(TaskInputOutputContext<?, ?, K, V>) - Constructor for class org.apache.crunch.io.CrunchOutputs
-
Creates and initializes multiple outputs support; it should be instantiated
in the Mapper/Reducer setup method.
- CrunchOutputs(Configuration) - Constructor for class org.apache.crunch.io.CrunchOutputs
-
- CrunchOutputs.OutputConfig<K,V> - Class in org.apache.crunch.io
-
- CrunchOutputs.OutputConfig(FormatBundle<OutputFormat<K, V>>, Class<K>, Class<V>) - Constructor for class org.apache.crunch.io.CrunchOutputs.OutputConfig
-
- CrunchPairTuple2<K,V> - Class in org.apache.crunch.impl.spark.fn
-
- CrunchPairTuple2() - Constructor for class org.apache.crunch.impl.spark.fn.CrunchPairTuple2
-
- CrunchRuntimeException - Exception in org.apache.crunch
-
A RuntimeException implementation that includes some additional options
for the Crunch execution engine to track reporting status.
- CrunchRuntimeException(String) - Constructor for exception org.apache.crunch.CrunchRuntimeException
-
- CrunchRuntimeException(Exception) - Constructor for exception org.apache.crunch.CrunchRuntimeException
-
- CrunchRuntimeException(String, Exception) - Constructor for exception org.apache.crunch.CrunchRuntimeException
-
- CrunchTestSupport - Class in org.apache.crunch.test
-
A temporary workaround for Scala tests to use when working with Rule
annotations until it gets fixed in JUnit 4.11.
- CrunchTestSupport() - Constructor for class org.apache.crunch.test.CrunchTestSupport
-
- CrunchTool - Class in org.apache.crunch.util
-
An extension of the Tool interface that creates a Pipeline
instance and provides methods for working with the Pipeline from inside of
the Tool's run method.
- CrunchTool() - Constructor for class org.apache.crunch.util.CrunchTool
-
- CrunchTool(boolean) - Constructor for class org.apache.crunch.util.CrunchTool
-
- DataBaseSource<T extends org.apache.hadoop.mapreduce.lib.db.DBWritable & org.apache.hadoop.io.Writable> - Class in org.apache.crunch.contrib.io.jdbc
-
Source for reading from a database via a JDBC connection.
- DataBaseSource.Builder<T extends org.apache.hadoop.mapreduce.lib.db.DBWritable & org.apache.hadoop.io.Writable> - Class in org.apache.crunch.contrib.io.jdbc
-
- DataBaseSource.Builder(Class<T>) - Constructor for class org.apache.crunch.contrib.io.jdbc.DataBaseSource.Builder
-
- DebugLogging - Class in org.apache.crunch.test
-
Allows direct manipulation of the Hadoop log4j settings to aid in
unit testing.
- DeepCopier<T> - Interface in org.apache.crunch.types
-
Performs deep copies of values.
- deepCopy(Object) - Method in class org.apache.crunch.types.avro.AvroDerivedValueDeepCopier
-
- deepCopy(Collection<T>) - Method in class org.apache.crunch.types.CollectionDeepCopier
-
- deepCopy(T) - Method in interface org.apache.crunch.types.DeepCopier
-
Create a deep copy of a value.
- deepCopy(Map<String, T>) - Method in class org.apache.crunch.types.MapDeepCopier
-
- deepCopy(T) - Method in class org.apache.crunch.types.NoOpDeepCopier
-
- deepCopy(T) - Method in class org.apache.crunch.types.TupleDeepCopier
-
- deepCopy(Union) - Method in class org.apache.crunch.types.UnionDeepCopier
-
- deepCopy(T) - Method in class org.apache.crunch.types.writable.WritableDeepCopier
-
- DEFAULT - Static variable in class org.apache.crunch.CachingOptions
-
An instance of CachingOptions with the default caching settings.
- DEFAULT_BYTES_PER_REDUCE_TASK - Static variable in class org.apache.crunch.util.PartitionUtils
-
- DEFAULT_MAX_REDUCERS - Static variable in class org.apache.crunch.util.PartitionUtils
-
- DEFAULT_PATH - Static variable in class org.apache.crunch.lib.sort.TotalOrderPartitioner
-
- DefaultJoinStrategy<K,U,V> - Class in org.apache.crunch.lib.join
-
Default join strategy that simply sends all data through the map, shuffle, and reduce phase.
- DefaultJoinStrategy() - Constructor for class org.apache.crunch.lib.join.DefaultJoinStrategy
-
- DefaultJoinStrategy(int) - Constructor for class org.apache.crunch.lib.join.DefaultJoinStrategy
-
- DelegatingReadableData<S,T> - Class in org.apache.crunch.util
-
Implements the ReadableData<T> interface by delegating to a ReadableData<S> instance
and passing its contents through a DoFn<S, T>.
- DelegatingReadableData(ReadableData<S>, DoFn<S, T>) - Constructor for class org.apache.crunch.util.DelegatingReadableData
-
- delete() - Method in class org.apache.crunch.test.TemporaryPath
-
- delimiter(String) - Method in class org.apache.crunch.contrib.text.TokenizerFactory.Builder
-
Sets the delimiter used by the TokenizerFactory instances constructed by
this instance.
- dependsOn(String, Target) - Method in class org.apache.crunch.PipelineCallable
-
Requires that the given Target exists before this instance may be
executed.
- dependsOn(String, PCollection<?>) - Method in class org.apache.crunch.PipelineCallable
-
Requires that the given PCollection be materialized to disk before this instance may be
executed.
- derived(PType<V1>, PType<V2>, PType<V3>) - Static method in class org.apache.crunch.Tuple3.Collect
-
- derived(PType<V1>, PType<V2>, PType<V3>, PType<V4>) - Static method in class org.apache.crunch.Tuple4.Collect
-
- derived(Class<T>, MapFn<S, T>, MapFn<T, S>, PType<S>) - Static method in class org.apache.crunch.types.avro.Avros
-
- derived(Class<T>, MapFn<S, T>, MapFn<T, S>, PType<S>) - Method in class org.apache.crunch.types.avro.AvroTypeFamily
-
- derived(Class<T>, MapFn<S, T>, MapFn<T, S>, PType<S>) - Method in interface org.apache.crunch.types.PTypeFamily
-
- derived(Class<T>, MapFn<S, T>, MapFn<T, S>, PType<S>) - Static method in class org.apache.crunch.types.writable.Writables
-
- derived(Class<T>, MapFn<S, T>, MapFn<T, S>, PType<S>) - Method in class org.apache.crunch.types.writable.WritableTypeFamily
-
- derivedImmutable(Class<T>, MapFn<S, T>, MapFn<T, S>, PType<S>) - Static method in class org.apache.crunch.types.avro.Avros
-
- derivedImmutable(Class<T>, MapFn<S, T>, MapFn<T, S>, PType<S>) - Method in class org.apache.crunch.types.avro.AvroTypeFamily
-
- derivedImmutable(Class<T>, MapFn<S, T>, MapFn<T, S>, PType<S>) - Method in interface org.apache.crunch.types.PTypeFamily
-
A derived type whose values are immutable.
- derivedImmutable(Class<T>, MapFn<S, T>, MapFn<T, S>, PType<S>) - Static method in class org.apache.crunch.types.writable.Writables
-
- derivedImmutable(Class<T>, MapFn<S, T>, MapFn<T, S>, PType<S>) - Method in class org.apache.crunch.types.writable.WritableTypeFamily
-
- deserialized(boolean) - Method in class org.apache.crunch.CachingOptions.Builder
-
- deserialized() - Method in class org.apache.crunch.CachingOptions
-
Whether the data should remain deserialized in the cache, which trades off CPU processing time
for additional storage overhead.
- detach(DoFn<Pair<K, Iterable<V>>, T>, PType<V>) - Static method in class org.apache.crunch.lib.DoFns
-
"Reduce" DoFn wrapper which detaches the values in the iterable, preventing the unexpected behaviour related to
object reuse often observed when using Avro.
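The object-reuse problem that detach guards against can be reproduced without Avro or Crunch. In the plain-Java sketch below, an iterator hands back the same mutable object on every call, so holding on to the references loses all but the last value, while copying each value first preserves them. Every name here (ObjectReuseSketch, Holder, reusingIterator) is hypothetical and stands in for whatever record type and iterable a real job would see.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

// Hypothetical demonstration of the object-reuse pitfall; not Crunch code.
class ObjectReuseSketch {
    // Mutable holder standing in for a record object that a framework reuses.
    static final class Holder {
        int value;
    }

    // Returns an iterator that mutates and returns the SAME Holder on every call.
    static Iterator<Holder> reusingIterator(int[] values) {
        Holder shared = new Holder();
        return new Iterator<Holder>() {
            private int i = 0;

            @Override
            public boolean hasNext() {
                return i < values.length;
            }

            @Override
            public Holder next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                shared.value = values[i++];
                return shared;
            }
        };
    }

    // Keeps references to the reused object, so every entry ends up as the last value.
    static List<Integer> collectWithoutCopy(int[] values) {
        List<Holder> held = new ArrayList<>();
        Iterator<Holder> it = reusingIterator(values);
        while (it.hasNext()) {
            held.add(it.next());
        }
        List<Integer> out = new ArrayList<>();
        for (Holder h : held) {
            out.add(h.value);
        }
        return out;
    }

    // Copies each value into a fresh Holder before keeping it ("detaching" it).
    static List<Integer> collectWithCopy(int[] values) {
        List<Holder> held = new ArrayList<>();
        Iterator<Holder> it = reusingIterator(values);
        while (it.hasNext()) {
            Holder copy = new Holder();
            copy.value = it.next().value;
            held.add(copy);
        }
        List<Integer> out = new ArrayList<>();
        for (Holder h : held) {
            out.add(h.value);
        }
        return out;
    }

    public static void main(String[] args) {
        int[] values = {1, 2, 3};
        System.out.println(collectWithoutCopy(values)); // [3, 3, 3]
        System.out.println(collectWithCopy(values));    // [1, 2, 3]
    }
}
```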
- difference(PCollection<T>, PCollection<T>) - Static method in class org.apache.crunch.lib.Set
-
Compute the set difference between two sets of elements.
- disableDeepCopy() - Method in class org.apache.crunch.DoFn
-
By default, Crunch will do a defensive deep copy of the outputs of a
DoFn when there are multiple downstream consumers of that item, in order to
prevent the downstream functions from making concurrent modifications to
data objects.
- DistCache - Class in org.apache.crunch.util
-
Provides functions for working with Hadoop's distributed cache.
- DistCache() - Constructor for class org.apache.crunch.util.DistCache
-
- Distinct - Class in org.apache.crunch.lib
-
Functions for computing the distinct elements of a PCollection.
- distinct(PCollection<S>) - Static method in class org.apache.crunch.lib.Distinct
-
Construct a new PCollection that contains the unique elements of a
given input PCollection.
- distinct(PTable<K, V>) - Static method in class org.apache.crunch.lib.Distinct
-
A PTable<K, V> analogue of the distinct function.
- distinct(PCollection<S>, int) - Static method in class org.apache.crunch.lib.Distinct
-
A distinct operation that gives the client more control over how frequently
elements are flushed to disk in order to allow control over performance or
memory consumption.
- distinct(PTable<K, V>, int) - Static method in class org.apache.crunch.lib.Distinct
-
A PTable<K, V> analogue of the distinct function.
- distributed(PTable<K, V>, double, double...) - Static method in class org.apache.crunch.lib.Quantiles
-
Calculate a set of quantiles for each key in a numerically-valued table.
- DistributedPipeline - Class in org.apache.crunch.impl.dist
-
- DistributedPipeline(String, Configuration, PCollectionFactory) - Constructor for class org.apache.crunch.impl.dist.DistributedPipeline
-
Instantiate with a custom name and configuration.
- DoCollection<S> - Class in org.apache.crunch.impl.spark.collect
-
- DoFn<S,T> - Class in org.apache.crunch
-
Base class for all data processing functions in Crunch.
- DoFn() - Constructor for class org.apache.crunch.DoFn
-
- DoFnIterator<S,T> - Class in org.apache.crunch.util
-
An Iterator<T> that combines a delegate Iterator<S> and a DoFn<S, T>, generating
data by passing the contents of the iterator through the function.
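The idea of an iterator that feeds a delegate iterator through a function which may emit zero or more outputs per input can be sketched in plain Java as follows. This is not the Crunch class: FnIteratorSketch is a hypothetical name, and a BiConsumer taking an input and an emit callback stands in for a DoFn and its Emitter.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.Queue;
import java.util.function.BiConsumer;
import java.util.function.Consumer;

// Hypothetical sketch: wraps an Iterator<S> and a function that emits T values.
class FnIteratorSketch<S, T> implements Iterator<T> {
    private final Iterator<S> delegate;
    // Stand-in for a DoFn: receives one input and a callback used to emit outputs.
    private final BiConsumer<S, Consumer<T>> fn;
    // Outputs emitted but not yet returned (ArrayDeque does not accept nulls).
    private final Queue<T> buffer = new ArrayDeque<>();

    FnIteratorSketch(Iterator<S> delegate, BiConsumer<S, Consumer<T>> fn) {
        this.delegate = delegate;
        this.fn = fn;
    }

    @Override
    public boolean hasNext() {
        // Keep consuming inputs until something is emitted or the input runs out.
        while (buffer.isEmpty() && delegate.hasNext()) {
            fn.accept(delegate.next(), buffer::add);
        }
        return !buffer.isEmpty();
    }

    @Override
    public T next() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        return buffer.remove();
    }

    // Usage example: split each line into words, emitting nothing for blank lines.
    static List<String> splitAll(List<String> lines) {
        Iterator<String> words = new FnIteratorSketch<String, String>(
            lines.iterator(),
            (line, emit) -> {
                for (String word : line.split(" ")) {
                    if (!word.isEmpty()) {
                        emit.accept(word);
                    }
                }
            });
        List<String> out = new ArrayList<>();
        while (words.hasNext()) {
            out.add(words.next());
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(splitAll(Arrays.asList("hello world", "", "crunch")));
        // [hello, world, crunch]
    }
}
```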
- DoFnIterator(Iterator<S>, DoFn<S, T>) - Constructor for class org.apache.crunch.util.DoFnIterator
-
- DoFns - Class in org.apache.crunch.lib
-
- DoFns() - Constructor for class org.apache.crunch.lib.DoFns
-
- done() - Method in class org.apache.crunch.impl.dist.DistributedPipeline
-
- done() - Method in class org.apache.crunch.impl.mem.MemPipeline
-
- done() - Method in class org.apache.crunch.impl.spark.SparkPipeline
-
- done() - Method in interface org.apache.crunch.Pipeline
-
Run any remaining jobs required to generate outputs and then clean up any
intermediate data files that were created in this run or previous calls to
run.
- DONE - Static variable in class org.apache.crunch.PipelineResult
-
- done() - Method in class org.apache.crunch.util.CrunchTool
-
- DoTable<K,V> - Class in org.apache.crunch.impl.spark.collect
-
- doubles() - Static method in class org.apache.crunch.types.avro.Avros
-
- doubles() - Method in class org.apache.crunch.types.avro.AvroTypeFamily
-
- doubles() - Method in interface org.apache.crunch.types.PTypeFamily
-
- doubles() - Static method in class org.apache.crunch.types.writable.Writables
-
- doubles() - Method in class org.apache.crunch.types.writable.WritableTypeFamily
-
- drop(Integer...) - Method in class org.apache.crunch.contrib.text.TokenizerFactory.Builder
-
Drop the specified fields found by the input scanner, counting from zero.
- FileNamingScheme - Interface in org.apache.crunch.io
-
Encapsulates rules for naming output files.
- FileReaderFactory<T> - Interface in org.apache.crunch.io
-
- filter(FilterFn<S>) - Method in class org.apache.crunch.impl.dist.collect.PCollectionImpl
-
- filter(String, FilterFn<S>) - Method in class org.apache.crunch.impl.dist.collect.PCollectionImpl
-
- filter(FilterFn<Pair<K, V>>) - Method in class org.apache.crunch.impl.dist.collect.PTableBase
-
- filter(String, FilterFn<Pair<K, V>>) - Method in class org.apache.crunch.impl.dist.collect.PTableBase
-
- filter(FilterFn<S>) - Method in interface org.apache.crunch.PCollection
-
Apply the given filter function to this instance and return the resulting
PCollection.
- filter(String, FilterFn<S>) - Method in interface org.apache.crunch.PCollection
-
Apply the given filter function to this instance and return the resulting
PCollection.
- filter(FilterFn<Pair<K, V>>) - Method in interface org.apache.crunch.PTable
-
Apply the given filter function to this instance and return the resulting
PTable.
- filter(String, FilterFn<Pair<K, V>>) - Method in interface org.apache.crunch.PTable
-
Apply the given filter function to this instance and return the resulting
PTable.
- FilterFn<T> - Class in org.apache.crunch
-
A DoFn for the common case of filtering the members of a
PCollection based on a boolean condition.
- FilterFn() - Constructor for class org.apache.crunch.FilterFn
-
- FilterFns - Class in org.apache.crunch.fn
-
A collection of pre-defined FilterFn implementations.
- findContainingJar(Class<?>) - Static method in class org.apache.crunch.util.DistCache
-
Finds the path to a jar that contains the class provided, if any.
- findCounter(Enum<?>) - Method in class org.apache.crunch.PipelineResult.StageResult
-
- first() - Method in class org.apache.crunch.impl.dist.collect.PCollectionImpl
-
- first() - Method in class org.apache.crunch.Pair
-
- first() - Method in interface org.apache.crunch.PCollection
-
- first() - Method in class org.apache.crunch.Tuple3
-
- first() - Method in class org.apache.crunch.Tuple4
-
- FIRST_N(int) - Static method in class org.apache.crunch.fn.Aggregators
-
Return the first n values (or fewer if there are fewer values than n).
- FlatMapIndexFn<S,T> - Class in org.apache.crunch.impl.spark.fn
-
- FlatMapIndexFn(DoFn<S, T>, boolean, SparkRuntimeContext) - Constructor for class org.apache.crunch.impl.spark.fn.FlatMapIndexFn
-
- FlatMapPairDoFn<K,V,T> - Class in org.apache.crunch.impl.spark.fn
-
- FlatMapPairDoFn(DoFn<Pair<K, V>, T>, SparkRuntimeContext) - Constructor for class org.apache.crunch.impl.spark.fn.FlatMapPairDoFn
-
- floats() - Static method in class org.apache.crunch.types.avro.Avros
-
- floats() - Method in class org.apache.crunch.types.avro.AvroTypeFamily
-
- floats() - Method in interface org.apache.crunch.types.PTypeFamily
-
- floats() - Static method in class org.apache.crunch.types.writable.Writables
-
- floats() - Method in class org.apache.crunch.types.writable.WritableTypeFamily
-
- flush() - Method in interface org.apache.crunch.Emitter
-
Flushes any values cached by this emitter.
- forAvroSchema(Schema) - Static method in class org.apache.crunch.impl.spark.ByteArrayHelper
-
- forInput(Class<T>) - Static method in class org.apache.crunch.io.FormatBundle
-
- FormatBundle<K> - Class in org.apache.crunch.io
-
A combination of an InputFormat or OutputFormat and any extra
configuration information that format class needs to run.
- FormatBundle() - Constructor for class org.apache.crunch.io.FormatBundle
-
- formattedFile(String, Class<? extends FileInputFormat<K, V>>, Class<K>, Class<V>) - Static method in class org.apache.crunch.io.From
-
Creates a TableSource<K, V> for reading data from files that have custom
FileInputFormat<K, V> implementations not covered by the provided TableSource
and Source factory methods.
- formattedFile(Path, Class<? extends FileInputFormat<K, V>>, Class<K>, Class<V>) - Static method in class org.apache.crunch.io.From
-
Creates a TableSource<K, V> for reading data from files that have custom
FileInputFormat<K, V> implementations not covered by the provided TableSource
and Source factory methods.
- formattedFile(List<Path>, Class<? extends FileInputFormat<K, V>>, Class<K>, Class<V>) - Static method in class org.apache.crunch.io.From
-
Creates a TableSource<K, V> for reading data from files that have custom
FileInputFormat<K, V> implementations not covered by the provided TableSource
and Source factory methods.
- formattedFile(String, Class<? extends FileInputFormat<?, ?>>, PType<K>, PType<V>) - Static method in class org.apache.crunch.io.From
-
Creates a TableSource<K, V> for reading data from files that have custom
FileInputFormat implementations not covered by the provided TableSource
and Source factory methods.
- formattedFile(Path, Class<? extends FileInputFormat<?, ?>>, PType<K>, PType<V>) - Static method in class org.apache.crunch.io.From
-
Creates a TableSource<K, V> for reading data from files that have custom
FileInputFormat implementations not covered by the provided TableSource
and Source factory methods.
- formattedFile(List<Path>, Class<? extends FileInputFormat<?, ?>>, PType<K>, PType<V>) - Static method in class org.apache.crunch.io.From
-
Creates a TableSource<K, V> for reading data from files that have custom
FileInputFormat implementations not covered by the provided TableSource
and Source factory methods.
- formattedFile(String, Class<? extends FileOutputFormat<K, V>>) - Static method in class org.apache.crunch.io.To
-
Creates a Target at the given path name that writes data to
a custom FileOutputFormat.
- formattedFile(Path, Class<? extends FileOutputFormat<K, V>>) - Static method in class org.apache.crunch.io.To
-
Creates a Target at the given Path that writes data to
a custom FileOutputFormat.
- forOutput(Class<T>) - Static method in class org.apache.crunch.io.FormatBundle
-
- fourth() - Method in class org.apache.crunch.Tuple4
-
- From - Class in org.apache.crunch.io
-
Static factory methods for creating common Source types.
- From() - Constructor for class org.apache.crunch.io.From
-
- fromBytes(byte[]) - Method in class org.apache.crunch.impl.spark.serde.AvroSerDe
-
- fromBytes(byte[]) - Method in interface org.apache.crunch.impl.spark.serde.SerDe
-
- fromBytes(byte[]) - Method in class org.apache.crunch.impl.spark.serde.WritableSerDe
-
- fromBytesFunction() - Method in class org.apache.crunch.impl.spark.serde.AvroSerDe
-
- fromBytesFunction() - Method in interface org.apache.crunch.impl.spark.serde.SerDe
-
- fromBytesFunction() - Method in class org.apache.crunch.impl.spark.serde.WritableSerDe
-
- fromConfiguration(Configuration) - Static method in class org.apache.crunch.types.avro.AvroMode
-
- fromSerialized(String, Configuration) - Static method in class org.apache.crunch.io.FormatBundle
-
- fromShuffleConfiguration(Configuration) - Static method in class org.apache.crunch.types.avro.AvroMode
-
- fromType(AvroType<?>) - Static method in class org.apache.crunch.types.avro.AvroMode
-
Creates an AvroMode based upon the specified type.
- fullJoin(PTable<K, U>, PTable<K, V>) - Static method in class org.apache.crunch.lib.Join
-
Performs a full outer join on the specified PTables.
- FullOuterJoinFn<K,U,V> - Class in org.apache.crunch.lib.join
-
Used to perform the last step of a full outer join.
- FullOuterJoinFn(PType<K>, PType<U>) - Constructor for class org.apache.crunch.lib.join.FullOuterJoinFn
-
- generateKeys(S) - Method in class org.apache.crunch.contrib.bloomfilter.BloomFilterFn
-
- generateOutput(Pipeline) - Method in class org.apache.crunch.PipelineCallable
-
Called by the Pipeline when this instance is registered with Pipeline#sequentialDo.
- GENERIC - Static variable in class org.apache.crunch.types.avro.AvroMode
-
Default mode to use for reading and writing Generic types.
- generics(Schema) - Static method in class org.apache.crunch.types.avro.Avros
-
- generics(Schema) - Method in class org.apache.crunch.types.avro.AvroTypeFamily
-
- get() - Method in class org.apache.crunch.impl.spark.SparkRuntime
-
- get(long, TimeUnit) - Method in class org.apache.crunch.impl.spark.SparkRuntime
-
- get(int) - Method in class org.apache.crunch.Pair
-
- get(int) - Method in interface org.apache.crunch.Tuple
-
Returns the Object at the given index.
- get(int) - Method in class org.apache.crunch.Tuple3
-
- get(int) - Method in class org.apache.crunch.Tuple4
-
- get(int) - Method in class org.apache.crunch.TupleN
-
- get(int) - Method in class org.apache.crunch.types.writable.TupleWritable
-
Get the ith Writable from the Tuple.
- getAllPCollections() - Method in class org.apache.crunch.PipelineCallable
-
Returns the mapping of labels to PCollection dependencies for this instance.
- getAllStructFieldRefs() - Method in class org.apache.crunch.types.orc.TupleObjectInspector
-
- getAllTargets() - Method in class org.apache.crunch.PipelineCallable
-
Returns the mapping of labels to Target dependencies for this instance.
- getByFn() - Method in class org.apache.crunch.lib.sort.SortFns.KeyExtraction
-
- getCategory() - Method in class org.apache.crunch.types.orc.TupleObjectInspector
-
- getCombineFn() - Method in class org.apache.crunch.impl.spark.SparkRuntime
-
- getConf() - Method in class org.apache.crunch.io.FormatBundle
-
- getConf() - Method in class org.apache.crunch.lib.sort.TotalOrderPartitioner
-
- getConf() - Method in class org.apache.crunch.lib.sort.TupleWritableComparator
-
- getConf() - Method in class org.apache.crunch.types.writable.TupleWritable.Comparator
-
- getConf() - Method in class org.apache.crunch.util.CrunchTool
-
- getConfiguration() - Method in class org.apache.crunch.impl.dist.DistributedPipeline
-
- getConfiguration() - Method in class org.apache.crunch.impl.mem.MemPipeline
-
- getConfiguration() - Method in class org.apache.crunch.impl.spark.SparkRuntime
-
- getConfiguration() - Method in class org.apache.crunch.impl.spark.SparkRuntimeContext
-
- getConfiguration() - Method in interface org.apache.crunch.Pipeline
-
Returns the Configuration instance associated with this pipeline.
- getConverter() - Method in interface org.apache.crunch.Source
-
Returns the Converter used for mapping the inputs from this instance
into PCollection or PTable values.
- getConverter(PType<?>) - Method in interface org.apache.crunch.Target
-
Returns the Converter to use for mapping from the output PCollection
into the output values expected by this instance.
- getConverter() - Method in class org.apache.crunch.types.avro.AvroType
-
- getConverter() - Method in class org.apache.crunch.types.PGroupedTableType
-
- getConverter() - Method in interface org.apache.crunch.types.PType
-
- getConverter() - Method in class org.apache.crunch.types.writable.WritableType
-
- getCounter(Enum<?>) - Static method in class org.apache.crunch.test.TestCounters
-
- getCounter(String, String) - Static method in class org.apache.crunch.test.TestCounters
-
- getCounter() - Method in class org.apache.hadoop.mapred.SparkCounter
-
- getCounterDisplayName(String, String) - Method in class org.apache.crunch.PipelineResult.StageResult
-
- getCounterDisplayName(Enum<?>) - Method in class org.apache.crunch.PipelineResult.StageResult
-
- getCounterNames() - Method in class org.apache.crunch.PipelineResult.StageResult
-
- getCounters() - Static method in class org.apache.crunch.impl.mem.MemPipeline
-
- getCounters() - Method in class org.apache.crunch.PipelineResult.StageResult
-
- getCounterValue(String, String) - Method in class org.apache.crunch.PipelineResult.StageResult
-
- getCounterValue(Enum<?>) - Method in class org.apache.crunch.PipelineResult.StageResult
-
- getData() - Method in class org.apache.crunch.types.avro.AvroMode
-
Returns a GenericData instance based on the mode type.
- getData() - Method in interface org.apache.crunch.types.avro.ReaderWriterFactory
-
- getData() - Method in class org.apache.crunch.types.avro.ReflectDataFactory
-
- getDataFileWriter(Path, Configuration) - Static method in class org.apache.crunch.types.avro.AvroOutputFormat
-
- getDefaultConfiguration() - Method in class org.apache.crunch.test.TemporaryPath
-
- getDefaultFileSource(Path) - Method in class org.apache.crunch.types.avro.AvroType
-
- getDefaultFileSource(Path) - Method in class org.apache.crunch.types.PGroupedTableType
-
- getDefaultFileSource(Path) - Method in interface org.apache.crunch.types.PType
-
Returns a SourceTarget that is able to read/write data using the serialization format
specified by this PType.
- getDefaultFileSource(Path) - Method in class org.apache.crunch.types.writable.WritableType
-
- getDefaultInstance() - Static method in class org.apache.crunch.contrib.text.TokenizerFactory
-
Returns a default TokenizerFactory that uses whitespace as a delimiter and does
not skip any input fields.
- getDefaultInstance(Class<M>) - Static method in class org.apache.crunch.types.Protos
-
Utility function for creating a default PB Message from a Class object that
works with both protoc 2.3.0 and 2.4.x.
- getDefaultValue() - Method in class org.apache.crunch.contrib.text.AbstractSimpleExtractor
-
- getDefaultValue() - Method in interface org.apache.crunch.contrib.text.Extractor
-
Returns the default value for this Extractor in case of an
error.
- getDependentJobs() - Method in interface org.apache.crunch.impl.mr.MRJob
-
- getDepth() - Method in class org.apache.crunch.impl.dist.collect.PCollectionImpl
-
- getDetachedValue(PTableType<K, V>, Pair<K, V>) - Static method in class org.apache.crunch.lib.PTables
-
Create a detached value for a table Pair.
- getDetachedValue(T) - Method in class org.apache.crunch.types.avro.AvroType
-
- getDetachedValue(T) - Method in interface org.apache.crunch.types.PType
-
Returns a copy of a value (or the value itself) that can safely be retained.
- getDetachedValue(T) - Method in class org.apache.crunch.types.writable.WritableType
-
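Why a detached copy can matter is easiest to see outside Crunch. The following is a minimal plain-Java sketch, not Crunch code: the Holder class and the reusing iterator are hypothetical stand-ins, and the sketch assumes a runtime that recycles one mutable record object across calls, so that a retained reference silently changes later.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

class DetachedValueSketch {
    /** Hypothetical mutable record, standing in for an object a runtime reuses. */
    static final class Holder {
        int value;

        Holder copy() {
            Holder h = new Holder();
            h.value = value;
            return h;
        }
    }

    /** Walks the inputs but hands back the same Holder instance on every call. */
    static Iterator<Holder> reusingIterator(final int[] inputs) {
        final Holder shared = new Holder();
        return new Iterator<Holder>() {
            private int i = 0;

            @Override
            public boolean hasNext() {
                return i < inputs.length;
            }

            @Override
            public Holder next() {
                shared.value = inputs[i++];
                return shared;
            }
        };
    }

    /** Retains every value seen, optionally copying ("detaching") it first. */
    static List<Integer> retain(int[] inputs, boolean detach) {
        List<Holder> kept = new ArrayList<>();
        Iterator<Holder> it = reusingIterator(inputs);
        while (it.hasNext()) {
            Holder h = it.next();
            kept.add(detach ? h.copy() : h);
        }
        List<Integer> out = new ArrayList<>();
        for (Holder h : kept) {
            out.add(h.value);
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println("without copy: " + retain(new int[] {1, 2, 3}, false));
        System.out.println("with copy:    " + retain(new int[] {1, 2, 3}, true));
    }
}
```

Without the copy, every retained reference points at the one shared object and so reports the last value written; with the copy, each retained value keeps its own state.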
- getDisplayName() - Method in class org.apache.hadoop.mapred.SparkCounter
-
- getEndTimeMsec() - Method in class org.apache.crunch.PipelineResult.StageResult
-
- getErrorCount() - Method in class org.apache.crunch.contrib.text.ExtractorStats
-
The overall number of records that had some kind of parsing error.
- getFactory() - Method in class org.apache.crunch.impl.dist.DistributedPipeline
-
- getFactory() - Method in class org.apache.crunch.types.avro.AvroMode
-
Returns the factory that will be used for the mode.
- getFamily() - Method in class org.apache.crunch.types.avro.AvroType
-
- getFamily() - Method in class org.apache.crunch.types.PGroupedTableType
-
- getFamily() - Method in interface org.apache.crunch.types.PType
-
Returns the PTypeFamily that this PType belongs to.
- getFamily() - Method in class org.apache.crunch.types.writable.WritableType
-
- getFieldErrors() - Method in class org.apache.crunch.contrib.text.ExtractorStats
-
Returns the number of errors that occurred when parsing the individual fields of
a composite record type, like a Pair or TupleN.
- getFile(String) - Method in class org.apache.crunch.test.TemporaryPath
-
Get a File below the temporary directory.
- getFileName(String) - Method in class org.apache.crunch.test.TemporaryPath
-
Get an absolute file name below the temporary directory.
- getFileNamingScheme() - Method in interface org.apache.crunch.io.PathTarget
-
Get the naming scheme to be used for outputs being written to an output
path.
- getFirst() - Method in class org.apache.crunch.fn.CompositeMapFn
-
- getFormatClass() - Method in class org.apache.crunch.io.FormatBundle
-
- getFormatNodeMap(JobContext) - Static method in class org.apache.crunch.io.CrunchInputs
-
- getGroupedDetachedValue(PGroupedTableType<K, V>, Pair<K, Iterable<V>>) - Static method in class org.apache.crunch.lib.PTables
-
- getGroupedTableType() - Method in class org.apache.crunch.impl.dist.collect.BaseGroupedTable
-
- getGroupedTableType() - Method in interface org.apache.crunch.PGroupedTable
-
Return the PGroupedTableType containing serialization information for
this PGroupedTable.
- getGroupedTableType() - Method in interface org.apache.crunch.types.PTableType
-
Returns the grouped table version of this type.
- getGroupingComparator(PTypeFamily) - Static method in class org.apache.crunch.lib.join.JoinUtils
-
- getGroupingComparatorClass() - Method in class org.apache.crunch.GroupingOptions
-
- getGroupingConverter() - Method in class org.apache.crunch.types.PGroupedTableType
-
- getIndex() - Method in class org.apache.crunch.types.writable.UnionWritable
-
- getIndex() - Method in class org.apache.crunch.Union
-
Returns the index of the original data source for this union type.
- getInputMapFn() - Method in class org.apache.crunch.types.avro.AvroType
-
- getInputMapFn() - Method in interface org.apache.crunch.types.PType
-
- getInputMapFn() - Method in class org.apache.crunch.types.writable.WritableType
-
- getInstance() - Static method in class org.apache.crunch.fn.IdentityFn
-
- getInstance() - Static method in class org.apache.crunch.impl.mem.MemPipeline
-
- getInstance() - Static method in class org.apache.crunch.io.SequentialFileNamingScheme
-
- getInstance() - Static method in class org.apache.crunch.types.avro.AvroTypeFamily
-
- getInstance() - Static method in class org.apache.crunch.types.writable.TupleWritable.Comparator
-
- getInstance() - Static method in class org.apache.crunch.types.writable.WritableTypeFamily
-
- getJavaRDDLike(SparkRuntime) - Method in class org.apache.crunch.impl.spark.collect.CreatedCollection
-
- getJavaRDDLike(SparkRuntime) - Method in class org.apache.crunch.impl.spark.collect.CreatedTable
-
- getJavaRDDLike(SparkRuntime) - Method in class org.apache.crunch.impl.spark.collect.DoCollection
-
- getJavaRDDLike(SparkRuntime) - Method in class org.apache.crunch.impl.spark.collect.DoTable
-
- getJavaRDDLike(SparkRuntime) - Method in class org.apache.crunch.impl.spark.collect.EmptyPCollection
-
- getJavaRDDLike(SparkRuntime) - Method in class org.apache.crunch.impl.spark.collect.EmptyPTable
-
- getJavaRDDLike(SparkRuntime) - Method in class org.apache.crunch.impl.spark.collect.InputCollection
-
- getJavaRDDLike(SparkRuntime) - Method in class org.apache.crunch.impl.spark.collect.InputTable
-
- getJavaRDDLike(SparkRuntime) - Method in class org.apache.crunch.impl.spark.collect.PGroupedTableImpl
-
- getJavaRDDLike(SparkRuntime) - Method in class org.apache.crunch.impl.spark.collect.UnionCollection
-
- getJavaRDDLike(SparkRuntime) - Method in class org.apache.crunch.impl.spark.collect.UnionTable
-
- getJavaRDDLike(SparkRuntime) - Method in interface org.apache.crunch.impl.spark.SparkCollection
-
- getJob() - Method in interface org.apache.crunch.impl.mr.MRJob
-
- getJobEndTimeMsec() - Method in class org.apache.crunch.PipelineResult.StageResult
-
- getJobID() - Method in interface org.apache.crunch.impl.mr.MRJob
-
- getJobs() - Method in interface org.apache.crunch.impl.mr.MRPipelineExecution
-
- getJobStartTimeMsec() - Method in class org.apache.crunch.PipelineResult.StageResult
-
- getJobState() - Method in interface org.apache.crunch.impl.mr.MRJob
-
- getJoinType() - Method in class org.apache.crunch.lib.join.FullOuterJoinFn
-
- getJoinType() - Method in class org.apache.crunch.lib.join.InnerJoinFn
-
- getJoinType() - Method in class org.apache.crunch.lib.join.JoinFn
-
- getJoinType() - Method in class org.apache.crunch.lib.join.LeftOuterJoinFn
-
- getJoinType() - Method in class org.apache.crunch.lib.join.RightOuterJoinFn
-
- getKeyClass() - Method in interface org.apache.crunch.types.Converter
-
- getKeyType() - Method in class org.apache.crunch.impl.dist.collect.PTableBase
-
- getKeyType() - Method in class org.apache.crunch.lib.sort.SortFns.KeyExtraction
-
- getKeyType() - Method in interface org.apache.crunch.PTable
-
Returns the PType of the key.
- getKeyType() - Method in interface org.apache.crunch.types.PTableType
-
Returns the key type for the table.
- getLastModifiedAt(Configuration) - Method in class org.apache.crunch.contrib.io.jdbc.DataBaseSource
-
- getLastModifiedAt() - Method in class org.apache.crunch.impl.dist.collect.BaseDoCollection
-
- getLastModifiedAt() - Method in class org.apache.crunch.impl.dist.collect.BaseDoTable
-
- getLastModifiedAt() - Method in class org.apache.crunch.impl.dist.collect.BaseGroupedTable
-
- getLastModifiedAt() - Method in class org.apache.crunch.impl.dist.collect.BaseInputCollection
-
- getLastModifiedAt() - Method in class org.apache.crunch.impl.dist.collect.BaseInputTable
-
- getLastModifiedAt() - Method in class org.apache.crunch.impl.dist.collect.BaseUnionCollection
-
- getLastModifiedAt() - Method in class org.apache.crunch.impl.dist.collect.BaseUnionTable
-
- getLastModifiedAt() - Method in class org.apache.crunch.impl.dist.collect.EmptyPCollection
-
- getLastModifiedAt() - Method in class org.apache.crunch.impl.dist.collect.EmptyPTable
-
- getLastModifiedAt() - Method in class org.apache.crunch.impl.dist.collect.PCollectionImpl
-
The time of the most recent modification to one of the input sources to the collection.
- getLastModifiedAt() - Method in class org.apache.crunch.impl.spark.collect.CreatedCollection
-
- getLastModifiedAt() - Method in class org.apache.crunch.impl.spark.collect.CreatedTable
-
- getLastModifiedAt(FileSystem, Path) - Static method in class org.apache.crunch.io.SourceTargetHelper
-
- getLastModifiedAt(Configuration) - Method in interface org.apache.crunch.Source
-
Returns the time (in milliseconds) that this Source was most recently
modified (e.g., because an input file was edited or new files were added to
a directory).
- getMapOutputName(Configuration, Path) - Method in interface org.apache.crunch.io.FileNamingScheme
-
Get the output file name for a map task.
- getMapOutputName(Configuration, Path) - Method in class org.apache.crunch.io.SequentialFileNamingScheme
-
- getMaterializedAt() - Method in class org.apache.crunch.impl.dist.collect.PCollectionImpl
-
- getMaterializeSourceTarget(PCollection<T>) - Method in class org.apache.crunch.impl.dist.DistributedPipeline
-
Retrieve a ReadableSourceTarget that provides access to the contents of a
PCollection.
- getMessage() - Method in class org.apache.crunch.PipelineCallable
-
Returns a message associated with this callable's execution, especially in case of errors.
- getModeProperties() - Method in class org.apache.crunch.types.avro.AvroMode
-
Returns the entries that a Configuration instance needs to enable
this AvroMode as a serializable map of key-value pairs.
- getName() - Method in class org.apache.crunch.CreateOptions
-
- getName() - Method in class org.apache.crunch.impl.dist.collect.PCollectionImpl
-
- getName() - Method in class org.apache.crunch.impl.dist.DistributedPipeline
-
- getName() - Method in class org.apache.crunch.impl.mem.MemPipeline
-
- getName() - Method in class org.apache.crunch.io.FormatBundle
-
- getName() - Method in interface org.apache.crunch.PCollection
-
Returns a shorthand name for this PCollection.
- getName() - Method in interface org.apache.crunch.Pipeline
-
Returns the name of this pipeline.
- getName() - Method in class org.apache.crunch.PipelineCallable
-
Returns the name of this instance.
- getName() - Method in class org.apache.hadoop.mapred.SparkCounter
-
- getNamedDotFiles() - Method in class org.apache.crunch.impl.spark.SparkRuntime
-
- getNamedDotFiles() - Method in interface org.apache.crunch.PipelineExecution
-
Returns all .dot files that allow a client to graph the Crunch execution plan internals.
- getNamedOutputs(Configuration) - Static method in class org.apache.crunch.io.CrunchOutputs
-
- getNextAnonymousStageId() - Method in class org.apache.crunch.impl.dist.DistributedPipeline
-
- getNumReducers() - Method in class org.apache.crunch.GroupingOptions
-
- getNumShards(K) - Method in interface org.apache.crunch.lib.join.ShardedJoinStrategy.ShardingStrategy
-
Retrieve the number of shards over which the given key should be split.
- getOnlyParent() - Method in class org.apache.crunch.impl.dist.collect.PCollectionImpl
-
- getOutputCommitter(TaskAttemptContext) - Static method in class org.apache.crunch.io.CrunchOutputs
-
- getOutputMapFn() - Method in class org.apache.crunch.types.avro.AvroType
-
- getOutputMapFn() - Method in interface org.apache.crunch.types.PType
-
- getOutputMapFn() - Method in class org.apache.crunch.types.writable.WritableType
-
- getParallelDoOptions() - Method in class org.apache.crunch.impl.dist.collect.PCollectionImpl
-
- getParallelism() - Method in class org.apache.crunch.CreateOptions
-
- getParents() - Method in class org.apache.crunch.impl.dist.collect.BaseDoCollection
-
- getParents() - Method in class org.apache.crunch.impl.dist.collect.BaseDoTable
-
- getParents() - Method in class org.apache.crunch.impl.dist.collect.BaseGroupedTable
-
- getParents() - Method in class org.apache.crunch.impl.dist.collect.BaseInputCollection
-
- getParents() - Method in class org.apache.crunch.impl.dist.collect.BaseInputTable
-
- getParents() - Method in class org.apache.crunch.impl.dist.collect.BaseUnionCollection
-
- getParents() - Method in class org.apache.crunch.impl.dist.collect.BaseUnionTable
-
- getParents() - Method in class org.apache.crunch.impl.dist.collect.EmptyPCollection
-
- getParents() - Method in class org.apache.crunch.impl.dist.collect.EmptyPTable
-
- getParents() - Method in class org.apache.crunch.impl.dist.collect.PCollectionImpl
-
- getParents() - Method in class org.apache.crunch.impl.spark.collect.CreatedCollection
-
- getParents() - Method in class org.apache.crunch.impl.spark.collect.CreatedTable
-
- getPartition(Object) - Method in class org.apache.crunch.impl.spark.SparkPartitioner
-
- getPartition(Object, Object, int) - Method in class org.apache.crunch.lib.join.JoinUtils.AvroIndexedRecordPartitioner
-
- getPartition(TupleWritable, Writable, int) - Method in class org.apache.crunch.lib.join.JoinUtils.TupleWritablePartitioner
-
- getPartition(K, V, int) - Method in class org.apache.crunch.lib.sort.TotalOrderPartitioner
-
- getPartitionerClass() - Method in class org.apache.crunch.GroupingOptions
-
- getPartitionerClass(PTypeFamily) - Static method in class org.apache.crunch.lib.join.JoinUtils
-
- getPartitionFile(Configuration) - Static method in class org.apache.crunch.lib.sort.TotalOrderPartitioner
-
- getPath() - Method in interface org.apache.crunch.io.PathTarget
-
- getPath(String) - Method in class org.apache.crunch.test.TemporaryPath
-
Get a Path below the temporary directory.
- getPathSize(Configuration, Path) - Static method in class org.apache.crunch.io.SourceTargetHelper
-
- getPathSize(FileSystem, Path) - Static method in class org.apache.crunch.io.SourceTargetHelper
-
- getPathToCacheFile(Path, Configuration) - Static method in class org.apache.crunch.util.DistCache
-
- getPipeline() - Method in class org.apache.crunch.impl.dist.collect.PCollectionImpl
-
- getPipeline() - Method in interface org.apache.crunch.PCollection
-
Returns the Pipeline associated with this PCollection.
- getPlanDotFile() - Method in class org.apache.crunch.impl.spark.SparkRuntime
-
- getPlanDotFile() - Method in interface org.apache.crunch.PipelineExecution
-
Returns the .dot file that allows a client to graph the Crunch execution plan for this
pipeline.
- getPrimitiveType(Class<T>) - Static method in class org.apache.crunch.types.avro.Avros
-
- getPrimitiveType(Class<T>) - Static method in class org.apache.crunch.types.writable.Writables
-
- getPTableType() - Method in class org.apache.crunch.impl.dist.collect.BaseDoTable
-
- getPTableType() - Method in class org.apache.crunch.impl.dist.collect.BaseInputTable
-
- getPTableType() - Method in class org.apache.crunch.impl.dist.collect.BaseUnionTable
-
- getPTableType() - Method in class org.apache.crunch.impl.dist.collect.EmptyPTable
-
- getPTableType() - Method in class org.apache.crunch.impl.spark.collect.CreatedTable
-
- getPTableType() - Method in interface org.apache.crunch.PTable
-
Returns the PTableType of this PTable.
- getPType(PTypeFamily) - Method in interface org.apache.crunch.contrib.text.Extractor
-
Returns the PType associated with this data type for the
given PTypeFamily.
- getPType() - Method in class org.apache.crunch.impl.dist.collect.BaseDoCollection
-
- getPType() - Method in class org.apache.crunch.impl.dist.collect.BaseDoTable
-
- getPType() - Method in class org.apache.crunch.impl.dist.collect.BaseGroupedTable
-
- getPType() - Method in class org.apache.crunch.impl.dist.collect.BaseInputCollection
-
- getPType() - Method in class org.apache.crunch.impl.dist.collect.BaseInputTable
-
- getPType() - Method in class org.apache.crunch.impl.dist.collect.BaseUnionCollection
-
- getPType() - Method in class org.apache.crunch.impl.dist.collect.BaseUnionTable
-
- getPType() - Method in class org.apache.crunch.impl.dist.collect.EmptyPCollection
-
- getPType() - Method in class org.apache.crunch.impl.dist.collect.EmptyPTable
-
- getPType() - Method in class org.apache.crunch.impl.spark.collect.CreatedCollection
-
- getPType() - Method in class org.apache.crunch.impl.spark.collect.CreatedTable
-
- getPType() - Method in interface org.apache.crunch.PCollection
-
Returns the PType of this PCollection.
- getReader(Schema) - Method in class org.apache.crunch.types.avro.AvroMode
-
Creates a DatumReader based on the schema.
- getReader(Schema) - Method in interface org.apache.crunch.types.avro.ReaderWriterFactory
-
- getReader(Schema) - Method in class org.apache.crunch.types.avro.ReflectDataFactory
-
- getRecommendedPartitions(PCollection<T>) - Static method in class org.apache.crunch.util.PartitionUtils
-
- getRecommendedPartitions(PCollection<T>, Configuration) - Static method in class org.apache.crunch.util.PartitionUtils
-
- getRecordType() - Method in class org.apache.crunch.types.avro.AvroType
-
- getRecordWriter(TaskAttemptContext) - Method in class org.apache.crunch.types.avro.AvroOutputFormat
-
- getRecordWriter(TaskAttemptContext) - Method in class org.apache.crunch.types.avro.AvroPathPerKeyOutputFormat
-
- getRecordWriter(TaskAttemptContext) - Method in class org.apache.crunch.types.avro.AvroTextOutputFormat
-
- getReduceOutputName(Configuration, Path, int) - Method in interface org.apache.crunch.io.FileNamingScheme
-
Get the output file name for a reduce task.
- getReduceOutputName(Configuration, Path, int) - Method in class org.apache.crunch.io.SequentialFileNamingScheme
-
- getReflectDataFactory(Configuration) - Static method in class org.apache.crunch.types.avro.Avros
-
Deprecated.
As of 0.9.0; use AvroMode.fromConfiguration(conf).
- getResult() - Method in class org.apache.crunch.impl.spark.SparkRuntime
-
- getResult() - Method in interface org.apache.crunch.PipelineExecution
-
Retrieve the result of a pipeline if it has been completed, otherwise null.
- getRootFile() - Method in class org.apache.crunch.test.TemporaryPath
-
Get the root directory which will be deleted automatically.
- getRootFileName() - Method in class org.apache.crunch.test.TemporaryPath
-
Get the root directory as an absolute file name.
- getRootPath() - Method in class org.apache.crunch.test.TemporaryPath
-
Get the root directory as a Path.
- getRuntimeContext() - Method in class org.apache.crunch.impl.spark.SparkRuntime
-
- getSchema() - Method in class org.apache.crunch.types.avro.AvroType
-
- getSecond() - Method in class org.apache.crunch.fn.CompositeMapFn
-
- getSerializationClass() - Method in class org.apache.crunch.types.writable.WritableType
-
- getSize(Configuration) - Method in class org.apache.crunch.contrib.io.jdbc.DataBaseSource
-
- getSize() - Method in class org.apache.crunch.impl.dist.collect.PCollectionImpl
-
- getSize() - Method in interface org.apache.crunch.PCollection
-
Returns the size of the data represented by this PCollection in
bytes.
- getSize(Configuration) - Method in interface org.apache.crunch.Source
-
Returns the number of bytes in this Source.
- getSortComparatorClass() - Method in class org.apache.crunch.GroupingOptions
-
- getSource() - Method in class org.apache.crunch.impl.dist.collect.BaseInputCollection
-
- getSource() - Method in class org.apache.crunch.impl.dist.collect.BaseInputTable
-
- getSourceTargets() - Method in class org.apache.crunch.GroupingOptions
-
- getSourceTargets() - Method in class org.apache.crunch.ParallelDoOptions
-
Deprecated.
- getSourceTargets() - Method in interface org.apache.crunch.ReadableData
-
- getSourceTargets() - Method in class org.apache.crunch.util.DelegatingReadableData
-
- getSourceTargets() - Method in class org.apache.crunch.util.UnionReadableData
-
- getSparkContext() - Method in class org.apache.crunch.impl.spark.SparkRuntime
-
- getSpecificClassLoader() - Static method in class org.apache.crunch.types.avro.AvroMode
-
Get the configured ClassLoader to be used for loading Avro org.apache.avro.specific.SpecificRecord
and reflection implementation classes.
- getStageId() - Method in class org.apache.crunch.PipelineResult.StageResult
-
- getStageName() - Method in class org.apache.crunch.PipelineResult.StageResult
-
- getStageResults() - Method in class org.apache.crunch.PipelineResult
-
- getStartTimeMsec() - Method in class org.apache.crunch.PipelineResult.StageResult
-
- getStats() - Method in class org.apache.crunch.contrib.text.AbstractCompositeExtractor
-
- getStats() - Method in class org.apache.crunch.contrib.text.AbstractSimpleExtractor
-
- getStats() - Method in interface org.apache.crunch.contrib.text.Extractor
-
Return statistics about how many errors this Extractor instance
encountered while parsing input data.
- getStatus() - Method in class org.apache.crunch.impl.spark.SparkRuntime
-
- getStatus() - Method in interface org.apache.crunch.PipelineExecution
-
- getStorageLevel(PCollection<?>) - Method in class org.apache.crunch.impl.spark.SparkRuntime
-
- getStructFieldData(Object, StructField) - Method in class org.apache.crunch.types.orc.TupleObjectInspector
-
- getStructFieldRef(String) - Method in class org.apache.crunch.types.orc.TupleObjectInspector
-
- getStructFieldsDataAsList(Object) - Method in class org.apache.crunch.types.orc.TupleObjectInspector
-
- getSubTypes() - Method in class org.apache.crunch.types.avro.AvroType
-
- getSubTypes() - Method in class org.apache.crunch.types.PGroupedTableType
-
- getSubTypes() - Method in interface org.apache.crunch.types.PType
-
Returns the sub-types that make up this PType if it is a composite instance, such as a tuple.
- getSubTypes() - Method in class org.apache.crunch.types.writable.WritableType
-
- getTableType() - Method in interface org.apache.crunch.TableSource
-
- getTableType() - Method in class org.apache.crunch.types.PGroupedTableType
-
- getTargetDependencies() - Method in class org.apache.crunch.impl.dist.collect.BaseGroupedTable
-
- getTargetDependencies() - Method in class org.apache.crunch.impl.dist.collect.PCollectionImpl
-
- getTargets() - Method in class org.apache.crunch.ParallelDoOptions
-
- getTestContext(Configuration) - Static method in class org.apache.crunch.test.CrunchTestSupport
-
The method creates a TaskInputOutputContext which can be used
in unit tests.
- getTupleFactory(Class<T>) - Static method in class org.apache.crunch.types.TupleFactory
-
- getType() - Method in interface org.apache.crunch.Source
-
Returns the PType for this source.
- getTypeClass() - Method in class org.apache.crunch.types.avro.AvroType
-
- getTypeClass() - Method in interface org.apache.crunch.types.PType
-
Returns the Java type represented by this PType.
- getTypeClass() - Method in class org.apache.crunch.types.writable.WritableType
-
- getTypeFamily() - Method in class org.apache.crunch.impl.dist.collect.PCollectionImpl
-
- getTypeFamily() - Method in interface org.apache.crunch.PCollection
-
Returns the PTypeFamily of this PCollection.
- getTypeInfo(Class<?>) - Static method in class org.apache.crunch.types.orc.OrcUtils
-
Generate TypeInfo for a given Java class based on reflection.
- getTypeName() - Method in class org.apache.crunch.types.orc.TupleObjectInspector
-
- getValue() - Method in interface org.apache.crunch.PObject
-
Gets the value associated with this PObject.
- getValue() - Method in class org.apache.crunch.types.writable.UnionWritable
-
- getValue() - Method in class org.apache.crunch.Union
-
Returns the underlying object value of the record.
- getValue() - Method in class org.apache.hadoop.mapred.SparkCounter
-
- getValueClass() - Method in interface org.apache.crunch.types.Converter
-
- getValues() - Method in class org.apache.crunch.TupleN
-
- getValueType() - Method in class org.apache.crunch.impl.dist.collect.PTableBase
-
- getValueType() - Method in interface org.apache.crunch.PTable
-
Returns the PType of the value.
- getValueType() - Method in interface org.apache.crunch.types.PTableType
-
Returns the value type for the table.
- getWriter(Schema) - Method in class org.apache.crunch.types.avro.AvroMode
-
Creates a DatumWriter based on the schema.
- getWriter(Schema) - Method in interface org.apache.crunch.types.avro.ReaderWriterFactory
-
- getWriter(Schema) - Method in class org.apache.crunch.types.avro.ReflectDataFactory
-
- globalToplist(PCollection<X>) - Static method in class org.apache.crunch.lib.TopList
-
Create a list of unique items in the input collection with their count, sorted descending by their frequency.
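The behaviour described for globalToplist (count each unique item, then order by descending frequency) can be sketched on an in-memory list. This is a plain-Java illustration only: TopListSketch and topList are hypothetical names, the return shape is chosen for convenience, and nothing here reflects how Crunch distributes the work or what type the real method returns.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class TopListSketch {
    /** Counts each distinct item and returns (item, count) entries, most frequent first. */
    static <X> List<Map.Entry<X, Long>> topList(List<X> items) {
        // LinkedHashMap keeps first-seen order, so ties are broken predictably.
        Map<X, Long> counts = new LinkedHashMap<>();
        for (X item : items) {
            counts.merge(item, 1L, Long::sum);
        }
        List<Map.Entry<X, Long>> entries = new ArrayList<>(counts.entrySet());
        // List.sort is stable: equal counts stay in first-seen order.
        entries.sort((a, b) -> Long.compare(b.getValue(), a.getValue()));
        return entries;
    }

    public static void main(String[] args) {
        List<String> words = new ArrayList<>();
        for (String w : "to be or not to be".split(" ")) {
            words.add(w);
        }
        for (Map.Entry<String, Long> e : topList(words)) {
            System.out.println(e.getValue() + "\t" + e.getKey());
        }
    }
}
```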
- groupByKey() - Method in class org.apache.crunch.impl.dist.collect.PTableBase
-
- groupByKey(int) - Method in class org.apache.crunch.impl.dist.collect.PTableBase
-
- groupByKey(GroupingOptions) - Method in class org.apache.crunch.impl.dist.collect.PTableBase
-
- groupByKey() - Method in interface org.apache.crunch.PTable
-
Performs a grouping operation on the keys of this table.
- groupByKey(int) - Method in interface org.apache.crunch.PTable
-
Performs a grouping operation on the keys of this table, using the given
number of partitions.
- groupByKey(GroupingOptions) - Method in interface org.apache.crunch.PTable
-
Performs a grouping operation on the keys of this table, using the
additional GroupingOptions to control how the grouping is executed.
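As a rough mental model of grouping by key with a given number of partitions, the plain-Java sketch below routes each key to one partition and collects its values there. It is not Crunch code: GroupByKeySketch is a hypothetical name, the table is modelled as a list of Map.Entry pairs, and the hash-modulo partition rule is an assumption made for illustration (a GroupingOptions instance could presumably substitute a different partitioner).

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class GroupByKeySketch {
    /** Assumed partition rule: non-negative hash code modulo the partition count. */
    static int partitionFor(Object key, int numPartitions) {
        return (key.hashCode() & Integer.MAX_VALUE) % numPartitions;
    }

    /** Groups values by key; each key's values end up together in exactly one partition. */
    static <K, V> List<Map<K, List<V>>> groupByKey(
            List<? extends Map.Entry<K, V>> table, int numPartitions) {
        List<Map<K, List<V>>> partitions = new ArrayList<>();
        for (int i = 0; i < numPartitions; i++) {
            partitions.add(new LinkedHashMap<K, List<V>>());
        }
        for (Map.Entry<K, V> e : table) {
            Map<K, List<V>> target = partitions.get(partitionFor(e.getKey(), numPartitions));
            target.computeIfAbsent(e.getKey(), k -> new ArrayList<V>()).add(e.getValue());
        }
        return partitions;
    }

    public static void main(String[] args) {
        List<Map.Entry<String, Integer>> table = new ArrayList<>();
        table.add(new AbstractMap.SimpleEntry<>("a", 1));
        table.add(new AbstractMap.SimpleEntry<>("b", 2));
        table.add(new AbstractMap.SimpleEntry<>("a", 3));
        System.out.println(groupByKey(table, 2));
    }
}
```

The property worth noticing is that all values for a key are visible together in a single place, which is what makes per-key aggregation possible afterwards.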
- groupedWeightedReservoirSample(PTable<Integer, Pair<T, N>>, int[]) - Static method in class org.apache.crunch.lib.Sample
-
The most general-purpose of the weighted reservoir sampling patterns; it allows us to choose
a random sample of elements for each of N input groups.
- groupedWeightedReservoirSample(PTable<Integer, Pair<T, N>>, int[], Long) - Static method in class org.apache.crunch.lib.Sample
-
Same as the other groupedWeightedReservoirSample method, but includes a seed for testing
purposes.
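To make the idea of a seeded, per-group weighted reservoir sample concrete, here is a plain-Java sketch of one common formulation: score each item with log(u) / weight for a uniform random u, and keep the k highest scores per group. It is not taken from the Crunch source. GroupedReservoirSketch, Weighted and sample are hypothetical names, and the sketch assumes group ids are 0..N-1 so that they can index the sample-size array, mirroring the int[] parameter in the signatures above.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.PriorityQueue;
import java.util.Random;
import java.util.TreeMap;

class GroupedReservoirSketch {
    /** Hypothetical (item, weight) pair. */
    static final class Weighted<T> {
        final T item;
        final double weight;

        Weighted(T item, double weight) {
            this.item = item;
            this.weight = weight;
        }
    }

    private static final class Scored<T> {
        final double score;
        final T item;

        Scored(double score, T item) {
            this.score = score;
            this.item = item;
        }
    }

    /** Keeps up to sampleSizes[g] items for each group g; heavier items are more likely to stay. */
    static <T> Map<Integer, List<T>> sample(
            Map<Integer, List<Weighted<T>>> groups, int[] sampleSizes, long seed) {
        Random rand = new Random(seed); // fixed seed => repeatable sample, useful in tests
        Map<Integer, List<T>> out = new TreeMap<>();
        for (Map.Entry<Integer, List<Weighted<T>>> group : groups.entrySet()) {
            int k = sampleSizes[group.getKey()];
            // Min-heap on score: the root is always the weakest retained candidate.
            PriorityQueue<Scored<T>> heap =
                new PriorityQueue<>(Comparator.comparingDouble((Scored<T> s) -> s.score));
            for (Weighted<T> w : group.getValue()) {
                if (k <= 0 || w.weight <= 0) {
                    continue; // nothing to keep, or the item can never be chosen
                }
                double score = Math.log(rand.nextDouble()) / w.weight;
                heap.add(new Scored<T>(score, w.item));
                if (heap.size() > k) {
                    heap.poll(); // evict the lowest score
                }
            }
            List<T> kept = new ArrayList<>();
            for (Scored<T> s : heap) {
                kept.add(s.item);
            }
            out.put(group.getKey(), kept);
        }
        return out;
    }

    public static void main(String[] args) {
        Map<Integer, List<Weighted<String>>> groups = new LinkedHashMap<>();
        List<Weighted<String>> g0 = new ArrayList<>();
        g0.add(new Weighted<>("a", 1.0));
        g0.add(new Weighted<>("b", 5.0));
        g0.add(new Weighted<>("c", 10.0));
        groups.put(0, g0);
        System.out.println(sample(groups, new int[] {2}, 42L));
    }
}
```

Because the heap never holds more than k entries per group, memory stays bounded by the requested sample size regardless of how many items a group contains.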
- groupingComparatorClass(Class<? extends RawComparator>) - Method in class org.apache.crunch.GroupingOptions.Builder
-
- GroupingOptions - Class in org.apache.crunch
-
Options that can be passed to a groupByKey operation in order to
exercise finer control over how the partitioning, grouping, and sorting of
keys is performed.
- GroupingOptions.Builder - Class in org.apache.crunch
-
Builder class for creating GroupingOptions instances.
- GroupingOptions.Builder() - Constructor for class org.apache.crunch.GroupingOptions.Builder
-
- GuavaUtils - Class in org.apache.crunch.impl.spark
-
- GuavaUtils() - Constructor for class org.apache.crunch.impl.spark.GuavaUtils
-
- gzip(T) - Static method in class org.apache.crunch.io.Compress
-
Configure the given output target to be compressed using Gzip.
- main(String[]) - Static method in class org.apache.crunch.examples.AverageBytesByIP
-
- main(String[]) - Static method in class org.apache.crunch.examples.SecondarySortExample
-
- main(String[]) - Static method in class org.apache.crunch.examples.SortExample
-
- main(String[]) - Static method in class org.apache.crunch.examples.TotalBytesByIP
-
- main(String[]) - Static method in class org.apache.crunch.examples.TotalWordCount
-
- main(String[]) - Static method in class org.apache.crunch.examples.WordAggregationHBase
-
- main(String[]) - Static method in class org.apache.crunch.examples.WordCount
-
- makeTuple(Object...) - Method in class org.apache.crunch.types.TupleFactory
-
- map(R) - Method in class org.apache.crunch.fn.CompositeMapFn
-
- map(V) - Method in class org.apache.crunch.fn.ExtractKeyFn
-
- map(T) - Method in class org.apache.crunch.fn.IdentityFn
-
- map(Pair<K, V>) - Method in class org.apache.crunch.fn.PairMapFn
-
- map(T) - Method in class org.apache.crunch.fn.SDoubleFunction
-
- map(T) - Method in class org.apache.crunch.fn.SFunction
-
- map(Pair<K, V>) - Method in class org.apache.crunch.fn.SFunction2
-
- map(T) - Method in class org.apache.crunch.fn.SPairFunction
-
- map(Pair<V1, V2>) - Method in class org.apache.crunch.fn.SwapFn
-
- map(PTable<K1, V1>, Class<? extends Mapper<K1, V1, K2, V2>>, Class<K2>, Class<V2>) - Static method in class org.apache.crunch.lib.Mapred
-
- map(PTable<K1, V1>, Class<? extends Mapper<K1, V1, K2, V2>>, Class<K2>, Class<V2>) - Static method in class org.apache.crunch.lib.Mapreduce
-
- map(V) - Method in class org.apache.crunch.lib.sort.SortFns.AvroGenericFn
-
- map(V) - Method in class org.apache.crunch.lib.sort.SortFns.SingleKeyFn
-
- map(V) - Method in class org.apache.crunch.lib.sort.SortFns.TupleKeyFn
-
- map(S) - Method in class org.apache.crunch.MapFn
-
Maps the given input into an instance of the output type.
- map(Pair<Object, Iterable<Object>>) - Method in class org.apache.crunch.types.PGroupedTableType.PairIterableMapFn
-
- MapDeepCopier<T> - Class in org.apache.crunch.types
-
- MapDeepCopier(PType<T>) - Constructor for class org.apache.crunch.types.MapDeepCopier
-
- MapFn<S,T> - Class in org.apache.crunch
-
A DoFn for the common case of emitting exactly one value for each
input record.
- MapFn() - Constructor for class org.apache.crunch.MapFn
-
- MapFunction - Class in org.apache.crunch.impl.spark.fn
-
- MapFunction(MapFn, SparkRuntimeContext) - Constructor for class org.apache.crunch.impl.spark.fn.MapFunction
-
- mapKeys(MapFn<K, K2>, PType<K2>) - Method in class org.apache.crunch.impl.dist.collect.PTableBase
-
- mapKeys(String, MapFn<K, K2>, PType<K2>) - Method in class org.apache.crunch.impl.dist.collect.PTableBase
-
- mapKeys(PTable<K1, V>, MapFn<K1, K2>, PType<K2>) - Static method in class org.apache.crunch.lib.PTables
-
Maps a PTable<K1, V> to a PTable<K2, V> using the given MapFn<K1, K2> on
the keys of the PTable.
- mapKeys(String, PTable<K1, V>, MapFn<K1, K2>, PType<K2>) - Static method in class org.apache.crunch.lib.PTables
-
Maps a PTable<K1, V> to a PTable<K2, V> using the given MapFn<K1, K2> on
the keys of the PTable.
- mapKeys(MapFn<K, K2>, PType<K2>) - Method in interface org.apache.crunch.PTable
-
Returns a PTable that has the same values as this instance, but
uses the given function to map the keys.
- mapKeys(String, MapFn<K, K2>, PType<K2>) - Method in interface org.apache.crunch.PTable
-
Returns a PTable that has the same values as this instance, but
uses the given function to map the keys.
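The effect of mapKeys (same values, keys passed through a function) can be shown on an in-memory list of pairs. This is a plain-Java sketch rather than Crunch code: MapKeysSketch is a hypothetical name, java.util.function.Function stands in for MapFn, and the PType argument of the real signatures has no counterpart here because nothing is serialized.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

class MapKeysSketch {
    /** Returns a new table whose keys are fn(key) and whose values are unchanged. */
    static <K, K2, V> List<Map.Entry<K2, V>> mapKeys(
            List<? extends Map.Entry<K, V>> table, Function<? super K, ? extends K2> fn) {
        List<Map.Entry<K2, V>> out = new ArrayList<>(table.size());
        for (Map.Entry<K, V> e : table) {
            // Exactly one output pair per input pair; order is preserved.
            out.add(new AbstractMap.SimpleImmutableEntry<K2, V>(fn.apply(e.getKey()), e.getValue()));
        }
        return out;
    }

    public static void main(String[] args) {
        List<Map.Entry<String, Integer>> table = new ArrayList<>();
        table.add(new AbstractMap.SimpleEntry<>("apple", 1));
        table.add(new AbstractMap.SimpleEntry<>("kiwi", 2));
        List<Map.Entry<Integer, Integer>> byLength = mapKeys(table, (String s) -> s.length());
        System.out.println(byLength);
    }
}
```

Note that two distinct input keys may map to the same output key; the sketch keeps both pairs, since a table of pairs is not a map with unique keys.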
- MapOutputFunction<K,V> - Class in org.apache.crunch.impl.spark.fn
-
- MapOutputFunction(SerDe, SerDe) - Constructor for class org.apache.crunch.impl.spark.fn.MapOutputFunction
-
- Mapred - Class in org.apache.crunch.lib
-
Static functions for working with legacy Mappers and Reducers that live under the org.apache.hadoop.mapred.*
package as part of Crunch pipelines.
- Mapred() - Constructor for class org.apache.crunch.lib.Mapred
-
- Mapreduce - Class in org.apache.crunch.lib
-
Static functions for working with legacy Mappers and Reducers that live under the org.apache.hadoop.mapreduce.*
package as part of Crunch pipelines.
- Mapreduce() - Constructor for class org.apache.crunch.lib.Mapreduce
-
- MapReduceTarget - Interface in org.apache.crunch.io
-
- maps(PType<T>) - Static method in class org.apache.crunch.types.avro.Avros
-
- maps(PType<T>) - Method in class org.apache.crunch.types.avro.AvroTypeFamily
-
- maps(PType<T>) - Method in interface org.apache.crunch.types.PTypeFamily
-
- maps(PType<T>) - Static method in class org.apache.crunch.types.writable.Writables
-
- maps(PType<T>) - Method in class org.apache.crunch.types.writable.WritableTypeFamily
-
- MapsideJoinStrategy<K,U,V> - Class in org.apache.crunch.lib.join
-
Utility for doing map side joins on a common key between two
PTables.
- MapsideJoinStrategy() - Constructor for class org.apache.crunch.lib.join.MapsideJoinStrategy
-
- MapsideJoinStrategy(boolean) - Constructor for class org.apache.crunch.lib.join.MapsideJoinStrategy
-
- mapValues(MapFn<Iterable<V>, U>, PType<U>) - Method in class org.apache.crunch.impl.dist.collect.BaseGroupedTable
-
- mapValues(String, MapFn<Iterable<V>, U>, PType<U>) - Method in class org.apache.crunch.impl.dist.collect.BaseGroupedTable
-
- mapValues(MapFn<V, U>, PType<U>) - Method in class org.apache.crunch.impl.dist.collect.PTableBase
-
- mapValues(String, MapFn<V, U>, PType<U>) - Method in class org.apache.crunch.impl.dist.collect.PTableBase
-
- mapValues(PTable<K, U>, MapFn<U, V>, PType<V>) - Static method in class org.apache.crunch.lib.PTables
-
Maps a PTable<K, U> to a PTable<K, V> using the given MapFn<U, V> on
the values of the PTable.
- mapValues(String, PTable<K, U>, MapFn<U, V>, PType<V>) - Static method in class org.apache.crunch.lib.PTables
-
Maps a PTable<K, U> to a PTable<K, V> using the given MapFn<U, V> on
the values of the PTable.
- mapValues(PGroupedTable<K, U>, MapFn<Iterable<U>, V>, PType<V>) - Static method in class org.apache.crunch.lib.PTables
-
An analogue of the mapValues function for PGroupedTable<K, U> collections.
- mapValues(String, PGroupedTable<K, U>, MapFn<Iterable<U>, V>, PType<V>) - Static method in class org.apache.crunch.lib.PTables
-
An analogue of the mapValues function for PGroupedTable<K, U> collections.
- mapValues(MapFn<Iterable<V>, U>, PType<U>) - Method in interface org.apache.crunch.PGroupedTable
-
Maps the Iterable<V> elements of each record to a new type.
- mapValues(String, MapFn<Iterable<V>, U>, PType<U>) - Method in interface org.apache.crunch.PGroupedTable
-
Maps the Iterable<V> elements of each record to a new type.
- mapValues(MapFn<V, U>, PType<U>) - Method in interface org.apache.crunch.PTable
-
Returns a PTable that has the same keys as this instance, but
uses the given function to map the values.
- mapValues(String, MapFn<V, U>, PType<U>) - Method in interface org.apache.crunch.PTable
-
Returns a PTable that has the same keys as this instance, but
uses the given function to map the values.
- markLogged() - Method in exception org.apache.crunch.CrunchRuntimeException
-
Indicate that this exception has been written to the debug logs.
- materialize() - Method in class org.apache.crunch.impl.dist.collect.PCollectionImpl
-
- materialize(PCollection<T>) - Method in class org.apache.crunch.impl.mem.MemPipeline
-
- materialize(PCollection<T>) - Method in class org.apache.crunch.impl.mr.MRPipeline
-
- materialize(PCollection<T>) - Method in class org.apache.crunch.impl.spark.SparkPipeline
-
- materialize() - Method in interface org.apache.crunch.PCollection
-
Returns a reference to the data set represented by this PCollection that
may be used by the client to read the data locally.
- materialize(PCollection<T>) - Method in interface org.apache.crunch.Pipeline
-
Create the given PCollection and read the data it contains into the
returned Collection instance for client use.
- materialize(PCollection<T>) - Method in class org.apache.crunch.util.CrunchTool
-
- materializeAt(SourceTarget<S>) - Method in class org.apache.crunch.impl.dist.collect.PCollectionImpl
-
- materializeToMap() - Method in class org.apache.crunch.impl.dist.collect.PTableBase
-
Returns a Map made up of the keys and values in this PTable.
- materializeToMap() - Method in interface org.apache.crunch.PTable
-
Returns a Map made up of the keys and values in this PTable.
- max() - Method in class org.apache.crunch.impl.dist.collect.PCollectionImpl
-
- max(PCollection<S>) - Static method in class org.apache.crunch.lib.Aggregate
-
Returns the largest numerical element from the input collection.
- max() - Method in interface org.apache.crunch.PCollection
-
Returns a PObject of the maximum element of this instance.
- MAX_BIGINTS() - Static method in class org.apache.crunch.fn.Aggregators
-
Return the maximum of all given BigInteger values.
- MAX_BIGINTS(int) - Static method in class org.apache.crunch.fn.Aggregators
-
Return the n largest BigInteger values (or fewer if there are fewer
values than n).
- MAX_DOUBLES() - Static method in class org.apache.crunch.fn.Aggregators
-
Return the maximum of all given double values.
- MAX_DOUBLES(int) - Static method in class org.apache.crunch.fn.Aggregators
-
Return the n largest double values (or fewer if there are fewer
values than n).
- MAX_FLOATS() - Static method in class org.apache.crunch.fn.Aggregators
-
Return the maximum of all given float values.
- MAX_FLOATS(int) - Static method in class org.apache.crunch.fn.Aggregators
-
Return the n largest float values (or fewer if there are fewer
values than n).
- MAX_INTS() - Static method in class org.apache.crunch.fn.Aggregators
-
Return the maximum of all given int values.
- MAX_INTS(int) - Static method in class org.apache.crunch.fn.Aggregators
-
Return the n largest int values (or fewer if there are fewer
values than n).
- MAX_LONGS() - Static method in class org.apache.crunch.fn.Aggregators
-
Return the maximum of all given long values.
- MAX_LONGS(int) - Static method in class org.apache.crunch.fn.Aggregators
-
Return the n largest long values (or fewer if there are fewer
values than n).
- MAX_N(int, Class<V>) - Static method in class org.apache.crunch.fn.Aggregators
-
Return the n largest values (or fewer if there are fewer
values than n).
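One common way to compute "the n largest values, or fewer if fewer exist" is a bounded min-heap. The sketch below is plain Java and is not taken from the Crunch source; `TopNSketch` and `largest` are hypothetical names. It assumes duplicates are kept and returns results in descending order, which is a choice made for this example.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.PriorityQueue;

// Plain-Java illustration of a top-n selection; not the Crunch implementation.
final class TopNSketch {
    // Returns the n largest values in descending order, or all values if fewer than n exist.
    static <V extends Comparable<V>> List<V> largest(Iterable<V> values, int n) {
        // Min-heap of the best candidates so far; its head is the smallest kept value.
        PriorityQueue<V> heap = new PriorityQueue<>();
        if (n > 0) {
            for (V v : values) {
                if (heap.size() < n) {
                    heap.add(v);
                } else if (v.compareTo(heap.peek()) > 0) {
                    heap.poll(); // evict the smallest kept value
                    heap.add(v);
                }
            }
        }
        List<V> result = new ArrayList<>(heap);
        result.sort(Collections.reverseOrder());
        return result;
    }

    public static void main(String[] args) {
        System.out.println(largest(List.of(4, 9, 1, 9, 7), 3)); // duplicates are kept
        System.out.println(largest(List.of(4, 9), 3));          // fewer than n inputs
    }
}
```

The heap never holds more than n elements, so memory stays bounded regardless of how many values are seen.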
- MAX_REDUCERS - Static variable in class org.apache.crunch.util.PartitionUtils
-
Set an upper limit on the number of reducers the Crunch planner will set for an MR
job when it tries to determine how many reducers to use based on the input size.
- MAX_UNIQUE_N(int, Class<V>) - Static method in class org.apache.crunch.fn.Aggregators
-
Return the n largest unique values (or fewer if there are fewer
values than n).
- meanValue(PTable<K, V>) - Static method in class org.apache.crunch.lib.Average
-
Calculate the mean average value by key for a table with numeric values.
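A per-key mean can be computed by tracking a running sum and count for each key and dividing at the end. The following plain-Java sketch shows that idea; `MeanValueSketch` and `meanByKey` are hypothetical names, and the table is modeled as a list of entries rather than a PTable.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Plain-Java illustration of a mean-by-key computation; not the Crunch implementation.
final class MeanValueSketch {
    static <K, V extends Number> Map<K, Double> meanByKey(List<Map.Entry<K, V>> table) {
        // For each key, slot 0 holds the running sum and slot 1 the running count.
        Map<K, double[]> sumAndCount = new LinkedHashMap<>();
        for (Map.Entry<K, V> e : table) {
            double[] acc = sumAndCount.computeIfAbsent(e.getKey(), k -> new double[2]);
            acc[0] += e.getValue().doubleValue();
            acc[1] += 1;
        }
        Map<K, Double> means = new LinkedHashMap<>();
        for (Map.Entry<K, double[]> e : sumAndCount.entrySet()) {
            means.put(e.getKey(), e.getValue()[0] / e.getValue()[1]);
        }
        return means;
    }

    public static void main(String[] args) {
        System.out.println(meanByKey(List.of(
                Map.entry("a", 1), Map.entry("b", 10), Map.entry("a", 3))));
    }
}
```

Keeping a (sum, count) pair rather than a running mean makes partial results easy to combine, which matters when the data is split across workers.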
- MemPipeline - Class in org.apache.crunch.impl.mem
-
- min() - Method in class org.apache.crunch.impl.dist.collect.PCollectionImpl
-
- min(PCollection<S>) - Static method in class org.apache.crunch.lib.Aggregate
-
Returns the smallest numerical element from the input collection.
- min() - Method in interface org.apache.crunch.PCollection
-
Returns a PObject of the minimum element of this instance.
- MIN_BIGINTS() - Static method in class org.apache.crunch.fn.Aggregators
-
Return the minimum of all given BigInteger values.
- MIN_BIGINTS(int) - Static method in class org.apache.crunch.fn.Aggregators
-
Return the n smallest BigInteger values (or fewer if there are fewer
values than n).
- MIN_DOUBLES() - Static method in class org.apache.crunch.fn.Aggregators
-
Return the minimum of all given double values.
- MIN_DOUBLES(int) - Static method in class org.apache.crunch.fn.Aggregators
-
Return the n smallest double values (or fewer if there are fewer
values than n).
- MIN_FLOATS() - Static method in class org.apache.crunch.fn.Aggregators
-
Return the minimum of all given float values.
- MIN_FLOATS(int) - Static method in class org.apache.crunch.fn.Aggregators
-
Return the n smallest float values (or fewer if there are fewer
values than n).
- MIN_INTS() - Static method in class org.apache.crunch.fn.Aggregators
-
Return the minimum of all given int values.
- MIN_INTS(int) - Static method in class org.apache.crunch.fn.Aggregators
-
Return the n smallest int values (or fewer if there are fewer
values than n).
- MIN_LONGS() - Static method in class org.apache.crunch.fn.Aggregators
-
Return the minimum of all given long values.
- MIN_LONGS(int) - Static method in class org.apache.crunch.fn.Aggregators
-
Return the n smallest long values (or fewer if there are fewer
values than n).
- MIN_N(int, Class<V>) - Static method in class org.apache.crunch.fn.Aggregators
-
Return the n smallest values (or fewer if there are fewer
values than n).
- MIN_UNIQUE_N(int, Class<V>) - Static method in class org.apache.crunch.fn.Aggregators
-
Returns the n smallest unique values (or fewer if there are fewer unique values than n).
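The "unique" variants differ from the plain n-smallest and n-largest aggregators in that repeated values count only once. A sorted set capped at n elements captures that behavior. The sketch below is plain Java with hypothetical names (`UniqueNSketch`, `smallestUnique`) and is not taken from the Crunch source.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

// Plain-Java illustration of an n-smallest-unique selection; not the Crunch implementation.
final class UniqueNSketch {
    // Returns the n smallest distinct values in ascending order.
    static <V extends Comparable<V>> List<V> smallestUnique(Iterable<V> values, int n) {
        TreeSet<V> kept = new TreeSet<>();
        for (V v : values) {
            kept.add(v);          // a set ignores values it already contains
            if (kept.size() > n) {
                kept.pollLast();  // drop the largest to stay within n
            }
        }
        return new ArrayList<>(kept);
    }

    public static void main(String[] args) {
        System.out.println(smallestUnique(List.of(5, 2, 2, 8, 1, 5), 3));
        System.out.println(smallestUnique(List.of(3, 3, 3), 2)); // only one unique value
    }
}
```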
- MRCollection - Interface in org.apache.crunch.impl.dist.collect
-
- MRJob - Interface in org.apache.crunch.impl.mr
-
A Hadoop MapReduce job managed by Crunch.
- MRJob.State - Enum in org.apache.crunch.impl.mr
-
A job will be in one of the following states.
- MRPipeline - Class in org.apache.crunch.impl.mr
-
Pipeline implementation that is executed within Hadoop MapReduce.
- MRPipeline(Class<?>) - Constructor for class org.apache.crunch.impl.mr.MRPipeline
-
Instantiate with a default Configuration and name.
- MRPipeline(Class<?>, String) - Constructor for class org.apache.crunch.impl.mr.MRPipeline
-
Instantiate with a custom pipeline name.
- MRPipeline(Class<?>, Configuration) - Constructor for class org.apache.crunch.impl.mr.MRPipeline
-
Instantiate with a custom configuration and default naming.
- MRPipeline(Class<?>, String, Configuration) - Constructor for class org.apache.crunch.impl.mr.MRPipeline
-
Instantiate with a custom name and configuration.
- MRPipelineExecution - Interface in org.apache.crunch.impl.mr
-
- of(T, U) - Static method in class org.apache.crunch.Pair
-
- of(A, B, C) - Static method in class org.apache.crunch.Tuple3
-
- of(A, B, C, D) - Static method in class org.apache.crunch.Tuple4
-
- of(Object...) - Static method in class org.apache.crunch.TupleN
-
- OneToManyJoin - Class in org.apache.crunch.lib.join
-
Optimized join for situations where exactly one value is being joined with
any other number of values based on a common key.
- OneToManyJoin() - Constructor for class org.apache.crunch.lib.join.OneToManyJoin
-
- oneToManyJoin(PTable<K, U>, PTable<K, V>, DoFn<Pair<U, Iterable<V>>, T>, PType<T>) - Static method in class org.apache.crunch.lib.join.OneToManyJoin
-
Performs a join on two tables, where the left table only contains a single
value per key.
- oneToManyJoin(PTable<K, U>, PTable<K, V>, DoFn<Pair<U, Iterable<V>>, T>, PType<T>, int) - Static method in class org.apache.crunch.lib.join.OneToManyJoin
-
Supports a user-specified number of reducers for the one-to-many join.
- or(FilterFn<S>, FilterFn<S>) - Static method in class org.apache.crunch.fn.FilterFns
-
Accept an entry if at least one of the given filters accepts it, using short-circuit evaluation.
- or(FilterFn<S>...) - Static method in class org.apache.crunch.fn.FilterFns
-
Accept an entry if at least one of the given filters accepts it, using short-circuit evaluation.
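Short-circuit evaluation means that once one filter accepts a record, the remaining filters are never consulted. The plain-Java sketch below demonstrates this with `java.util.function.Predicate` standing in for a filter function; `OrFilterSketch` and its `or` helper are hypothetical and not part of Crunch.

```java
import java.util.function.Predicate;

// Plain-Java illustration of a short-circuit OR over filters; not the Crunch implementation.
final class OrFilterSketch {
    @SafeVarargs
    static <S> Predicate<S> or(Predicate<S>... filters) {
        return value -> {
            for (Predicate<S> f : filters) {
                if (f.test(value)) {
                    return true; // short-circuit: later filters are not evaluated
                }
            }
            return false;
        };
    }

    public static void main(String[] args) {
        int[] secondCalls = {0}; // counts how often the second filter runs
        Predicate<Integer> isEven = x -> x % 2 == 0;
        Predicate<Integer> isBig = x -> { secondCalls[0]++; return x > 100; };
        Predicate<Integer> either = or(isEven, isBig);

        System.out.println(either.test(4));  // accepted by isEven; isBig is skipped
        System.out.println(either.test(7));  // rejected by both filters
        System.out.println(secondCalls[0]);  // isBig ran only for the second record
    }
}
```

Ordering matters for cost: placing the cheapest or most frequently matching filter first avoids running the others.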
- Orcs - Class in org.apache.crunch.types.orc
-
Utilities to create PTypes for ORC serialization and deserialization.
- Orcs() - Constructor for class org.apache.crunch.types.orc.Orcs
-
- orcs(TypeInfo) - Static method in class org.apache.crunch.types.orc.Orcs
-
Create a PType to directly use OrcStruct as the deserialized format.
- OrcUtils - Class in org.apache.crunch.types.orc
-
- OrcUtils() - Constructor for class org.apache.crunch.types.orc.OrcUtils
-
- order() - Method in class org.apache.crunch.lib.Sort.ColumnOrder
-
- org.apache.crunch - package org.apache.crunch
-
Client-facing API and core abstractions.
- org.apache.crunch.contrib - package org.apache.crunch.contrib
-
User contributions that may be interesting for special applications.
- org.apache.crunch.contrib.bloomfilter - package org.apache.crunch.contrib.bloomfilter
-
Support for creating Bloom Filters.
- org.apache.crunch.contrib.io.jdbc - package org.apache.crunch.contrib.io.jdbc
-
Support for reading data from an RDBMS using JDBC.
- org.apache.crunch.contrib.text - package org.apache.crunch.contrib.text
-
- org.apache.crunch.examples - package org.apache.crunch.examples
-
Example applications demonstrating various aspects of Crunch.
- org.apache.crunch.fn - package org.apache.crunch.fn
-
Commonly used functions for manipulating collections.
- org.apache.crunch.impl - package org.apache.crunch.impl
-
- org.apache.crunch.impl.dist - package org.apache.crunch.impl.dist
-
- org.apache.crunch.impl.dist.collect - package org.apache.crunch.impl.dist.collect
-
- org.apache.crunch.impl.mem - package org.apache.crunch.impl.mem
-
In-memory Pipeline implementation for rapid prototyping and testing.
- org.apache.crunch.impl.mr - package org.apache.crunch.impl.mr
-
A Pipeline implementation that runs on Hadoop MapReduce.
- org.apache.crunch.impl.spark - package org.apache.crunch.impl.spark
-
- org.apache.crunch.impl.spark.collect - package org.apache.crunch.impl.spark.collect
-
- org.apache.crunch.impl.spark.fn - package org.apache.crunch.impl.spark.fn
-
- org.apache.crunch.impl.spark.serde - package org.apache.crunch.impl.spark.serde
-
- org.apache.crunch.io - package org.apache.crunch.io
-
Data input and output for Pipelines.
- org.apache.crunch.lib - package org.apache.crunch.lib
-
Joining, sorting, aggregating, and other commonly used functionality.
- org.apache.crunch.lib.join - package org.apache.crunch.lib.join
-
Inner and outer joins on collections.
- org.apache.crunch.lib.sort - package org.apache.crunch.lib.sort
-
- org.apache.crunch.test - package org.apache.crunch.test
-
Utilities for testing Crunch-based applications.
- org.apache.crunch.types - package org.apache.crunch.types
-
Common functionality for business object serialization.
- org.apache.crunch.types.avro - package org.apache.crunch.types.avro
-
Business object serialization using Apache Avro.
- org.apache.crunch.types.orc - package org.apache.crunch.types.orc
-
- org.apache.crunch.types.writable - package org.apache.crunch.types.writable
-
Business object serialization using Hadoop's Writables framework.
- org.apache.crunch.util - package org.apache.crunch.util
-
An assorted set of utilities.
- org.apache.hadoop.mapred - package org.apache.hadoop.mapred
-
- outputConf(String, String) - Method in interface org.apache.crunch.Target
-
Adds the given key-value pair to the Configuration instance that is used to write
this Target.
- OutputConverterFunction<K,V,S> - Class in org.apache.crunch.impl.spark.fn
-
- OutputConverterFunction(Converter<K, V, S, ?>) - Constructor for class org.apache.crunch.impl.spark.fn.OutputConverterFunction
-
- OutputHandler - Interface in org.apache.crunch.io
-
- outputKey(S) - Method in interface org.apache.crunch.types.Converter
-
- outputValue(S) - Method in interface org.apache.crunch.types.Converter
-
- override(ReaderWriterFactory) - Method in class org.apache.crunch.types.avro.AvroMode
-
- overridePathProperties(Configuration) - Method in class org.apache.crunch.test.TemporaryPath
-
Set all keys specified in the constructor to temporary directories.
- Pair<K,V> - Class in org.apache.crunch
-
A convenience class for two-element Tuples.
- Pair(K, V) - Constructor for class org.apache.crunch.Pair
-
- PAIR - Static variable in class org.apache.crunch.types.TupleFactory
-
- pair2tupleFunc() - Static method in class org.apache.crunch.impl.spark.GuavaUtils
-
- pairAggregator(Aggregator<V1>, Aggregator<V2>) - Static method in class org.apache.crunch.fn.Aggregators
-
Apply separate aggregators to each component of a Pair.
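Applying separate aggregators component-wise can be sketched in plain Java. As a simplification, each aggregator is modeled here as a `BinaryOperator` that combines two values, and a pair is modeled as a `Map.Entry`; `PairAggregatorSketch` and `pairOf` are hypothetical names and do not reflect the Crunch Aggregator interface.

```java
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.List;
import java.util.Map;
import java.util.function.BinaryOperator;

// Plain-Java illustration of component-wise aggregation; not the Crunch implementation.
final class PairAggregatorSketch {
    // Builds a combiner for pairs from one combiner per component.
    static <A, B> BinaryOperator<Map.Entry<A, B>> pairOf(
            BinaryOperator<A> first, BinaryOperator<B> second) {
        return (x, y) -> new SimpleImmutableEntry<>(
                first.apply(x.getKey(), y.getKey()),       // first components only
                second.apply(x.getValue(), y.getValue())); // second components only
    }

    public static void main(String[] args) {
        // Sum the first components and take the maximum of the second components.
        BinaryOperator<Map.Entry<Integer, Integer>> sumAndMax =
                pairOf(Integer::sum, Integer::max);
        List.of(Map.entry(1, 5), Map.entry(2, 9), Map.entry(3, 4))
                .stream()
                .reduce(sumAndMax)
                .ifPresent(System.out::println);
    }
}
```

The two component aggregations never see each other's data, so any pair of independent reductions can be combined this way.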
- PairFlatMapDoFn<T,K,V> - Class in org.apache.crunch.impl.spark.fn
-
- PairFlatMapDoFn(DoFn<T, Pair<K, V>>, SparkRuntimeContext) - Constructor for class org.apache.crunch.impl.spark.fn.PairFlatMapDoFn
-
- PairMapFn<K,V,S,T> - Class in org.apache.crunch.fn
-
- PairMapFn(MapFn<K, S>, MapFn<V, T>) - Constructor for class org.apache.crunch.fn.PairMapFn
-
- PairMapFunction<K,V,S> - Class in org.apache.crunch.impl.spark.fn
-
- PairMapFunction(MapFn<Pair<K, V>, S>, SparkRuntimeContext) - Constructor for class org.apache.crunch.impl.spark.fn.PairMapFunction
-
- PairMapIterableFunction<K,V,S,T> - Class in org.apache.crunch.impl.spark.fn
-
- PairMapIterableFunction(MapFn<Pair<K, List<V>>, Pair<S, Iterable<T>>>, SparkRuntimeContext) - Constructor for class org.apache.crunch.impl.spark.fn.PairMapIterableFunction
-
- pairs(PType<V1>, PType<V2>) - Static method in class org.apache.crunch.types.avro.Avros
-
- pairs(PType<V1>, PType<V2>) - Method in class org.apache.crunch.types.avro.AvroTypeFamily
-
- pairs(PType<V1>, PType<V2>) - Method in interface org.apache.crunch.types.PTypeFamily
-
- pairs(PType<V1>, PType<V2>) - Static method in class org.apache.crunch.types.writable.Writables
-
- pairs(PType<V1>, PType<V2>) - Method in class org.apache.crunch.types.writable.WritableTypeFamily
-
- parallelDo(DoFn<S, T>, PType<T>) - Method in class org.apache.crunch.impl.dist.collect.PCollectionImpl
-
- parallelDo(String, DoFn<S, T>, PType<T>) - Method in class org.apache.crunch.impl.dist.collect.PCollectionImpl
-
- parallelDo(String, DoFn<S, T>, PType<T>, ParallelDoOptions) - Method in class org.apache.crunch.impl.dist.collect.PCollectionImpl
-
- parallelDo(DoFn<S, Pair<K, V>>, PTableType<K, V>) - Method in class org.apache.crunch.impl.dist.collect.PCollectionImpl
-
- parallelDo(String, DoFn<S, Pair<K, V>>, PTableType<K, V>) - Method in class org.apache.crunch.impl.dist.collect.PCollectionImpl
-
- parallelDo(String, DoFn<S, Pair<K, V>>, PTableType<K, V>, ParallelDoOptions) - Method in class org.apache.crunch.impl.dist.collect.PCollectionImpl
-
- parallelDo(DoFn<S, T>, PType<T>) - Method in interface org.apache.crunch.PCollection
-
Applies the given doFn to the elements of this PCollection and
returns a new PCollection that is the output of this processing.
- parallelDo(String, DoFn<S, T>, PType<T>) - Method in interface org.apache.crunch.PCollection
-
Applies the given doFn to the elements of this PCollection and
returns a new PCollection that is the output of this processing.
- parallelDo(String, DoFn<S, T>, PType<T>, ParallelDoOptions) - Method in interface org.apache.crunch.PCollection
-
Applies the given doFn to the elements of this PCollection and
returns a new PCollection that is the output of this processing.
- parallelDo(DoFn<S, Pair<K, V>>, PTableType<K, V>) - Method in interface org.apache.crunch.PCollection
-
Similar to the other parallelDo instance, but returns a
PTable instance instead of a PCollection.
- parallelDo(String, DoFn<S, Pair<K, V>>, PTableType<K, V>) - Method in interface org.apache.crunch.PCollection
-
Similar to the other parallelDo instance, but returns a
PTable instance instead of a PCollection.
- parallelDo(String, DoFn<S, Pair<K, V>>, PTableType<K, V>, ParallelDoOptions) - Method in interface org.apache.crunch.PCollection
-
Similar to the other parallelDo instance, but returns a
PTable instance instead of a PCollection.
- ParallelDoOptions - Class in org.apache.crunch
-
Container class that includes optional information about a parallelDo operation
applied to a PCollection.
- ParallelDoOptions.Builder - Class in org.apache.crunch
-
- ParallelDoOptions.Builder() - Constructor for class org.apache.crunch.ParallelDoOptions.Builder
-
- parallelism(int) - Static method in class org.apache.crunch.CreateOptions
-
- Parse - Class in org.apache.crunch.contrib.text
-
Methods for parsing instances of PCollection<String> into PCollections of strongly typed
tuples.
- parse(String, PCollection<String>, Extractor<T>) - Static method in class org.apache.crunch.contrib.text.Parse
-
Parses the lines of the input PCollection<String> and returns a PCollection<T> using
the given Extractor<T>.
- parse(String, PCollection<String>, PTypeFamily, Extractor<T>) - Static method in class org.apache.crunch.contrib.text.Parse
-
Parses the lines of the input PCollection<String> and returns a PCollection<T> using
the given Extractor<T> that uses the given PTypeFamily.
- parseTable(String, PCollection<String>, Extractor<Pair<K, V>>) - Static method in class org.apache.crunch.contrib.text.Parse
-
Parses the lines of the input PCollection<String> and returns a PTable<K, V> using
the given Extractor<Pair<K, V>>.
- parseTable(String, PCollection<String>, PTypeFamily, Extractor<Pair<K, V>>) - Static method in class org.apache.crunch.contrib.text.Parse
-
Parses the lines of the input PCollection<String> and returns a PTable<K, V> using
the given Extractor<Pair<K, V>> that uses the given PTypeFamily.
- partition - Variable in class org.apache.crunch.impl.spark.IntByteArray
-
- PartitionedMapOutputFunction<K,V> - Class in org.apache.crunch.impl.spark.fn
-
- PartitionedMapOutputFunction(SerDe<K>, SerDe<V>, PGroupedTableType<K, V>, Class<? extends Partitioner>, int, SparkRuntimeContext) - Constructor for class org.apache.crunch.impl.spark.fn.PartitionedMapOutputFunction
-
- PARTITIONER_PATH - Static variable in class org.apache.crunch.lib.sort.TotalOrderPartitioner
-
- partitionerClass(Class<? extends Partitioner>) - Method in class org.apache.crunch.GroupingOptions.Builder
-
- PartitionUtils - Class in org.apache.crunch.util
-
Helper functions and settings for determining the number of reducers to use in a pipeline
job created by the Crunch planner.
- PartitionUtils() - Constructor for class org.apache.crunch.util.PartitionUtils
-
- PathTarget - Interface in org.apache.crunch.io
-
A target whose output goes to a given path on a file system.
- PCollection<S> - Interface in org.apache.crunch
-
A representation of an immutable, distributed collection of elements that is
the fundamental target of computations in Crunch.
- PCollectionFactory - Interface in org.apache.crunch.impl.dist.collect
-
- PCollectionImpl<S> - Class in org.apache.crunch.impl.dist.collect
-
- PCollectionImpl(String, DistributedPipeline) - Constructor for class org.apache.crunch.impl.dist.collect.PCollectionImpl
-
- PCollectionImpl(String, DistributedPipeline, ParallelDoOptions) - Constructor for class org.apache.crunch.impl.dist.collect.PCollectionImpl
-
- PCollectionImpl.Visitor - Interface in org.apache.crunch.impl.dist.collect
-
- PGroupedTable<K,V> - Interface in org.apache.crunch
-
The Crunch representation of a grouped PTable, which corresponds to the output of
the shuffle phase of a MapReduce job.
- PGroupedTableImpl<K,V> - Class in org.apache.crunch.impl.spark.collect
-
- PGroupedTableType<K,V> - Class in org.apache.crunch.types
-
- PGroupedTableType(PTableType<K, V>) - Constructor for class org.apache.crunch.types.PGroupedTableType
-
- PGroupedTableType.PairIterableMapFn<K,V> - Class in org.apache.crunch.types
-
- PGroupedTableType.PairIterableMapFn(MapFn<Object, K>, MapFn<Object, V>) - Constructor for class org.apache.crunch.types.PGroupedTableType.PairIterableMapFn
-
- Pipeline - Interface in org.apache.crunch
-
Manages the state of a pipeline execution.
- PipelineCallable<Output> - Class in org.apache.crunch
-
A specialization of Callable that executes some sequential logic on the client machine as
part of an overall Crunch pipeline in order to generate zero or more outputs, some of
which may be PCollection instances that are processed by other jobs in the
pipeline.
- PipelineCallable() - Constructor for class org.apache.crunch.PipelineCallable
-
- PipelineCallable.Status - Enum in org.apache.crunch
-
- PipelineExecution - Interface in org.apache.crunch
-
A handle to allow clients to control a Crunch pipeline as it runs.
- PipelineExecution.Status - Enum in org.apache.crunch
-
- PipelineResult - Class in org.apache.crunch
-
Container for the results of a call to run or done on the
Pipeline interface that includes details and statistics about the component
stages of the data pipeline.
- PipelineResult(List<PipelineResult.StageResult>, PipelineExecution.Status) - Constructor for class org.apache.crunch.PipelineResult
-
- PipelineResult.StageResult - Class in org.apache.crunch
-
- PipelineResult.StageResult(String, Counters) - Constructor for class org.apache.crunch.PipelineResult.StageResult
-
- PipelineResult.StageResult(String, Counters, long, long) - Constructor for class org.apache.crunch.PipelineResult.StageResult
-
- PipelineResult.StageResult(String, String, Counters, long, long, long, long) - Constructor for class org.apache.crunch.PipelineResult.StageResult
-
- plan() - Method in class org.apache.crunch.impl.mr.MRPipeline
-
- PObject<T> - Interface in org.apache.crunch
-
A PObject represents a singleton object value that results from a distributed
computation.
- process(S, Emitter<Pair<String, BloomFilter>>) - Method in class org.apache.crunch.contrib.bloomfilter.BloomFilterFn
-
- process(S, Emitter<T>) - Method in class org.apache.crunch.DoFn
-
- process(T, Emitter<T>) - Method in class org.apache.crunch.FilterFn
-
- process(T, Emitter<Double>) - Method in class org.apache.crunch.fn.SDoubleFlatMapFunction
-
- process(T, Emitter<R>) - Method in class org.apache.crunch.fn.SFlatMapFunction
-
- process(Pair<K, V>, Emitter<R>) - Method in class org.apache.crunch.fn.SFlatMapFunction2
-
- process(T, Emitter<Pair<K, V>>) - Method in class org.apache.crunch.fn.SPairFlatMapFunction
-
- process(Pair<Integer, Iterable<Pair<K, V>>>, Emitter<Pair<Integer, Pair<K, V>>>) - Method in class org.apache.crunch.lib.Aggregate.TopKCombineFn
-
- process(Pair<K, V>, Emitter<Pair<Integer, Pair<K, V>>>) - Method in class org.apache.crunch.lib.Aggregate.TopKFn
-
- process(Pair<Pair<K, Integer>, Iterable<Pair<U, V>>>, Emitter<Pair<K, Pair<U, V>>>) - Method in class org.apache.crunch.lib.join.JoinFn
-
Split up the input record to make coding a bit more manageable.
- process(S, Emitter<T>) - Method in class org.apache.crunch.MapFn
-
- Protos - Class in org.apache.crunch.types
-
Utility functions for working with protocol buffers in Crunch.
- Protos() - Constructor for class org.apache.crunch.types.Protos
-
- protos(Class<T>, PTypeFamily) - Static method in class org.apache.crunch.types.PTypes
-
Constructs a PType for the given protocol buffer.
- protos(Class<T>, PTypeFamily, SerializableSupplier<ExtensionRegistry>) - Static method in class org.apache.crunch.types.PTypes
-
Constructs a PType for a protocol buffer, using the given SerializableSupplier to provide
an ExtensionRegistry to use in reading the given protobuf.
- PTable<K,V> - Interface in org.apache.crunch
-
A sub-interface of PCollection that represents an immutable,
distributed multi-map of keys and values.
- PTableBase<K,V> - Class in org.apache.crunch.impl.dist.collect
-
- PTableBase(String, DistributedPipeline) - Constructor for class org.apache.crunch.impl.dist.collect.PTableBase
-
- PTableBase(String, DistributedPipeline, ParallelDoOptions) - Constructor for class org.apache.crunch.impl.dist.collect.PTableBase
-
- PTables - Class in org.apache.crunch.lib
-
Methods for performing common operations on PTables.
- PTables() - Constructor for class org.apache.crunch.lib.PTables
-
- PTableType<K,V> - Interface in org.apache.crunch.types
-
An extension of PType specifically for PTable objects.
- ptype(PType<Pair<V1, V2>>) - Static method in class org.apache.crunch.fn.SwapFn
-
- pType(PType<V>) - Static method in class org.apache.crunch.lib.Quantiles.Result
-
Create a PType for the result type, to be stored as a derived type from Crunch primitives.
- PType<T> - Interface in org.apache.crunch.types
-
A PType defines a mapping between a data type that is used in a Crunch pipeline and a
serialization and storage format that is used to read/write data from/to HDFS.
- PTypeFamily - Interface in org.apache.crunch.types
-
An abstract factory for creating PType instances that have the same
serialization/storage backing format.
- PTypes - Class in org.apache.crunch.types
-
Utility functions for creating common types of derived PTypes, e.g., for JSON
data, protocol buffers, and Thrift records.
- PTypes() - Constructor for class org.apache.crunch.types.PTypes
-
- PTypeUtils - Class in org.apache.crunch.types
-
Utilities for converting between PTypes from different
PTypeFamily implementations.
- read(Source<S>) - Method in class org.apache.crunch.impl.dist.DistributedPipeline
-
- read(Source<S>, String) - Method in class org.apache.crunch.impl.dist.DistributedPipeline
-
- read(TableSource<K, V>) - Method in class org.apache.crunch.impl.dist.DistributedPipeline
-
- read(TableSource<K, V>, String) - Method in class org.apache.crunch.impl.dist.DistributedPipeline
-
- read(Source<T>) - Method in class org.apache.crunch.impl.mem.MemPipeline
-
- read(Source<T>, String) - Method in class org.apache.crunch.impl.mem.MemPipeline
-
- read(TableSource<K, V>) - Method in class org.apache.crunch.impl.mem.MemPipeline
-
- read(TableSource<K, V>, String) - Method in class org.apache.crunch.impl.mem.MemPipeline
-
- read(FileSystem, Path) - Method in interface org.apache.crunch.io.FileReaderFactory
-
- read(Configuration) - Method in interface org.apache.crunch.io.ReadableSource
-
Returns an Iterable that contains the contents of this source.
- read(Source<T>) - Method in interface org.apache.crunch.Pipeline
-
Converts the given Source into a PCollection that is
available to jobs run using this Pipeline instance.
- read(Source<T>, String) - Method in interface org.apache.crunch.Pipeline
-
Converts the given Source into a PCollection that is
available to jobs run using this Pipeline instance.
- read(TableSource<K, V>) - Method in interface org.apache.crunch.Pipeline
-
A version of the read method for TableSource instances that map to
PTables.
- read(TableSource<K, V>, String) - Method in interface org.apache.crunch.Pipeline
-
A version of the read method for TableSource instances that map to
PTables.
- read(TaskInputOutputContext<?, ?, ?, ?>) - Method in interface org.apache.crunch.ReadableData
-
Read the data referenced by this instance within the given context.
- read(Source<T>) - Method in class org.apache.crunch.util.CrunchTool
-
- read(TableSource<K, V>) - Method in class org.apache.crunch.util.CrunchTool
-
- read(TaskInputOutputContext<?, ?, ?, ?>) - Method in class org.apache.crunch.util.DelegatingReadableData
-
- read(Configuration, Path) - Static method in class org.apache.crunch.util.DistCache
-
- read(TaskInputOutputContext<?, ?, ?, ?>) - Method in class org.apache.crunch.util.UnionReadableData
-
- ReadableData<T> - Interface in org.apache.crunch
-
Represents the contents of a data source that can be read on the cluster from within one
of the tasks running as part of a Crunch pipeline.
- ReadableSource<T> - Interface in org.apache.crunch.io
-
An extension of the Source interface that indicates that a
Source instance may be read as a series of records by the client
code.
- ReadableSourceTarget<T> - Interface in org.apache.crunch.io
-
An interface that indicates that a SourceTarget instance can be read
into the local client.
- ReaderWriterFactory - Interface in org.apache.crunch.types.avro
-
Interface for accessing DatumReader, DatumWriter, and Data classes.
- readFields(DataInput) - Method in class org.apache.crunch.contrib.io.jdbc.IdentifiableName
-
- readFields(ResultSet) - Method in class org.apache.crunch.contrib.io.jdbc.IdentifiableName
-
- readFields(DataInput) - Method in class org.apache.crunch.io.FormatBundle
-
- readFields(DataInput) - Method in class org.apache.crunch.types.writable.TupleWritable
-
- readFields(DataInput) - Method in class org.apache.crunch.types.writable.UnionWritable
-
- readTextFile(String) - Method in class org.apache.crunch.impl.dist.DistributedPipeline
-
- readTextFile(String) - Method in class org.apache.crunch.impl.mem.MemPipeline
-
- readTextFile(String) - Method in interface org.apache.crunch.Pipeline
-
A convenience method for reading a text file.
- readTextFile(String) - Method in class org.apache.crunch.util.CrunchTool
-
- records(Class<T>) - Static method in class org.apache.crunch.types.avro.Avros
-
- records(Class<T>) - Method in class org.apache.crunch.types.avro.AvroTypeFamily
-
- records(Class<T>) - Method in interface org.apache.crunch.types.PTypeFamily
-
- records(Class<T>) - Static method in class org.apache.crunch.types.writable.Writables
-
- records(Class<T>) - Method in class org.apache.crunch.types.writable.WritableTypeFamily
-
- reduce(PGroupedTable<K1, V1>, Class<? extends Reducer<K1, V1, K2, V2>>, Class<K2>, Class<V2>) - Static method in class org.apache.crunch.lib.Mapred
-
- reduce(PGroupedTable<K1, V1>, Class<? extends Reducer<K1, V1, K2, V2>>, Class<K2>, Class<V2>) - Static method in class org.apache.crunch.lib.Mapreduce
-
- ReduceGroupingFunction - Class in org.apache.crunch.impl.spark.fn
-
- ReduceGroupingFunction(GroupingOptions, PGroupedTableType, SparkRuntimeContext) - Constructor for class org.apache.crunch.impl.spark.fn.ReduceGroupingFunction
-
- ReduceInputFunction<K,V> - Class in org.apache.crunch.impl.spark.fn
-
- ReduceInputFunction(SerDe<K>, SerDe<V>) - Constructor for class org.apache.crunch.impl.spark.fn.ReduceInputFunction
-
- REFLECT - Static variable in class org.apache.crunch.types.avro.AvroMode
-
Default mode to use for reading and writing Reflect types.
- REFLECT_DATA_FACTORY - Static variable in class org.apache.crunch.types.avro.Avros
-
Deprecated.
as of 0.9.0; use AvroMode.REFLECT.override(ReaderWriterFactory)
- REFLECT_DATA_FACTORY_CLASS - Static variable in class org.apache.crunch.types.avro.Avros
-
The name of the configuration parameter that tracks which reflection
factory to use.
- ReflectDataFactory - Class in org.apache.crunch.types.avro
-
A Factory class for constructing Avro reflection-related objects.
- ReflectDataFactory() - Constructor for class org.apache.crunch.types.avro.ReflectDataFactory
-
- reflects(Class<T>) - Static method in class org.apache.crunch.types.avro.Avros
-
- reflects(Class<T>, Schema) - Static method in class org.apache.crunch.types.avro.Avros
-
- reflects(Class<T>) - Static method in class org.apache.crunch.types.orc.Orcs
-
Create a PType which uses reflection to serialize/deserialize java POJOs
to/from ORC.
- register(Class<T>, AvroType<T>) - Static method in class org.apache.crunch.types.avro.Avros
-
- register(Class<T>, WritableType<T, ? extends Writable>) - Static method in class org.apache.crunch.types.writable.Writables
-
- registerComparable(Class<? extends WritableComparable>) - Static method in class org.apache.crunch.types.writable.Writables
-
Registers a WritableComparable class so that it can be used for comparing the fields inside of
tuple types (e.g., pairs, trips, tupleN, etc.) for use in sorts and
secondary sorts.
- registerComparable(Class<? extends WritableComparable>, int) - Static method in class org.apache.crunch.types.writable.Writables
-
Registers a WritableComparable class with a given integer code to use for serializing
and deserializing instances of this class that are defined inside of tuple types (e.g., pairs,
trips, tupleN, etc.). Unregistered Writables are always serialized to bytes and
cannot be used in comparisons (e.g., sorts and secondary sorts) according to their underlying types.
- REJECT_ALL() - Static method in class org.apache.crunch.fn.FilterFns
-
Reject everything.
- remove() - Method in class org.apache.crunch.util.DoFnIterator
-
- replicas(int) - Method in class org.apache.crunch.CachingOptions.Builder
-
- replicas() - Method in class org.apache.crunch.CachingOptions
-
Returns the number of replicas of the data that should be maintained in the cache.
- requireSortedKeys() - Method in class org.apache.crunch.GroupingOptions.Builder
-
- requireSortedKeys() - Method in class org.apache.crunch.GroupingOptions
-
- reservoirSample(PCollection<T>, int) - Static method in class org.apache.crunch.lib.Sample
-
Select a fixed number of elements from the given PCollection with each element
equally likely to be included in the sample.
- reservoirSample(PCollection<T>, int, Long) - Static method in class org.apache.crunch.lib.Sample
-
A version of the reservoir sampling algorithm that uses a given seed, primarily for
testing purposes.
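The idea behind the two reservoirSample entries can be sketched on a single machine in plain Java. This is only an illustration of classic reservoir sampling (Algorithm R), not the distributed code in org.apache.crunch.lib.Sample; the class ReservoirSketch and its method signature are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Hypothetical single-machine sketch of reservoir sampling (Algorithm R).
// It is NOT the code behind Sample.reservoirSample, only the underlying idea.
final class ReservoirSketch {

  // Returns at most sampleSize elements; every input element is equally
  // likely to be kept. A fixed seed makes the result repeatable in tests.
  static <T> List<T> reservoirSample(Iterable<T> input, int sampleSize, long seed) {
    Random random = new Random(seed);
    List<T> reservoir = new ArrayList<>(sampleSize);
    long seen = 0;
    for (T element : input) {
      seen++;
      if (reservoir.size() < sampleSize) {
        // Fill the reservoir with the first sampleSize elements.
        reservoir.add(element);
      } else {
        // Pick a slot in [0, seen); the element is kept only if the slot
        // falls inside the reservoir, i.e. with probability sampleSize / seen.
        long slot = (long) (random.nextDouble() * seen);
        if (slot < sampleSize) {
          reservoir.set((int) slot, element);
        }
      }
    }
    return reservoir;
  }
}
```

Passing the same seed twice yields the same sample, which is the property the seeded overload is described as exposing for testing.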
- reset() - Method in interface org.apache.crunch.Aggregator
-
Clears the internal state of this Aggregator and prepares it for the
values associated with the next key.
- results() - Method in interface org.apache.crunch.Aggregator
-
Returns the current aggregated state of this instance.
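The reset/results lifecycle described above can be illustrated with a small stand-in. The SimpleAggregator interface below is a hypothetical simplification written for this sketch (its update method is an assumption about how values are fed in); it is not the org.apache.crunch.Aggregator interface itself.

```java
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in showing the per-key lifecycle: reset, update, results.
final class AggregatorSketch {

  // Simplified interface for illustration only.
  interface SimpleAggregator<T> {
    void reset();            // clear state before the values of the next key
    void update(T value);    // fold one value into the running state
    Iterable<T> results();   // expose the current aggregated state
  }

  // Sums long values; one instance is reused across keys.
  static final class SumLongs implements SimpleAggregator<Long> {
    private long sum;

    @Override
    public void reset() {
      sum = 0L;
    }

    @Override
    public void update(Long value) {
      sum += value;
    }

    @Override
    public Iterable<Long> results() {
      return Collections.singletonList(sum);
    }
  }

  // Processes the values of a single key with a reused aggregator.
  static long sumForKey(SimpleAggregator<Long> aggregator, List<Long> values) {
    aggregator.reset();
    for (Long value : values) {
      aggregator.update(value);
    }
    return aggregator.results().iterator().next();
  }
}
```

Calling reset before each key is what lets a single instance be reused without state leaking from one key into the next.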
- ReverseAvroComparator<T> - Class in org.apache.crunch.lib.sort
-
- ReverseAvroComparator() - Constructor for class org.apache.crunch.lib.sort.ReverseAvroComparator
-
- ReverseWritableComparator<T> - Class in org.apache.crunch.lib.sort
-
- ReverseWritableComparator() - Constructor for class org.apache.crunch.lib.sort.ReverseWritableComparator
-
- rightJoin(PTable<K, U>, PTable<K, V>) - Static method in class org.apache.crunch.lib.Join
-
Performs a right outer join on the specified
PTables.
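As a reminder of right outer join semantics, here is a minimal in-memory sketch using java.util.Map. It assumes at most one value per key on each side, which is a simplification; the class RightJoinSketch is hypothetical and unrelated to how Join.rightJoin is implemented.

```java
import java.util.AbstractMap;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical in-memory sketch of right outer join semantics.
final class RightJoinSketch {

  // Every key of the right side appears in the output. The left value is
  // null when the left side has no entry for that key. Keys that exist
  // only on the left side are dropped.
  static <K, U, V> Map<K, Map.Entry<U, V>> rightJoin(Map<K, U> left, Map<K, V> right) {
    Map<K, Map.Entry<U, V>> joined = new LinkedHashMap<>();
    for (Map.Entry<K, V> r : right.entrySet()) {
      U leftValue = left.get(r.getKey()); // null if the key is missing on the left
      joined.put(r.getKey(), new AbstractMap.SimpleEntry<>(leftValue, r.getValue()));
    }
    return joined;
  }
}
```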
- RightOuterJoinFn<K,U,V> - Class in org.apache.crunch.lib.join
-
Used to perform the last step of a right outer join.
- RightOuterJoinFn(PType<K>, PType<U>) - Constructor for class org.apache.crunch.lib.join.RightOuterJoinFn
-
- run(String[]) - Method in class org.apache.crunch.examples.AverageBytesByIP
-
- run(String[]) - Method in class org.apache.crunch.examples.SecondarySortExample
-
- run(String[]) - Method in class org.apache.crunch.examples.SortExample
-
- run(String[]) - Method in class org.apache.crunch.examples.TotalBytesByIP
-
- run(String[]) - Method in class org.apache.crunch.examples.TotalWordCount
-
- run(String[]) - Method in class org.apache.crunch.examples.WordAggregationHBase
-
- run(String[]) - Method in class org.apache.crunch.examples.WordCount
-
- run() - Method in class org.apache.crunch.impl.mem.MemPipeline
-
- run() - Method in class org.apache.crunch.impl.mr.MRPipeline
-
- run() - Method in class org.apache.crunch.impl.spark.SparkPipeline
-
- run() - Method in interface org.apache.crunch.Pipeline
-
Constructs and executes a series of MapReduce jobs in order to write data
to the output targets.
- run() - Method in class org.apache.crunch.util.CrunchTool
-
- runAsync() - Method in class org.apache.crunch.impl.mem.MemPipeline
-
- runAsync() - Method in class org.apache.crunch.impl.mr.MRPipeline
-
- runAsync() - Method in class org.apache.crunch.impl.spark.SparkPipeline
-
- runAsync() - Method in interface org.apache.crunch.Pipeline
-
Constructs and starts a series of MapReduce jobs in order to write data to
the output targets, but returns a ListenableFuture to allow clients to control
job execution.
- runAsync() - Method in class org.apache.crunch.util.CrunchTool
-
- runSingleThreaded() - Method in class org.apache.crunch.PipelineCallable
-
Override this method to indicate to the planner that this instance should not be run at the
same time as any other PipelineCallable instances.
- Sample - Class in org.apache.crunch.lib
-
Methods for performing random sampling in a distributed fashion, either by accepting each
record in a PCollection with an independent probability in order to sample some
fraction of the overall data set, or by using reservoir sampling in order to pull a uniform
or weighted sample of fixed size from a PCollection of an unknown size.
- Sample() - Constructor for class org.apache.crunch.lib.Sample
-
- sample(PCollection<S>, double) - Static method in class org.apache.crunch.lib.Sample
-
Output records from the given PCollection with the given probability.
- sample(PCollection<S>, Long, double) - Static method in class org.apache.crunch.lib.Sample
-
Output records from the given PCollection using a given seed.
- sample(PTable<K, V>, double) - Static method in class org.apache.crunch.lib.Sample
-
A PTable<K, V> analogue of the sample function.
- sample(PTable<K, V>, Long, double) - Static method in class org.apache.crunch.lib.Sample
-
A PTable<K, V> analogue of the sample function, with the seed argument
exposed for testing purposes.
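The probability-based sample entries above amount to an independent accept/reject decision per record. A minimal single-machine sketch follows; the class BernoulliSampleSketch is hypothetical, and only the argument order (seed, then probability) mirrors the signatures listed here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Hypothetical single-machine sketch of per-record probabilistic sampling.
final class BernoulliSampleSketch {

  // Keeps each element independently with the given probability.
  // The seed makes the accept/reject decisions repeatable.
  static <T> List<T> sample(Iterable<T> input, long seed, double probability) {
    Random random = new Random(seed);
    List<T> accepted = new ArrayList<>();
    for (T element : input) {
      // nextDouble() is uniform in [0, 1), so 0.0 rejects all and 1.0 accepts all.
      if (random.nextDouble() < probability) {
        accepted.add(element);
      }
    }
    return accepted;
  }
}
```

Unlike reservoir sampling, the output size is not fixed: it varies around probability times the input size.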
- SAMPLE_UNIQUE_ELEMENTS(int) - Static method in class org.apache.crunch.fn.Aggregators
-
Collect a sample of unique elements from the input, where 'unique' is defined by
the equals method for the input objects.
- scaleFactor() - Method in class org.apache.crunch.DoFn
-
Returns an estimate of how applying this function to a PCollection
will cause it to change in size.
- scaleFactor() - Method in class org.apache.crunch.FilterFn
-
- scaleFactor() - Method in class org.apache.crunch.MapFn
-
- SDoubleFlatMapFunction<T> - Class in org.apache.crunch.fn
-
A Crunch-compatible abstract base class for Spark's DoubleFlatMapFunction.
- SDoubleFlatMapFunction() - Constructor for class org.apache.crunch.fn.SDoubleFlatMapFunction
-
- SDoubleFunction<T> - Class in org.apache.crunch.fn
-
A Crunch-compatible abstract base class for Spark's DoubleFunction.
- SDoubleFunction() - Constructor for class org.apache.crunch.fn.SDoubleFunction
-
- second() - Method in class org.apache.crunch.Pair
-
- second() - Method in class org.apache.crunch.Tuple3
-
- second() - Method in class org.apache.crunch.Tuple4
-
- SecondarySort - Class in org.apache.crunch.lib
-
Utilities for performing a secondary sort on a PTable<K, Pair<V1, V2>> collection.
- SecondarySort() - Constructor for class org.apache.crunch.lib.SecondarySort
-
- SecondarySortExample - Class in org.apache.crunch.examples
-
- SecondarySortExample() - Constructor for class org.apache.crunch.examples.SecondarySortExample
-
- sequenceFile(String, Class<T>) - Static method in class org.apache.crunch.io.At
-
Creates a SourceTarget<T> instance from the SequenceFile(s) at the given path name
from the value field of each key-value pair in the SequenceFile(s).
- sequenceFile(Path, Class<T>) - Static method in class org.apache.crunch.io.At
-
Creates a SourceTarget<T> instance from the SequenceFile(s) at the given Path
from the value field of each key-value pair in the SequenceFile(s).
- sequenceFile(String, PType<T>) - Static method in class org.apache.crunch.io.At
-
Creates a SourceTarget<T> instance from the SequenceFile(s) at the given path name
from the value field of each key-value pair in the SequenceFile(s).
- sequenceFile(Path, PType<T>) - Static method in class org.apache.crunch.io.At
-
Creates a SourceTarget<T> instance from the SequenceFile(s) at the given Path
from the value field of each key-value pair in the SequenceFile(s).
- sequenceFile(String, Class<K>, Class<V>) - Static method in class org.apache.crunch.io.At
-
Creates a TableSourceTarget<K, V> instance from the SequenceFile(s) at the given path name
from the key-value pairs in the SequenceFile(s).
- sequenceFile(Path, Class<K>, Class<V>) - Static method in class org.apache.crunch.io.At
-
Creates a TableSourceTarget<K, V> instance from the SequenceFile(s) at the given Path
from the key-value pairs in the SequenceFile(s).
- sequenceFile(String, PType<K>, PType<V>) - Static method in class org.apache.crunch.io.At
-
Creates a TableSourceTarget<K, V> instance from the SequenceFile(s) at the given path name
from the key-value pairs in the SequenceFile(s).
- sequenceFile(Path, PType<K>, PType<V>) - Static method in class org.apache.crunch.io.At
-
Creates a TableSourceTarget<K, V> instance from the SequenceFile(s) at the given Path
from the key-value pairs in the SequenceFile(s).
- sequenceFile(String, Class<T>) - Static method in class org.apache.crunch.io.From
-
Creates a Source<T> instance from the SequenceFile(s) at the given path name
from the value field of each key-value pair in the SequenceFile(s).
- sequenceFile(Path, Class<T>) - Static method in class org.apache.crunch.io.From
-
Creates a Source<T> instance from the SequenceFile(s) at the given Path
from the value field of each key-value pair in the SequenceFile(s).
- sequenceFile(List<Path>, Class<T>) - Static method in class org.apache.crunch.io.From
-
Creates a Source<T> instance from the SequenceFile(s) at the given Paths
from the value field of each key-value pair in the SequenceFile(s).
- sequenceFile(String, PType<T>) - Static method in class org.apache.crunch.io.From
-
Creates a Source<T> instance from the SequenceFile(s) at the given path name
from the value field of each key-value pair in the SequenceFile(s).
- sequenceFile(Path, PType<T>) - Static method in class org.apache.crunch.io.From
-
Creates a Source<T> instance from the SequenceFile(s) at the given Path
from the value field of each key-value pair in the SequenceFile(s).
- sequenceFile(List<Path>, PType<T>) - Static method in class org.apache.crunch.io.From
-
Creates a Source<T> instance from the SequenceFile(s) at the given Paths
from the value field of each key-value pair in the SequenceFile(s).
- sequenceFile(String, Class<K>, Class<V>) - Static method in class org.apache.crunch.io.From
-
Creates a TableSource<K, V> instance for the SequenceFile(s) at the given path name.
- sequenceFile(Path, Class<K>, Class<V>) - Static method in class org.apache.crunch.io.From
-
Creates a TableSource<K, V> instance for the SequenceFile(s) at the given Path.
- sequenceFile(List<Path>, Class<K>, Class<V>) - Static method in class org.apache.crunch.io.From
-
Creates a TableSource<K, V> instance for the SequenceFile(s) at the given Paths.
- sequenceFile(String, PType<K>, PType<V>) - Static method in class org.apache.crunch.io.From
-
Creates a TableSource<K, V> instance for the SequenceFile(s) at the given path name.
- sequenceFile(Path, PType<K>, PType<V>) - Static method in class org.apache.crunch.io.From
-
Creates a TableSource<K, V> instance for the SequenceFile(s) at the given Path.
- sequenceFile(List<Path>, PType<K>, PType<V>) - Static method in class org.apache.crunch.io.From
-
Creates a TableSource<K, V> instance for the SequenceFile(s) at the given Paths.
- sequenceFile(String) - Static method in class org.apache.crunch.io.To
-
Creates a Target at the given path name that writes data to
SequenceFiles.
- sequenceFile(Path) - Static method in class org.apache.crunch.io.To
-
Creates a Target at the given Path that writes data to
SequenceFiles.
- sequentialDo(String, PipelineCallable<Output>) - Method in class org.apache.crunch.impl.dist.collect.PCollectionImpl
-
- sequentialDo(PipelineCallable<Output>) - Method in class org.apache.crunch.impl.dist.DistributedPipeline
-
- sequentialDo(PipelineCallable<Output>) - Method in class org.apache.crunch.impl.mem.MemPipeline
-
- sequentialDo(String, PipelineCallable<Output>) - Method in interface org.apache.crunch.PCollection
-
Adds the materialized data in this PCollection as a dependency to the given
PipelineCallable and registers it with the Pipeline associated with this
instance.
- sequentialDo(PipelineCallable<Output>) - Method in interface org.apache.crunch.Pipeline
-
Executes the given PipelineCallable on the client after the Targets
that the PipelineCallable depends on (if any) have been created by other pipeline
processing steps.
- SequentialFileNamingScheme - Class in org.apache.crunch.io
-
Default FileNamingScheme that uses an incrementing sequence number in
order to generate unique file names.
- SerDe<T> - Interface in org.apache.crunch.impl.spark.serde
-
- SerDeFactory - Class in org.apache.crunch.impl.spark.serde
-
- SerDeFactory() - Constructor for class org.apache.crunch.impl.spark.serde.SerDeFactory
-
- SerializableSupplier<T> - Interface in org.apache.crunch.util
-
An extension of Guava's Supplier interface that indicates that an instance
will also implement Serializable, which makes this object suitable for use
with Crunch's DoFns when we need to construct an instance of a non-serializable
type for use in processing.
- serialize() - Method in class org.apache.crunch.io.FormatBundle
-
- set(String, String) - Method in class org.apache.crunch.io.FormatBundle
-
- Set - Class in org.apache.crunch.lib
-
Utilities for performing set operations (difference, intersection, etc) on
PCollection instances.
- Set() - Constructor for class org.apache.crunch.lib.Set
-
- set(int, Writable) - Method in class org.apache.crunch.types.writable.TupleWritable
-
- setBreakpoint() - Method in class org.apache.crunch.impl.dist.collect.BaseUnionCollection
-
- setBreakpoint() - Method in class org.apache.crunch.impl.dist.collect.BaseUnionTable
-
- setBreakpoint() - Method in class org.apache.crunch.impl.dist.collect.PCollectionImpl
-
- setCombineFn(CombineFn) - Method in class org.apache.crunch.impl.spark.SparkRuntime
-
- setConf(Broadcast<byte[]>) - Method in class org.apache.crunch.impl.spark.SparkRuntimeContext
-
- setConf(Configuration) - Method in class org.apache.crunch.io.FormatBundle
-
- setConf(Configuration) - Method in class org.apache.crunch.lib.join.JoinUtils.AvroPairGroupingComparator
-
- setConf(Configuration) - Method in class org.apache.crunch.lib.sort.ReverseAvroComparator
-
- setConf(Configuration) - Method in class org.apache.crunch.lib.sort.ReverseWritableComparator
-
- setConf(Configuration) - Method in class org.apache.crunch.lib.sort.TotalOrderPartitioner
-
- setConf(Configuration) - Method in class org.apache.crunch.lib.sort.TupleWritableComparator
-
- setConf(Configuration) - Method in class org.apache.crunch.types.writable.TupleWritable.Comparator
-
- setConf(Configuration) - Method in class org.apache.crunch.types.writable.TupleWritable
-
- setConf(Configuration) - Method in class org.apache.crunch.util.CrunchTool
-
- setConfiguration(Configuration) - Method in class org.apache.crunch.DoFn
-
Called during the setup of an initialized PType that
relies on this instance.
- setConfiguration(Configuration) - Method in class org.apache.crunch.fn.CompositeMapFn
-
- setConfiguration(Configuration) - Method in class org.apache.crunch.fn.ExtractKeyFn
-
- setConfiguration(Configuration) - Method in class org.apache.crunch.fn.PairMapFn
-
- setConfiguration(Configuration) - Method in class org.apache.crunch.impl.dist.DistributedPipeline
-
- setConfiguration(Configuration) - Method in class org.apache.crunch.impl.mem.MemPipeline
-
- setConfiguration(Configuration) - Method in interface org.apache.crunch.Pipeline
-
Set the Configuration to use with this pipeline.
- setContext(TaskInputOutputContext<?, ?, ?, ?>) - Method in class org.apache.crunch.DoFn
-
Called during setup to pass the TaskInputOutputContext to this
DoFn instance.
- setContext(TaskInputOutputContext<?, ?, ?, ?>) - Method in class org.apache.crunch.fn.CompositeMapFn
-
- setContext(TaskInputOutputContext<?, ?, ?, ?>) - Method in class org.apache.crunch.fn.ExtractKeyFn
-
- setContext(TaskInputOutputContext<?, ?, ?, ?>) - Method in class org.apache.crunch.fn.PairMapFn
-
- setContext(TaskInputOutputContext<?, ?, ?, ?>) - Method in class org.apache.crunch.types.PGroupedTableType.PairIterableMapFn
-
- setMessage(String) - Method in class org.apache.crunch.PipelineCallable
-
Sets a message associated with this callable's execution, especially in case of errors.
- setPartitionFile(Configuration, Path) - Static method in class org.apache.crunch.lib.sort.TotalOrderPartitioner
-
- setSpecificClassLoader(ClassLoader) - Static method in class org.apache.crunch.types.avro.AvroMode
-
Set the ClassLoader that will be used for loading Avro org.apache.avro.specific.SpecificRecord
and reflection implementation classes.
- setValue(long) - Method in class org.apache.hadoop.mapred.SparkCounter
-
- SFlatMapFunction<T,R> - Class in org.apache.crunch.fn
-
A Crunch-compatible abstract base class for Spark's FlatMapFunction.
- SFlatMapFunction() - Constructor for class org.apache.crunch.fn.SFlatMapFunction
-
- SFlatMapFunction2<K,V,R> - Class in org.apache.crunch.fn
-
A Crunch-compatible abstract base class for Spark's FlatMapFunction2.
- SFlatMapFunction2() - Constructor for class org.apache.crunch.fn.SFlatMapFunction2
-
- SFunction<T,R> - Class in org.apache.crunch.fn
-
A Crunch-compatible abstract base class for Spark's Function.
- SFunction() - Constructor for class org.apache.crunch.fn.SFunction
-
- SFunction2<K,V,R> - Class in org.apache.crunch.fn
-
A Crunch-compatible abstract base class for Spark's Function2.
- SFunction2() - Constructor for class org.apache.crunch.fn.SFunction2
-
- SFunctions - Class in org.apache.crunch.fn
-
Utility methods for wrapping existing Spark Java API Functions for
Crunch compatibility.
- Shard - Class in org.apache.crunch.lib
-
Utilities for controlling how the data in a PCollection is balanced across reducers
and output files.
- Shard() - Constructor for class org.apache.crunch.lib.Shard
-
- shard(PCollection<T>, int) - Static method in class org.apache.crunch.lib.Shard
-
Creates a PCollection<T> that has the same contents as its input argument but will
be written to a fixed number of output files.
- ShardedJoinStrategy<K,U,V> - Class in org.apache.crunch.lib.join
-
JoinStrategy that splits the key space up into shards.
- ShardedJoinStrategy(int) - Constructor for class org.apache.crunch.lib.join.ShardedJoinStrategy
-
Instantiate with a constant number of shards to use for all keys.
- ShardedJoinStrategy(ShardedJoinStrategy.ShardingStrategy<K>) - Constructor for class org.apache.crunch.lib.join.ShardedJoinStrategy
-
Instantiate with a custom sharding strategy.
- ShardedJoinStrategy.ShardingStrategy<K> - Interface in org.apache.crunch.lib.join
-
Determines over how many shards a key will be split in a sharded join.
- SingleUseIterable<T> - Class in org.apache.crunch.impl
-
Wrapper around a Reducer's input Iterable.
- SingleUseIterable(Iterable<T>) - Constructor for class org.apache.crunch.impl.SingleUseIterable
-
Instantiate around an Iterable that may only be used once.
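A single-use wrapper of this kind can be sketched in a few lines. The class below is hypothetical, and throwing IllegalStateException on a second call to iterator() is an assumption made for the sketch rather than documented behavior of SingleUseIterable.

```java
import java.util.Iterator;

// Hypothetical sketch of an Iterable that refuses to be traversed twice.
final class SingleUseIterableSketch<T> implements Iterable<T> {
  private final Iterable<T> wrapped;
  private boolean used = false;

  SingleUseIterableSketch(Iterable<T> wrapped) {
    this.wrapped = wrapped;
  }

  @Override
  public Iterator<T> iterator() {
    if (used) {
      // Assumed failure mode: fail loudly instead of silently returning
      // an exhausted or restarted iterator.
      throw new IllegalStateException("iterator() may only be called once");
    }
    used = true;
    return wrapped.iterator();
  }
}
```

Failing fast matters for reducer input, where a second pass over the values would typically see nothing or stale data.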
- size() - Method in class org.apache.crunch.Pair
-
- size() - Method in interface org.apache.crunch.Tuple
-
Returns the number of elements in this Tuple.
- size() - Method in class org.apache.crunch.Tuple3
-
- size() - Method in class org.apache.crunch.Tuple4
-
- size() - Method in class org.apache.crunch.TupleN
-
- size() - Method in class org.apache.crunch.types.writable.TupleWritable
-
The number of children in this Tuple.
- skip(String) - Method in class org.apache.crunch.contrib.text.TokenizerFactory.Builder
-
Sets the regular expression that determines which input characters should be
ignored by the Scanner that is returned by the constructed
TokenizerFactory.
- snappy(T) - Static method in class org.apache.crunch.io.Compress
-
Configure the given output target to be compressed using Snappy.
- Sort - Class in org.apache.crunch.lib
-
Utilities for sorting PCollection instances.
- Sort() - Constructor for class org.apache.crunch.lib.Sort
-
- sort(PCollection<T>) - Static method in class org.apache.crunch.lib.Sort
-
Sorts the PCollection using the natural ordering of its elements in ascending order.
- sort(PCollection<T>, Sort.Order) - Static method in class org.apache.crunch.lib.Sort
-
Sorts the PCollection using the natural order of its elements with the given Order.
- sort(PCollection<T>, int, Sort.Order) - Static method in class org.apache.crunch.lib.Sort
-
Sorts the PCollection using the natural ordering of its elements in
the order specified using the given number of reducers.
- sort(PTable<K, V>) - Static method in class org.apache.crunch.lib.Sort
-
Sorts the PTable using the natural ordering of its keys in ascending order.
- sort(PTable<K, V>, Sort.Order) - Static method in class org.apache.crunch.lib.Sort
-
Sorts the PTable using the natural ordering of its keys with the given Order.
- sort(PTable<K, V>, int, Sort.Order) - Static method in class org.apache.crunch.lib.Sort
-
Sorts the PTable using the natural ordering of its keys in the
order specified with a client-specified number of reducers.
- Sort.ColumnOrder - Class in org.apache.crunch.lib
-
To sort by column 2 ascending then column 1 descending, you would use:
sortPairs(coll, by(2, ASCENDING), by(1, DESCENDING))
Column numbering is 1-based.
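The ordering in that example (column 2 ascending, then column 1 descending, 1-based columns) can be reproduced with plain java.util.Comparator. The sketch below represents each pair as an int[] of length 2; the class ColumnOrderSketch and its by and sortPairs helpers are hypothetical stand-ins, not the Crunch methods of the same names.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical sketch of multi-column ordering with 1-based column numbers.
final class ColumnOrderSketch {

  // Builds a comparator for one 1-based column, ascending or descending.
  static Comparator<int[]> by(int column, boolean ascending) {
    Comparator<int[]> comparator = Comparator.comparingInt(row -> row[column - 1]);
    return ascending ? comparator : comparator.reversed();
  }

  // Sorts a copy of the rows by the first ordering, breaking ties with the second.
  static List<int[]> sortPairs(List<int[]> rows,
                               Comparator<int[]> first,
                               Comparator<int[]> second) {
    List<int[]> sorted = new ArrayList<>(rows);
    sorted.sort(first.thenComparing(second));
    return sorted;
  }
}
```

For the rows (1,2), (3,1), (2,2), (5,1), calling sortPairs(rows, by(2, true), by(1, false)) gives (5,1), (3,1), (2,2), (1,2).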
- Sort.ColumnOrder(int, Sort.Order) - Constructor for class org.apache.crunch.lib.Sort.ColumnOrder
-
- Sort.Order - Enum in org.apache.crunch.lib
-
For signaling the order in which a sort should be done.
- sortAndApply(PTable<K, Pair<V1, V2>>, DoFn<Pair<K, Iterable<Pair<V1, V2>>>, T>, PType<T>) - Static method in class org.apache.crunch.lib.SecondarySort
-
Perform a secondary sort on the given PTable instance and then apply a
DoFn to the resulting sorted data to yield an output PCollection<T>.
- sortAndApply(PTable<K, Pair<V1, V2>>, DoFn<Pair<K, Iterable<Pair<V1, V2>>>, T>, PType<T>, int) - Static method in class org.apache.crunch.lib.SecondarySort
-
Perform a secondary sort on the given PTable instance and then apply a
DoFn to the resulting sorted data to yield an output PCollection<T>, using
the given number of reducers.
- sortAndApply(PTable<K, Pair<V1, V2>>, DoFn<Pair<K, Iterable<Pair<V1, V2>>>, Pair<U, V>>, PTableType<U, V>) - Static method in class org.apache.crunch.lib.SecondarySort
-
Perform a secondary sort on the given PTable instance and then apply a
DoFn to the resulting sorted data to yield an output PTable<U, V>.
- sortAndApply(PTable<K, Pair<V1, V2>>, DoFn<Pair<K, Iterable<Pair<V1, V2>>>, Pair<U, V>>, PTableType<U, V>, int) - Static method in class org.apache.crunch.lib.SecondarySort
-
Perform a secondary sort on the given PTable instance and then apply a
DoFn to the resulting sorted data to yield an output PTable<U, V>, using
the given number of reducers.
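The general shape of a secondary sort can be sketched in memory: group records by key, order each group's values by the first element of the value pair, then hand the ordered values to a function. The sketch assumes the sort is on the V1 component, and the SecondarySortSketch class, its Row type, and the choice to emit the V2 values in order are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical in-memory sketch of "group by key, sort values within the group".
final class SecondarySortSketch {

  // One record: a key plus a (v1, v2) value pair.
  static final class Row {
    final String key;
    final int v1;
    final String v2;

    Row(String key, int v1, String v2) {
      this.key = key;
      this.v1 = v1;
      this.v2 = v2;
    }
  }

  // Groups rows by key, sorts each group by v1, and returns the v2 values
  // of each group in that order.
  static Map<String, List<String>> sortAndApply(List<Row> rows) {
    Map<String, List<Row>> grouped = new TreeMap<>();
    for (Row row : rows) {
      grouped.computeIfAbsent(row.key, k -> new ArrayList<>()).add(row);
    }
    Map<String, List<String>> output = new TreeMap<>();
    for (Map.Entry<String, List<Row>> group : grouped.entrySet()) {
      List<Row> values = group.getValue();
      values.sort((Row a, Row b) -> Integer.compare(a.v1, b.v1));
      List<String> ordered = new ArrayList<>();
      for (Row row : values) {
        ordered.add(row.v2);
      }
      output.put(group.getKey(), ordered);
    }
    return output;
  }
}
```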
- sortComparatorClass(Class<? extends RawComparator>) - Method in class org.apache.crunch.GroupingOptions.Builder
-
- SortExample - Class in org.apache.crunch.examples
-
Simple Crunch tool for running sorting examples from the command line.
- SortExample() - Constructor for class org.apache.crunch.examples.SortExample
-
- SortFns - Class in org.apache.crunch.lib.sort
-
A set of DoFns that are used by Crunch's Sort library.
- SortFns() - Constructor for class org.apache.crunch.lib.sort.SortFns
-
- SortFns.AvroGenericFn<V extends Tuple> - Class in org.apache.crunch.lib.sort
-
Pulls a composite set of keys from an Avro GenericRecord instance.
- SortFns.AvroGenericFn(int[], Schema) - Constructor for class org.apache.crunch.lib.sort.SortFns.AvroGenericFn
-
- SortFns.KeyExtraction<V extends Tuple> - Class in org.apache.crunch.lib.sort
-
Utility class for encapsulating key extraction logic and serialization information about
key extraction.
- SortFns.KeyExtraction(PType<V>, Sort.ColumnOrder[]) - Constructor for class org.apache.crunch.lib.sort.SortFns.KeyExtraction
-
- SortFns.SingleKeyFn<V extends Tuple,K> - Class in org.apache.crunch.lib.sort
-
Extracts a single indexed key from a Tuple instance.
- SortFns.SingleKeyFn(int) - Constructor for class org.apache.crunch.lib.sort.SortFns.SingleKeyFn
-
- SortFns.TupleKeyFn<V extends Tuple,K extends Tuple> - Class in org.apache.crunch.lib.sort
-
Extracts a composite key from a Tuple instance.
- SortFns.TupleKeyFn(int[], TupleFactory) - Constructor for class org.apache.crunch.lib.sort.SortFns.TupleKeyFn
-
- sortPairs(PCollection<Pair<U, V>>, Sort.ColumnOrder...) - Static method in class org.apache.crunch.lib.Sort
-
Sorts the PCollection of Pairs using the specified column
ordering.
- sortQuads(PCollection<Tuple4<V1, V2, V3, V4>>, Sort.ColumnOrder...) - Static method in class org.apache.crunch.lib.Sort
-
Sorts the PCollection of Tuple4s using the specified column
ordering.
- sortTriples(PCollection<Tuple3<V1, V2, V3>>, Sort.ColumnOrder...) - Static method in class org.apache.crunch.lib.Sort
-
Sorts the PCollection of Tuple3s using the specified column
ordering.
- sortTuples(PCollection<T>, Sort.ColumnOrder...) - Static method in class org.apache.crunch.lib.Sort
-
Sorts the PCollection of tuples using the specified column ordering.
- sortTuples(PCollection<T>, int, Sort.ColumnOrder...) - Static method in class org.apache.crunch.lib.Sort
-
Sorts the PCollection of TupleNs using the specified column
ordering and a client-specified number of reducers.
- Source<T> - Interface in org.apache.crunch
-
A Source represents an input data set that is an input to one or more
MapReduce jobs.
- sources(Source<?>...) - Method in class org.apache.crunch.ParallelDoOptions.Builder
-
- sources(Collection<Source<?>>) - Method in class org.apache.crunch.ParallelDoOptions.Builder
-
- sourceTarget(SourceTarget<?>) - Method in class org.apache.crunch.GroupingOptions.Builder
-
Deprecated.
- SourceTarget<T> - Interface in org.apache.crunch
-
An interface for classes that implement both the Source and the
Target interfaces.
- SourceTargetHelper - Class in org.apache.crunch.io
-
Functions for configuring the inputs/outputs of MapReduce jobs.
- SourceTargetHelper() - Constructor for class org.apache.crunch.io.SourceTargetHelper
-
- sourceTargets(SourceTarget<?>...) - Method in class org.apache.crunch.GroupingOptions.Builder
-
- sourceTargets(Collection<SourceTarget<?>>) - Method in class org.apache.crunch.GroupingOptions.Builder
-
- sourceTargets(SourceTarget<?>...) - Method in class org.apache.crunch.ParallelDoOptions.Builder
-
- sourceTargets(Collection<SourceTarget<?>>) - Method in class org.apache.crunch.ParallelDoOptions.Builder
-
- SPairFlatMapFunction<T,K,V> - Class in org.apache.crunch.fn
-
A Crunch-compatible abstract base class for Spark's PairFlatMapFunction.
- SPairFlatMapFunction() - Constructor for class org.apache.crunch.fn.SPairFlatMapFunction
-
- SPairFunction<T,K,V> - Class in org.apache.crunch.fn
-
A Crunch-compatible abstract base class for Spark's PairFunction.
- SPairFunction() - Constructor for class org.apache.crunch.fn.SPairFunction
-
- SparkCollectFactory - Class in org.apache.crunch.impl.spark.collect
-
- SparkCollectFactory() - Constructor for class org.apache.crunch.impl.spark.collect.SparkCollectFactory
-
- SparkCollection - Interface in org.apache.crunch.impl.spark
-
- SparkComparator - Class in org.apache.crunch.impl.spark
-
- SparkComparator(GroupingOptions, PGroupedTableType, SparkRuntimeContext) - Constructor for class org.apache.crunch.impl.spark.SparkComparator
-
- SparkCounter - Class in org.apache.hadoop.mapred
-
- SparkCounter(String, String, Accumulator<Map<String, Map<String, Long>>>) - Constructor for class org.apache.hadoop.mapred.SparkCounter
-
- SparkCounter(String, String, long) - Constructor for class org.apache.hadoop.mapred.SparkCounter
-
- SparkPartitioner - Class in org.apache.crunch.impl.spark
-
- SparkPartitioner(int) - Constructor for class org.apache.crunch.impl.spark.SparkPartitioner
-
- SparkPipeline - Class in org.apache.crunch.impl.spark
-
- SparkPipeline(String, String) - Constructor for class org.apache.crunch.impl.spark.SparkPipeline
-
- SparkPipeline(String, String, Class<?>) - Constructor for class org.apache.crunch.impl.spark.SparkPipeline
-
- SparkPipeline(String, String, Class<?>, Configuration) - Constructor for class org.apache.crunch.impl.spark.SparkPipeline
-
- SparkPipeline(JavaSparkContext, String) - Constructor for class org.apache.crunch.impl.spark.SparkPipeline
-
- SparkPipeline(JavaSparkContext, String, Class<?>, Configuration) - Constructor for class org.apache.crunch.impl.spark.SparkPipeline
-
- SparkRuntime - Class in org.apache.crunch.impl.spark
-
- SparkRuntime(SparkPipeline, JavaSparkContext, Configuration, Map<PCollectionImpl<?>, Set<Target>>, Map<PCollectionImpl<?>, MaterializableIterable>, Map<PCollection<?>, StorageLevel>, Map<PipelineCallable<?>, Set<Target>>) - Constructor for class org.apache.crunch.impl.spark.SparkRuntime
-
- SparkRuntimeContext - Class in org.apache.crunch.impl.spark
-
- SparkRuntimeContext(String, Accumulator<Map<String, Map<String, Long>>>, Broadcast<byte[]>) - Constructor for class org.apache.crunch.impl.spark.SparkRuntimeContext
-
- SPECIFIC - Static variable in class org.apache.crunch.types.avro.AvroMode
-
Default mode to use for reading and writing Specific types.
- specifics(Class<T>) - Static method in class org.apache.crunch.types.avro.Avros
-
- split(PCollection<Pair<T, U>>) - Static method in class org.apache.crunch.lib.Channels
-
Splits a PCollection of any Pair of objects into a Pair of PCollections,
to allow for the output of a DoFn to be handled using separate channels.
- split(PCollection<Pair<T, U>>, PType<T>, PType<U>) - Static method in class org.apache.crunch.lib.Channels
-
Splits a PCollection of any Pair of objects into a Pair of PCollections,
to allow for the output of a DoFn to be handled using separate channels.
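As a rough illustration of what splitting into two channels means, the sketch below uses plain java.util lists in place of PCollection and Map.Entry in place of Pair. The class ChannelSplitSketch and its split helper are hypothetical names, not part of Crunch, and skipping null components is an assumption of this sketch rather than documented behaviour.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in, not the Crunch API: List plays the role of
// PCollection and Map.Entry plays the role of Pair.
class ChannelSplitSketch {

    // Splits a list of (first, second) pairs into a pair of lists.
    // Assumption of this sketch: a null component is skipped, so one
    // producer can route a record to only one of the two channels.
    static <T, U> Map.Entry<List<T>, List<U>> split(List<Map.Entry<T, U>> pairs) {
        List<T> firsts = new ArrayList<>();
        List<U> seconds = new ArrayList<>();
        for (Map.Entry<T, U> pair : pairs) {
            if (pair.getKey() != null) {
                firsts.add(pair.getKey());
            }
            if (pair.getValue() != null) {
                seconds.add(pair.getValue());
            }
        }
        return new SimpleEntry<>(firsts, seconds);
    }

    public static void main(String[] args) {
        List<Map.Entry<String, Integer>> output = Arrays.asList(
                new SimpleEntry<String, Integer>("good-record", null),
                new SimpleEntry<String, Integer>(null, 404),
                new SimpleEntry<String, Integer>("another-record", 500));

        Map.Entry<List<String>, List<Integer>> channels = split(output);
        System.out.println(channels.getKey());   // first channel
        System.out.println(channels.getValue()); // second channel
    }
}
```

In this sketch the first channel collects the non-null String components and the second collects the non-null Integer components, so each can be processed independently afterwards.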
- status - Variable in class org.apache.crunch.PipelineResult
-
- STRING_CONCAT(String, boolean) - Static method in class org.apache.crunch.fn.Aggregators
-
Concatenate strings, with a separator between strings.
- STRING_CONCAT(String, boolean, long, long) - Static method in class org.apache.crunch.fn.Aggregators
-
Concatenate strings, with a separator between strings.
- STRING_TO_UTF8 - Static variable in class org.apache.crunch.types.avro.Avros
-
- strings() - Static method in class org.apache.crunch.types.avro.Avros
-
- strings() - Method in class org.apache.crunch.types.avro.AvroTypeFamily
-
- strings() - Method in interface org.apache.crunch.types.PTypeFamily
-
- strings() - Static method in class org.apache.crunch.types.writable.Writables
-
- strings() - Method in class org.apache.crunch.types.writable.WritableTypeFamily
-
- succeeded() - Method in class org.apache.crunch.PipelineResult
-
- SUM_BIGINTS() - Static method in class org.apache.crunch.fn.Aggregators
-
- SUM_DOUBLES() - Static method in class org.apache.crunch.fn.Aggregators
-
Sum up all double values.
- SUM_FLOATS() - Static method in class org.apache.crunch.fn.Aggregators
-
Sum up all float values.
- SUM_INTS() - Static method in class org.apache.crunch.fn.Aggregators
-
Sum up all int values.
- SUM_LONGS() - Static method in class org.apache.crunch.fn.Aggregators
-
Sum up all long values.
- SwapFn<V1,V2> - Class in org.apache.crunch.fn
-
Swap the elements of a Pair type.
- SwapFn() - Constructor for class org.apache.crunch.fn.SwapFn
-
- swapKeyValue(PTable<K, V>) - Static method in class org.apache.crunch.lib.PTables
-
Swap the key and value part of a table.
- tableOf(S, T, Object...) - Static method in class org.apache.crunch.impl.mem.MemPipeline
-
- tableOf(Iterable<Pair<S, T>>) - Static method in class org.apache.crunch.impl.mem.MemPipeline
-
- tableOf(PType<K>, PType<V>) - Static method in class org.apache.crunch.types.avro.Avros
-
A table type with an Avro type as key and as value.
- tableOf(PType<K>, PType<V>) - Method in class org.apache.crunch.types.avro.AvroTypeFamily
-
- tableOf(PType<K>, PType<V>) - Method in interface org.apache.crunch.types.PTypeFamily
-
- tableOf(PType<K>, PType<V>) - Static method in class org.apache.crunch.types.writable.Writables
-
- tableOf(PType<K>, PType<V>) - Method in class org.apache.crunch.types.writable.WritableTypeFamily
-
- TableSource<K,V> - Interface in org.apache.crunch
-
The interface for Source implementations that return a PTable.
- TableSourceTarget<K,V> - Interface in org.apache.crunch
-
An interface for classes that implement both the TableSource and the
Target interfaces.
- tableType(PTableType<K, V>) - Static method in class org.apache.crunch.fn.SwapFn
-
- Target - Interface in org.apache.crunch
-
A Target represents the output destination of a Crunch PCollection
in the context of a Crunch job.
- Target.WriteMode - Enum in org.apache.crunch
-
An enum to represent different options the client may specify
for handling the case where the output path, table, etc. already exists.
- targets(Target...) - Method in class org.apache.crunch.ParallelDoOptions.Builder
-
- targets(Collection<Target>) - Method in class org.apache.crunch.ParallelDoOptions.Builder
-
- tempDir - Variable in class org.apache.crunch.test.CrunchTestSupport
-
- TemporaryPath - Class in org.apache.crunch.test
-
Creates a temporary directory for a test case and destroys it afterwards.
- TemporaryPath(String...) - Constructor for class org.apache.crunch.test.TemporaryPath
-
Construct TemporaryPath.
- TestCounters - Class in org.apache.crunch.test
-
A utility class used during unit testing to update and read counters.
- TestCounters() - Constructor for class org.apache.crunch.test.TestCounters
-
- textFile(String) - Static method in class org.apache.crunch.io.At
-
Creates a SourceTarget<String> instance for the text file(s) at the given path name.
- textFile(Path) - Static method in class org.apache.crunch.io.At
-
Creates a SourceTarget<String> instance for the text file(s) at the given Path.
- textFile(String, PType<T>) - Static method in class org.apache.crunch.io.At
-
Creates a SourceTarget<T> instance for the text file(s) at the given path name using
the provided PType<T> to convert the input text.
- textFile(Path, PType<T>) - Static method in class org.apache.crunch.io.At
-
Creates a SourceTarget<T> instance for the text file(s) at the given Path using
the provided PType<T> to convert the input text.
- textFile(String) - Static method in class org.apache.crunch.io.From
-
Creates a Source<String> instance for the text file(s) at the given path name.
- textFile(Path) - Static method in class org.apache.crunch.io.From
-
Creates a Source<String> instance for the text file(s) at the given Path.
- textFile(List<Path>) - Static method in class org.apache.crunch.io.From
-
Creates a Source<String> instance for the text file(s) at the given Paths.
- textFile(String, PType<T>) - Static method in class org.apache.crunch.io.From
-
Creates a Source<T> instance for the text file(s) at the given path name using
the provided PType<T> to convert the input text.
- textFile(Path, PType<T>) - Static method in class org.apache.crunch.io.From
-
Creates a Source<T> instance for the text file(s) at the given Path using
the provided PType<T> to convert the input text.
- textFile(List<Path>, PType<T>) - Static method in class org.apache.crunch.io.From
-
Creates a Source<T> instance for the text file(s) at the given Paths using
the provided PType<T> to convert the input text.
- textFile(String) - Static method in class org.apache.crunch.io.To
-
Creates a Target at the given path name that writes data to
text files.
- textFile(Path) - Static method in class org.apache.crunch.io.To
-
Creates a Target at the given Path that writes data to
text files.
- third() - Method in class org.apache.crunch.Tuple3
-
- third() - Method in class org.apache.crunch.Tuple4
-
- thrifts(Class<T>, PTypeFamily) - Static method in class org.apache.crunch.types.PTypes
-
Constructs a PType for a Thrift record.
- To - Class in org.apache.crunch.io
-
Static factory methods for creating common Target types.
- To() - Constructor for class org.apache.crunch.io.To
-
- ToByteArrayFunction - Class in org.apache.crunch.impl.spark.collect
-
- ToByteArrayFunction() - Constructor for class org.apache.crunch.impl.spark.collect.ToByteArrayFunction
-
- toBytes(T) - Method in class org.apache.crunch.impl.spark.serde.AvroSerDe
-
- toBytes(T) - Method in interface org.apache.crunch.impl.spark.serde.SerDe
-
- toBytes(Writable) - Method in class org.apache.crunch.impl.spark.serde.WritableSerDe
-
- toCombineFn(Aggregator<V>) - Static method in class org.apache.crunch.fn.Aggregators
-
- toCombineFn(Aggregator<V>, PType<V>) - Static method in class org.apache.crunch.fn.Aggregators
-
Wrap a CombineFn adapter around the given aggregator.
- Tokenizer - Class in org.apache.crunch.contrib.text
-
Manages a Scanner instance and provides support for returning only a subset
of the fields returned by the underlying Scanner.
- Tokenizer(Scanner, Set<Integer>, boolean) - Constructor for class org.apache.crunch.contrib.text.Tokenizer
-
Create a new Tokenizer instance.
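The idea of returning only a subset of a Scanner's fields can be sketched with the JDK alone. The class FieldSubsetSketch and its select helper below are hypothetical and do not reproduce the Tokenizer API; the zero-based field positions and the meaning of the boolean flag (true keeps the listed positions, false drops them) are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Scanner;
import java.util.Set;

// Hypothetical helper, not the Crunch Tokenizer: filters the tokens of a
// java.util.Scanner by their position in the input.
class FieldSubsetSketch {

    // Reads every remaining token from the scanner. When keep is true,
    // only tokens whose zero-based position is in indices are returned;
    // when keep is false, those positions are dropped instead.
    static List<String> select(Scanner scanner, Set<Integer> indices, boolean keep) {
        List<String> fields = new ArrayList<>();
        int position = 0;
        while (scanner.hasNext()) {
            String token = scanner.next();
            if (indices.contains(position) == keep) {
                fields.add(token);
            }
            position++;
        }
        return fields;
    }

    public static void main(String[] args) {
        Set<Integer> indices = new HashSet<>(Arrays.asList(0, 2));

        Scanner keepScanner = new Scanner("alice,30,paris,fr").useDelimiter(",");
        System.out.println(select(keepScanner, indices, true));  // keeps positions 0 and 2

        Scanner dropScanner = new Scanner("alice,30,paris,fr").useDelimiter(",");
        System.out.println(select(dropScanner, indices, false)); // drops positions 0 and 2
    }
}
```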
- TokenizerFactory - Class in org.apache.crunch.contrib.text
-
Factory class that constructs Tokenizer instances for input strings that use a
fixed set of delimiters, skip patterns, locales, and sets of indices to keep or drop.
- TokenizerFactory.Builder - Class in org.apache.crunch.contrib.text
-
A class for constructing new TokenizerFactory instances using the Builder pattern.
- TokenizerFactory.Builder() - Constructor for class org.apache.crunch.contrib.text.TokenizerFactory.Builder
-
- top(int) - Method in class org.apache.crunch.impl.dist.collect.PTableBase
-
- top(PTable<K, V>, int, boolean) - Static method in class org.apache.crunch.lib.Aggregate
-
Selects the top N pairs from the given table, with sorting being performed on the values (i.e.
- top(int) - Method in interface org.apache.crunch.PTable
-
Returns a PTable made up of the pairs in this PTable with the largest value
field.
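Selecting the pairs with the largest value field amounts to a sort by value followed by a cut-off. The sketch below shows that idea on an ordinary java.util.Map; TopByValueSketch and its top helper are hypothetical names, a Map (unique keys) is only a simplified stand-in for a PTable, and a non-negative count is assumed.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in, not the Crunch API: a Map plays the role of a
// PTable, and the result is a list of its entries.
class TopByValueSketch {

    // Returns at most count entries of the table, largest values first.
    // Assumes count is non-negative.
    static <K, V extends Comparable<V>> List<Map.Entry<K, V>> top(Map<K, V> table, int count) {
        List<Map.Entry<K, V>> entries = new ArrayList<>(table.entrySet());
        // Sort by the value field, in descending order.
        entries.sort(Map.Entry.<K, V>comparingByValue().reversed());
        return new ArrayList<>(entries.subList(0, Math.min(count, entries.size())));
    }

    public static void main(String[] args) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        counts.put("a", 3);
        counts.put("b", 7);
        counts.put("c", 5);

        // Prints the two entries with the largest values.
        System.out.println(top(counts, 2));
    }
}
```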
- TopList - Class in org.apache.crunch.lib
-
Tools for creating top lists of items in PTables and PCollections.
- TopList() - Constructor for class org.apache.crunch.lib.TopList
-
- topNYbyX(PTable<X, Y>, int) - Static method in class org.apache.crunch.lib.TopList
-
Create a top-list of elements in the provided PTable, categorised by the key of the input table and using the count
of the value part of the input table.
- toString() - Method in class org.apache.crunch.impl.dist.collect.PCollectionImpl
-
- toString() - Method in class org.apache.crunch.lib.Sort.ColumnOrder
-
- toString() - Method in class org.apache.crunch.Pair
-
- toString() - Method in class org.apache.crunch.Tuple3
-
- toString() - Method in class org.apache.crunch.Tuple4
-
- toString() - Method in class org.apache.crunch.TupleN
-
- toString() - Method in class org.apache.crunch.types.writable.TupleWritable
-
Convert Tuple to String as in the following.
- TotalBytesByIP - Class in org.apache.crunch.examples
-
- TotalBytesByIP() - Constructor for class org.apache.crunch.examples.TotalBytesByIP
-
- TotalOrderPartitioner<K,V> - Class in org.apache.crunch.lib.sort
-
A partition-aware Partitioner instance that can work with either Avro or Writable-formatted
keys.
- TotalOrderPartitioner() - Constructor for class org.apache.crunch.lib.sort.TotalOrderPartitioner
-
- TotalWordCount - Class in org.apache.crunch.examples
-
- TotalWordCount() - Constructor for class org.apache.crunch.examples.TotalWordCount
-
- tripAggregator(Aggregator<V1>, Aggregator<V2>, Aggregator<V3>) - Static method in class org.apache.crunch.fn.Aggregators
-
Apply separate aggregators to each component of a Tuple3.
- triples(PType<V1>, PType<V2>, PType<V3>) - Static method in class org.apache.crunch.types.avro.Avros
-
- triples(PType<V1>, PType<V2>, PType<V3>) - Method in class org.apache.crunch.types.avro.AvroTypeFamily
-
- triples(PType<V1>, PType<V2>, PType<V3>) - Method in interface org.apache.crunch.types.PTypeFamily
-
- triples(PType<V1>, PType<V2>, PType<V3>) - Static method in class org.apache.crunch.types.writable.Writables
-
- triples(PType<V1>, PType<V2>, PType<V3>) - Method in class org.apache.crunch.types.writable.WritableTypeFamily
-
- Tuple - Interface in org.apache.crunch
-
A fixed-size collection of Objects, used in Crunch for representing joins
between PCollections.
- Tuple2MapFunction<K,V> - Class in org.apache.crunch.impl.spark.fn
-
- Tuple2MapFunction(MapFn<Pair<K, V>, Pair<K, V>>, SparkRuntimeContext) - Constructor for class org.apache.crunch.impl.spark.fn.Tuple2MapFunction
-
- tuple2PairFunc() - Static method in class org.apache.crunch.impl.spark.GuavaUtils
-
- Tuple3<V1,V2,V3> - Class in org.apache.crunch
-
A convenience class for three-element Tuples.
- Tuple3(V1, V2, V3) - Constructor for class org.apache.crunch.Tuple3
-
- TUPLE3 - Static variable in class org.apache.crunch.types.TupleFactory
-
- Tuple3.Collect<V1,V2,V3> - Class in org.apache.crunch
-
- Tuple3.Collect(Collection<V1>, Collection<V2>, Collection<V3>) - Constructor for class org.apache.crunch.Tuple3.Collect
-
- Tuple4<V1,V2,V3,V4> - Class in org.apache.crunch
-
A convenience class for four-element Tuples.
- Tuple4(V1, V2, V3, V4) - Constructor for class org.apache.crunch.Tuple4
-
- TUPLE4 - Static variable in class org.apache.crunch.types.TupleFactory
-
- Tuple4.Collect<V1,V2,V3,V4> - Class in org.apache.crunch
-
- Tuple4.Collect(Collection<V1>, Collection<V2>, Collection<V3>, Collection<V4>) - Constructor for class org.apache.crunch.Tuple4.Collect
-
- tupleAggregator(Aggregator<?>...) - Static method in class org.apache.crunch.fn.Aggregators
-
Apply separate aggregators to each component of a Tuple.
- TupleDeepCopier<T extends Tuple> - Class in org.apache.crunch.types
-
Performs deep copies (based on underlying PType deep copying) of Tuple-based objects.
- TupleDeepCopier(Class<T>, PType...) - Constructor for class org.apache.crunch.types.TupleDeepCopier
-
- TupleFactory<T extends Tuple> - Class in org.apache.crunch.types
-
- TupleFactory() - Constructor for class org.apache.crunch.types.TupleFactory
-
- TupleN - Class in org.apache.crunch
-
A Tuple instance for an arbitrary number of values.
- TupleN(Object...) - Constructor for class org.apache.crunch.TupleN
-
- TUPLEN - Static variable in class org.apache.crunch.types.TupleFactory
-
- TupleObjectInspector<T extends Tuple> - Class in org.apache.crunch.types.orc
-
An object inspector to define the structure of Crunch Tuples.
- TupleObjectInspector(TupleFactory<T>, PType...) - Constructor for class org.apache.crunch.types.orc.TupleObjectInspector
-
- tuples(PType...) - Static method in class org.apache.crunch.types.avro.Avros
-
- tuples(Class<T>, PType...) - Static method in class org.apache.crunch.types.avro.Avros
-
- tuples(PType<?>...) - Method in class org.apache.crunch.types.avro.AvroTypeFamily
-
- tuples(Class<T>, PType<?>...) - Method in class org.apache.crunch.types.avro.AvroTypeFamily
-
- tuples(PType...) - Static method in class org.apache.crunch.types.orc.Orcs
-
Create a tuple-based PType.
- tuples(PType<?>...) - Method in interface org.apache.crunch.types.PTypeFamily
-
- tuples(Class<T>, PType<?>...) - Method in interface org.apache.crunch.types.PTypeFamily
-
- tuples(PType...) - Static method in class org.apache.crunch.types.writable.Writables
-
- tuples(Class<T>, PType...) - Static method in class org.apache.crunch.types.writable.Writables
-
- tuples(PType<?>...) - Method in class org.apache.crunch.types.writable.WritableTypeFamily
-
- tuples(Class<T>, PType<?>...) - Method in class org.apache.crunch.types.writable.WritableTypeFamily
-
- Tuples - Class in org.apache.crunch.util
-
Utilities for working with subclasses of the Tuple interface.
- Tuples() - Constructor for class org.apache.crunch.util.Tuples
-
- Tuples.PairIterable<S,T> - Class in org.apache.crunch.util
-
- Tuples.PairIterable(Iterable<S>, Iterable<T>) - Constructor for class org.apache.crunch.util.Tuples.PairIterable
-
- Tuples.QuadIterable<A,B,C,D> - Class in org.apache.crunch.util
-
- Tuples.QuadIterable(Iterable<A>, Iterable<B>, Iterable<C>, Iterable<D>) - Constructor for class org.apache.crunch.util.Tuples.QuadIterable
-
- Tuples.TripIterable<A,B,C> - Class in org.apache.crunch.util
-
- Tuples.TripIterable(Iterable<A>, Iterable<B>, Iterable<C>) - Constructor for class org.apache.crunch.util.Tuples.TripIterable
-
- Tuples.TupleNIterable - Class in org.apache.crunch.util
-
- Tuples.TupleNIterable(Iterable<?>...) - Constructor for class org.apache.crunch.util.Tuples.TupleNIterable
-
- TupleWritable - Class in org.apache.crunch.types.writable
-
A serialization format for Tuple.
- TupleWritable() - Constructor for class org.apache.crunch.types.writable.TupleWritable
-
Create an empty tuple with no allocated storage for writables.
- TupleWritable(Writable[]) - Constructor for class org.apache.crunch.types.writable.TupleWritable
-
- TupleWritable(Writable[], int[]) - Constructor for class org.apache.crunch.types.writable.TupleWritable
-
Initialize tuple with storage; unknown whether any of them contain
"written" values.
- TupleWritable.Comparator - Class in org.apache.crunch.types.writable
-
- TupleWritable.Comparator() - Constructor for class org.apache.crunch.types.writable.TupleWritable.Comparator
-
- TupleWritableComparator - Class in org.apache.crunch.lib.sort
-
- TupleWritableComparator() - Constructor for class org.apache.crunch.lib.sort.TupleWritableComparator
-
- typedCollectionOf(PType<T>, T...) - Static method in class org.apache.crunch.impl.mem.MemPipeline
-
- typedCollectionOf(PType<T>, Iterable<T>) - Static method in class org.apache.crunch.impl.mem.MemPipeline
-
- typedTableOf(PTableType<S, T>, S, T, Object...) - Static method in class org.apache.crunch.impl.mem.MemPipeline
-
- typedTableOf(PTableType<S, T>, Iterable<Pair<S, T>>) - Static method in class org.apache.crunch.impl.mem.MemPipeline
-