- AbstractCompositeExtractor<T> - Class in org.apache.crunch.contrib.text
-
Base class for Extractor
instances that delegates the parsing of fields to other
Extractor
instances, primarily used for constructing composite records that implement
the Tuple
interface.
- AbstractCompositeExtractor(TokenizerFactory, List<Extractor<?>>) - Constructor for class org.apache.crunch.contrib.text.AbstractCompositeExtractor
-
- AbstractSimpleExtractor<T> - Class in org.apache.crunch.contrib.text
-
Base class for the common case Extractor
instances that construct a single
object from a block of text stored in a String
, with support for error handling
and reporting.
- accept(T) - Method in class org.apache.crunch.FilterFn
-
If true, emit the given record.
- accept(PCollectionImpl.Visitor) - Method in class org.apache.crunch.impl.dist.collect.PCollectionImpl
-
- accept(OutputHandler, PType<?>) - Method in interface org.apache.crunch.Target
-
Checks to see if this Target
instance is compatible with the
given PType
.
- ACCEPT_ALL() - Static method in class org.apache.crunch.fn.FilterFns
-
Accept everything.
- addAccumulator(Map<String, Map<String, Long>>, Map<String, Map<String, Long>>) - Method in class org.apache.crunch.impl.spark.CounterAccumulatorParam
-
- addCacheFile(Path, Configuration) - Static method in class org.apache.crunch.util.DistCache
-
- addCompletionHook(CrunchControlledJob.Hook) - Method in class org.apache.crunch.impl.mr.MRPipeline
-
- addInPlace(Map<String, Map<String, Long>>, Map<String, Map<String, Long>>) - Method in class org.apache.crunch.impl.spark.CounterAccumulatorParam
-
- addInputPath(Job, Path, FormatBundle, int) - Static method in class org.apache.crunch.io.CrunchInputs
-
- addInputPaths(Job, Collection<Path>, FormatBundle, int) - Static method in class org.apache.crunch.io.CrunchInputs
-
- addJarDirToDistributedCache(Configuration, File) - Static method in class org.apache.crunch.util.DistCache
-
Adds all jars under the specified directory to the distributed cache of
jobs using the provided configuration.
- addJarDirToDistributedCache(Configuration, String) - Static method in class org.apache.crunch.util.DistCache
-
Adds all jars under the directory at the specified path to the distributed
cache of jobs using the provided configuration.
- addJarToDistributedCache(Configuration, File) - Static method in class org.apache.crunch.util.DistCache
-
Adds the specified jar to the distributed cache of jobs using the provided
configuration.
- addJarToDistributedCache(Configuration, String) - Static method in class org.apache.crunch.util.DistCache
-
Adds the jar at the specified path to the distributed cache of jobs using
the provided configuration.
- addNamedOutput(Job, String, Class<? extends OutputFormat>, Class, Class) - Static method in class org.apache.crunch.io.CrunchOutputs
-
- addNamedOutput(Job, String, FormatBundle<? extends OutputFormat>, Class, Class) - Static method in class org.apache.crunch.io.CrunchOutputs
-
- addPrepareHook(CrunchControlledJob.Hook) - Method in class org.apache.crunch.impl.mr.MRPipeline
-
- age - Variable in class org.apache.crunch.test.Person
-
Deprecated.
- aggregate(Aggregator<S>) - Method in class org.apache.crunch.impl.dist.collect.PCollectionImpl
-
- Aggregate - Class in org.apache.crunch.lib
-
Methods for performing various types of aggregations over
PCollection
instances.
- Aggregate() - Constructor for class org.apache.crunch.lib.Aggregate
-
- aggregate(PCollection<S>, Aggregator<S>) - Static method in class org.apache.crunch.lib.Aggregate
-
- aggregate(Aggregator<S>) - Method in interface org.apache.crunch.PCollection
-
Returns a PCollection
that contains the result of aggregating all values in this instance.
- Aggregate.PairValueComparator<K,V> - Class in org.apache.crunch.lib
-
- Aggregate.TopKCombineFn<K,V> - Class in org.apache.crunch.lib
-
- Aggregate.TopKFn<K,V> - Class in org.apache.crunch.lib
-
- Aggregator<T> - Interface in org.apache.crunch
-
Aggregate a sequence of values into a possibly smaller sequence of the same type.
- Aggregators - Class in org.apache.crunch.fn
-
- Aggregators.SimpleAggregator<T> - Class in org.apache.crunch.fn
-
Base class for aggregators that do not require any initialization.
- and(FilterFn<S>, FilterFn<S>) - Static method in class org.apache.crunch.fn.FilterFns
-
Accept an entry if all of the given filters accept it, using short-circuit evaluation.
- and(FilterFn<S>...) - Static method in class org.apache.crunch.fn.FilterFns
-
Accept an entry if all of the given filters accept it, using short-circuit evaluation.
- apply(Statement, Description) - Method in class org.apache.crunch.test.TemporaryPath
-
- applyPTypeTransforms() - Method in interface org.apache.crunch.types.Converter
-
If true, convert the inputs or outputs from this Converter
instance
before (for outputs) or after (for inputs) using the associated PType#getInputMapFn
and PType#getOutputMapFn calls.
- as(PType<T>) - Method in class org.apache.crunch.types.avro.AvroTypeFamily
-
- as(PType<T>) - Method in interface org.apache.crunch.types.PTypeFamily
-
Returns the equivalent of the given ptype for this family, if it exists.
- as(PType<T>) - Method in class org.apache.crunch.types.writable.WritableTypeFamily
-
- asCollection() - Method in class org.apache.crunch.impl.dist.collect.PCollectionImpl
-
- asCollection() - Method in interface org.apache.crunch.PCollection
-
- asMap() - Method in class org.apache.crunch.impl.dist.collect.PTableBase
-
Returns a PObject encapsulating a Map made up of the keys and values in this PTable.
- asMap() - Method in interface org.apache.crunch.PTable
-
Returns a PObject encapsulating a Map made up of the keys and values in this PTable.
- asPTable(PCollection<Pair<K, V>>) - Static method in class org.apache.crunch.lib.PTables
-
Convert the given PCollection<Pair<K, V>>
to a PTable<K, V>
.
- asReadable(boolean) - Method in class org.apache.crunch.impl.dist.collect.PCollectionImpl
-
- asReadable() - Method in interface org.apache.crunch.io.ReadableSource
-
- asReadable(boolean) - Method in interface org.apache.crunch.PCollection
-
- asSourceTarget(PType<T>) - Method in interface org.apache.crunch.Target
-
Attempt to create the SourceTarget
type that corresponds to this Target
for the given PType
, if possible.
- At - Class in org.apache.crunch.io
-
Static factory methods for creating common SourceTarget types, which may be treated as both a Source and a Target.
- At() - Constructor for class org.apache.crunch.io.At
-
- Average - Class in org.apache.crunch.lib
-
- Average() - Constructor for class org.apache.crunch.lib.Average
-
- AverageBytesByIP - Class in org.apache.crunch.examples
-
- AverageBytesByIP() - Constructor for class org.apache.crunch.examples.AverageBytesByIP
-
- AVRO_MODE_PROPERTY - Static variable in class org.apache.crunch.types.avro.AvroMode
-
- AVRO_SHUFFLE_MODE_PROPERTY - Static variable in class org.apache.crunch.types.avro.AvroMode
-
- AvroDerivedValueDeepCopier<T,S> - Class in org.apache.crunch.types.avro
-
A DeepCopier specific to Avro derived types.
- AvroDerivedValueDeepCopier(MapFn<T, S>, MapFn<S, T>, AvroType<S>) - Constructor for class org.apache.crunch.types.avro.AvroDerivedValueDeepCopier
-
- avroFile(String, Class<T>) - Static method in class org.apache.crunch.io.At
-
Creates a SourceTarget<T>
instance from the Avro file(s) at the given path name.
- avroFile(Path, Class<T>) - Static method in class org.apache.crunch.io.At
-
Creates a SourceTarget<T>
instance from the Avro file(s) at the given Path
.
- avroFile(String) - Static method in class org.apache.crunch.io.At
-
Creates a SourceTarget<GenericData.Record>
by reading the schema of the Avro file
at the given path.
- avroFile(Path) - Static method in class org.apache.crunch.io.At
-
Creates a SourceTarget<GenericData.Record>
by reading the schema of the Avro file
at the given path.
- avroFile(Path, Configuration) - Static method in class org.apache.crunch.io.At
-
Creates a SourceTarget<GenericData.Record>
by reading the schema of the Avro file
at the given path using the FileSystem
information contained in the given
Configuration
instance.
- avroFile(String, PType<T>) - Static method in class org.apache.crunch.io.At
-
Creates a SourceTarget<T>
instance from the Avro file(s) at the given path name.
- avroFile(Path, PType<T>) - Static method in class org.apache.crunch.io.At
-
Creates a SourceTarget<T>
instance from the Avro file(s) at the given Path
.
- avroFile(String, Class<T>) - Static method in class org.apache.crunch.io.From
-
Creates a Source<T>
instance from the Avro file(s) at the given path name.
- avroFile(Path, Class<T>) - Static method in class org.apache.crunch.io.From
-
Creates a Source<T>
instance from the Avro file(s) at the given Path
.
- avroFile(List<Path>, Class<T>) - Static method in class org.apache.crunch.io.From
-
Creates a Source<T>
instance from the Avro file(s) at the given Paths.
- avroFile(String, PType<T>) - Static method in class org.apache.crunch.io.From
-
Creates a Source<T>
instance from the Avro file(s) at the given path name.
- avroFile(Path, PType<T>) - Static method in class org.apache.crunch.io.From
-
Creates a Source<T>
instance from the Avro file(s) at the given Path
.
- avroFile(List<Path>, PType<T>) - Static method in class org.apache.crunch.io.From
-
Creates a Source<T>
instance from the Avro file(s) at the given Paths.
- avroFile(String) - Static method in class org.apache.crunch.io.From
-
Creates a Source<GenericData.Record>
by reading the schema of the Avro file
at the given path.
- avroFile(Path) - Static method in class org.apache.crunch.io.From
-
Creates a Source<GenericData.Record>
by reading the schema of the Avro file
at the given path.
- avroFile(List<Path>) - Static method in class org.apache.crunch.io.From
-
Creates a Source<GenericData.Record>
by reading the schema of the Avro file
at the given paths.
- avroFile(Path, Configuration) - Static method in class org.apache.crunch.io.From
-
Creates a Source<GenericData.Record>
by reading the schema of the Avro file
at the given path using the FileSystem
information contained in the given
Configuration
instance.
- avroFile(List<Path>, Configuration) - Static method in class org.apache.crunch.io.From
-
Creates a Source<GenericData.Record>
by reading the schema of the Avro file
at the given paths using the FileSystem
information contained in the given
Configuration
instance.
- avroFile(String) - Static method in class org.apache.crunch.io.To
-
Creates a Target
at the given path name that writes data to
Avro files.
- avroFile(Path) - Static method in class org.apache.crunch.io.To
-
Creates a Target
at the given Path
that writes data to
Avro files.
- AvroGenericFn(int[], Schema) - Constructor for class org.apache.crunch.lib.sort.SortFns.AvroGenericFn
-
- AvroIndexedRecordPartitioner() - Constructor for class org.apache.crunch.lib.join.JoinUtils.AvroIndexedRecordPartitioner
-
- AvroInputFormat<T> - Class in org.apache.crunch.types.avro
-
An InputFormat
for Avro data files.
- AvroInputFormat() - Constructor for class org.apache.crunch.types.avro.AvroInputFormat
-
- AvroMode - Class in org.apache.crunch.types.avro
-
AvroMode is an immutable object used for configuring the reading and writing of Avro types.
- AvroMode.ModeType - Enum in org.apache.crunch.types.avro
-
Internal enum which represents the various Avro data types.
- AvroOutputFormat<T> - Class in org.apache.crunch.types.avro
-
An OutputFormat
for Avro data files.
- AvroOutputFormat() - Constructor for class org.apache.crunch.types.avro.AvroOutputFormat
-
- AvroPairGroupingComparator() - Constructor for class org.apache.crunch.lib.join.JoinUtils.AvroPairGroupingComparator
-
- AvroPathPerKeyOutputFormat<T> - Class in org.apache.crunch.types.avro
-
A FileOutputFormat
that takes in a Utf8
and an Avro record and writes the Avro records to
a sub-directory of the output path whose name is equal to the string-form of the Utf8
.
- AvroPathPerKeyOutputFormat() - Constructor for class org.apache.crunch.types.avro.AvroPathPerKeyOutputFormat
-
- Avros - Class in org.apache.crunch.types.avro
-
Defines static methods that are analogous to the methods defined in
AvroTypeFamily
for convenient static importing.
- AvroSerDe<T> - Class in org.apache.crunch.impl.spark.serde
-
- AvroSerDe(AvroType<T>, Map<String, String>) - Constructor for class org.apache.crunch.impl.spark.serde.AvroSerDe
-
- avroTableFile(Path, PTableType<K, V>) - Static method in class org.apache.crunch.io.From
-
Creates a TableSource<K,V>
for reading an Avro key/value file at the given path.
- avroTableFile(List<Path>, PTableType<K, V>) - Static method in class org.apache.crunch.io.From
-
Creates a TableSource<K,V>
for reading an Avro key/value file at the given paths.
- AvroTextOutputFormat<K,V> - Class in org.apache.crunch.types.avro
-
- AvroTextOutputFormat() - Constructor for class org.apache.crunch.types.avro.AvroTextOutputFormat
-
- AvroType<T> - Class in org.apache.crunch.types.avro
-
The implementation of the PType interface for Avro-based serialization.
- AvroType(Class<T>, Schema, DeepCopier<T>, PType...) - Constructor for class org.apache.crunch.types.avro.AvroType
-
- AvroType(Class<T>, Schema, MapFn, MapFn, DeepCopier<T>, AvroType.AvroRecordType, PType...) - Constructor for class org.apache.crunch.types.avro.AvroType
-
- AvroType.AvroRecordType - Enum in org.apache.crunch.types.avro
-
- AvroTypeFamily - Class in org.apache.crunch.types.avro
-
- AvroUtf8InputFormat - Class in org.apache.crunch.types.avro
-
An InputFormat
for text files.
- AvroUtf8InputFormat() - Constructor for class org.apache.crunch.types.avro.AvroUtf8InputFormat
-
- cache() - Method in class org.apache.crunch.impl.dist.collect.PCollectionImpl
-
- cache(CachingOptions) - Method in class org.apache.crunch.impl.dist.collect.PCollectionImpl
-
- cache() - Method in class org.apache.crunch.impl.dist.collect.PTableBase
-
- cache(CachingOptions) - Method in class org.apache.crunch.impl.dist.collect.PTableBase
-
- cache(PCollection<T>, CachingOptions) - Method in class org.apache.crunch.impl.mem.MemPipeline
-
- cache(PCollection<T>, CachingOptions) - Method in class org.apache.crunch.impl.mr.MRPipeline
-
- cache(PCollection<T>, CachingOptions) - Method in class org.apache.crunch.impl.spark.SparkPipeline
-
- cache() - Method in interface org.apache.crunch.lambda.LCollection
-
- cache(CachingOptions) - Method in interface org.apache.crunch.lambda.LCollection
-
- cache() - Method in interface org.apache.crunch.PCollection
-
- cache(CachingOptions) - Method in interface org.apache.crunch.PCollection
-
Marks this data as cached using the given CachingOptions
.
- cache(PCollection<T>, CachingOptions) - Method in interface org.apache.crunch.Pipeline
-
Caches the given PCollection so that it will be processed at most once
during pipeline execution.
- cache() - Method in interface org.apache.crunch.PTable
-
- cache(CachingOptions) - Method in interface org.apache.crunch.PTable
-
- CachingOptions - Class in org.apache.crunch
-
Options for controlling how a PCollection<T>
is cached for subsequent processing.
- CachingOptions.Builder - Class in org.apache.crunch
-
A Builder class to use for setting the CachingOptions for a PCollection.
- call(Tuple2<IntByteArray, List<byte[]>>) - Method in class org.apache.crunch.impl.spark.collect.ToByteArrayFunction
-
- call(Iterator<Tuple2<K, V>>) - Method in class org.apache.crunch.impl.spark.fn.CombineMapsideFunction
-
- call(Iterator<Pair<K, V>>) - Method in class org.apache.crunch.impl.spark.fn.CrunchPairTuple2
-
- call(Integer, Iterator) - Method in class org.apache.crunch.impl.spark.fn.FlatMapIndexFn
-
- call(Iterator<Tuple2<K, V>>) - Method in class org.apache.crunch.impl.spark.fn.FlatMapPairDoFn
-
- call(Tuple2<K, V>) - Method in class org.apache.crunch.impl.spark.fn.InputConverterFunction
-
- call(Object) - Method in class org.apache.crunch.impl.spark.fn.MapFunction
-
- call(Pair<K, V>) - Method in class org.apache.crunch.impl.spark.fn.MapOutputFunction
-
- call(S) - Method in class org.apache.crunch.impl.spark.fn.OutputConverterFunction
-
- call(Iterator<T>) - Method in class org.apache.crunch.impl.spark.fn.PairFlatMapDoFn
-
- call(Tuple2<K, V>) - Method in class org.apache.crunch.impl.spark.fn.PairMapFunction
-
- call(Pair<K, List<V>>) - Method in class org.apache.crunch.impl.spark.fn.PairMapIterableFunction
-
- call(Pair<K, V>) - Method in class org.apache.crunch.impl.spark.fn.PartitionedMapOutputFunction
-
- call(Iterator<Tuple2<ByteArray, List<byte[]>>>) - Method in class org.apache.crunch.impl.spark.fn.ReduceGroupingFunction
-
- call(Tuple2<ByteArray, Iterable<byte[]>>) - Method in class org.apache.crunch.impl.spark.fn.ReduceInputFunction
-
- call(Pair<K, V>) - Method in class org.apache.crunch.impl.spark.fn.Tuple2MapFunction
-
- CAN_COMBINE_SPECIFIC_AND_REFLECT_SCHEMAS - Static variable in class org.apache.crunch.types.avro.Avros
-
Older versions of Avro (i.e., before 1.7.0) do not support schemas that are
composed of a mix of specific and reflection-based schemas.
- Cartesian - Class in org.apache.crunch.lib
-
Utilities for Cartesian products of two PTable
or PCollection
instances.
- Cartesian() - Constructor for class org.apache.crunch.lib.Cartesian
-
- Channels - Class in org.apache.crunch.lib
-
- Channels() - Constructor for class org.apache.crunch.lib.Channels
-
- checkCombiningSpecificAndReflectionSchemas() - Static method in class org.apache.crunch.types.avro.Avros
-
- checkOutputSpecs(JobContext) - Static method in class org.apache.crunch.io.CrunchOutputs
-
- ClassloaderFallbackObjectInputStream - Class in org.apache.crunch.util
-
- ClassloaderFallbackObjectInputStream(InputStream) - Constructor for class org.apache.crunch.util.ClassloaderFallbackObjectInputStream
-
- cleanup(Emitter<Pair<String, BloomFilter>>) - Method in class org.apache.crunch.contrib.bloomfilter.BloomFilterFn
-
- cleanup(Emitter<T>) - Method in class org.apache.crunch.DoFn
-
Called during the cleanup of the MapReduce job this DoFn
is
associated with.
- cleanup(Emitter<T>) - Method in class org.apache.crunch.FilterFn
-
- cleanup() - Method in class org.apache.crunch.FilterFn
-
Called during the cleanup of the MapReduce job this FilterFn
is
associated with.
- cleanup(Emitter<T>) - Method in class org.apache.crunch.fn.CompositeMapFn
-
- cleanup(Emitter<Pair<S, T>>) - Method in class org.apache.crunch.fn.PairMapFn
-
- cleanup(boolean) - Method in class org.apache.crunch.impl.dist.DistributedPipeline
-
- cleanup(boolean) - Method in class org.apache.crunch.impl.mem.MemPipeline
-
- cleanup(Emitter<Pair<Integer, Pair<K, V>>>) - Method in class org.apache.crunch.lib.Aggregate.TopKFn
-
- cleanup(Emitter<Pair<K, Pair<U, V>>>) - Method in class org.apache.crunch.lib.join.FullOuterJoinFn
-
Called during the cleanup of the MapReduce job this DoFn
is
associated with.
- cleanup(Emitter<Pair<K, Pair<U, V>>>) - Method in class org.apache.crunch.lib.join.LeftOuterJoinFn
-
Called during the cleanup of the MapReduce job this DoFn
is
associated with.
- cleanup(boolean) - Method in interface org.apache.crunch.Pipeline
-
Cleans up any artifacts created as a result of running the pipeline.
- clear() - Method in class org.apache.crunch.types.writable.TupleWritable
-
- clearAge() - Method in class org.apache.crunch.test.Person.Builder
-
Clears the value of the 'age' field
- clearCounters() - Static method in class org.apache.crunch.impl.mem.MemPipeline
-
- clearCounters() - Static method in class org.apache.crunch.test.TestCounters
-
- clearDepartment() - Method in class org.apache.crunch.test.Employee.Builder
-
Clears the value of the 'department' field
- clearName() - Method in class org.apache.crunch.test.Employee.Builder
-
Clears the value of the 'name' field
- clearName() - Method in class org.apache.crunch.test.Person.Builder
-
Clears the value of the 'name' field
- clearSalary() - Method in class org.apache.crunch.test.Employee.Builder
-
Clears the value of the 'salary' field
- clearSiblingnames() - Method in class org.apache.crunch.test.Person.Builder
-
Clears the value of the 'siblingnames' field
- close() - Method in class org.apache.crunch.io.CrunchOutputs
-
- cogroup(PTable<K, U>) - Method in class org.apache.crunch.impl.dist.collect.PTableBase
-
- cogroup(LTable<K, U>) - Method in interface org.apache.crunch.lambda.LTable
-
Cogroup this table with another LTable with the same key type, collecting the set of values from each side.
- Cogroup - Class in org.apache.crunch.lib
-
- Cogroup() - Constructor for class org.apache.crunch.lib.Cogroup
-
- cogroup(PTable<K, U>, PTable<K, V>) - Static method in class org.apache.crunch.lib.Cogroup
-
Co-groups the two PTable arguments.
- cogroup(int, PTable<K, U>, PTable<K, V>) - Static method in class org.apache.crunch.lib.Cogroup
-
Co-groups the two PTable arguments with a user-specified degree of parallelism (a.k.a. the number of reducers).
- cogroup(PTable<K, V1>, PTable<K, V2>, PTable<K, V3>) - Static method in class org.apache.crunch.lib.Cogroup
-
Co-groups the three PTable arguments.
- cogroup(int, PTable<K, V1>, PTable<K, V2>, PTable<K, V3>) - Static method in class org.apache.crunch.lib.Cogroup
-
Co-groups the three PTable arguments with a user-specified degree of parallelism (a.k.a. the number of reducers).
- cogroup(PTable<K, V1>, PTable<K, V2>, PTable<K, V3>, PTable<K, V4>) - Static method in class org.apache.crunch.lib.Cogroup
-
Co-groups the four PTable arguments.
- cogroup(int, PTable<K, V1>, PTable<K, V2>, PTable<K, V3>, PTable<K, V4>) - Static method in class org.apache.crunch.lib.Cogroup
-
Co-groups the four PTable arguments with a user-specified degree of parallelism (a.k.a. the number of reducers).
- cogroup(PTable<K, ?>, PTable<K, ?>...) - Static method in class org.apache.crunch.lib.Cogroup
-
Co-groups an arbitrary number of PTable arguments.
- cogroup(int, PTable<K, ?>, PTable<K, ?>...) - Static method in class org.apache.crunch.lib.Cogroup
-
Co-groups an arbitrary number of PTable arguments with a user-specified degree of parallelism
(a.k.a. the number of reducers). The largest table should come last in the ordering.
- cogroup(PTable<K, U>) - Method in interface org.apache.crunch.PTable
-
Co-group operation with the given table on common keys.
- Collect(Collection<V1>, Collection<V2>, Collection<V3>) - Constructor for class org.apache.crunch.Tuple3.Collect
-
- Collect(Collection<V1>, Collection<V2>, Collection<V3>, Collection<V4>) - Constructor for class org.apache.crunch.Tuple4.Collect
-
- collectAllValues() - Method in interface org.apache.crunch.lambda.LGroupedTable
-
- CollectionDeepCopier<T> - Class in org.apache.crunch.types
-
Performs deep copies (based on underlying PType deep copying) of Collections.
- CollectionDeepCopier(PType<T>) - Constructor for class org.apache.crunch.types.CollectionDeepCopier
-
- collectionOf(T...) - Static method in class org.apache.crunch.impl.mem.MemPipeline
-
- collectionOf(Iterable<T>) - Static method in class org.apache.crunch.impl.mem.MemPipeline
-
- collections(PType<T>) - Static method in class org.apache.crunch.types.avro.Avros
-
- collections(PType<T>) - Method in class org.apache.crunch.types.avro.AvroTypeFamily
-
- collections(PType<T>) - Method in interface org.apache.crunch.types.PTypeFamily
-
- collections(PType<T>) - Static method in class org.apache.crunch.types.writable.Writables
-
- collections(PType<T>) - Method in class org.apache.crunch.types.writable.WritableTypeFamily
-
- collectUniqueValues() - Method in interface org.apache.crunch.lambda.LGroupedTable
-
Collect all unique values for each key into a Collection (note that the value type must have
correctly defined equals() and hashCode() methods).
- collectValues() - Method in class org.apache.crunch.impl.dist.collect.PTableBase
-
- collectValues(SSupplier<C>, SBiConsumer<C, V>, PType<C>) - Method in interface org.apache.crunch.lambda.LGroupedTable
-
Collect the values into an aggregate type.
- collectValues(PTable<K, V>) - Static method in class org.apache.crunch.lib.Aggregate
-
- collectValues() - Method in interface org.apache.crunch.PTable
-
Aggregate all of the values with the same key into a single key-value pair
in the returned PTable.
- column() - Method in class org.apache.crunch.lib.Sort.ColumnOrder
-
- ColumnOrder(int, Sort.Order) - Constructor for class org.apache.crunch.lib.Sort.ColumnOrder
-
- CombineFn<S,T> - Class in org.apache.crunch
-
A special DoFn implementation that converts an Iterable of values into a single value.
- CombineFn() - Constructor for class org.apache.crunch.CombineFn
-
- CombineMapsideFunction<K,V> - Class in org.apache.crunch.impl.spark.fn
-
- CombineMapsideFunction(CombineFn<K, V>, SparkRuntimeContext) - Constructor for class org.apache.crunch.impl.spark.fn.CombineMapsideFunction
-
- combineValues(CombineFn<K, V>, CombineFn<K, V>) - Method in class org.apache.crunch.impl.dist.collect.BaseGroupedTable
-
- combineValues(CombineFn<K, V>) - Method in class org.apache.crunch.impl.dist.collect.BaseGroupedTable
-
- combineValues(Aggregator<V>) - Method in class org.apache.crunch.impl.dist.collect.BaseGroupedTable
-
- combineValues(Aggregator<V>, Aggregator<V>) - Method in class org.apache.crunch.impl.dist.collect.BaseGroupedTable
-
- combineValues(Aggregator<V>) - Method in interface org.apache.crunch.lambda.LGroupedTable
-
Combine the value part of the table using the provided Crunch Aggregator.
- combineValues(SSupplier<A>, SBiFunction<A, V, A>, SFunction<A, Iterable<V>>) - Method in interface org.apache.crunch.lambda.LGroupedTable
-
Combine the value part of the table using the given functions.
- combineValues(CombineFn<K, V>) - Method in interface org.apache.crunch.PGroupedTable
-
Combines the values of this grouping using the given CombineFn
.
- combineValues(CombineFn<K, V>, CombineFn<K, V>) - Method in interface org.apache.crunch.PGroupedTable
-
Combines and reduces the values of this grouping using the given CombineFn
instances.
- combineValues(Aggregator<V>) - Method in interface org.apache.crunch.PGroupedTable
-
Combine the values in each group using the given Aggregator.
- combineValues(Aggregator<V>, Aggregator<V>) - Method in interface org.apache.crunch.PGroupedTable
-
Combines and reduces the values in each group using the given Aggregator instances.
- comm(PCollection<T>, PCollection<T>) - Static method in class org.apache.crunch.lib.Set
-
Find the elements that are common to two sets, like the Unix comm utility.
- Comparator() - Constructor for class org.apache.crunch.types.writable.TupleWritable.Comparator
-
- compare(ByteArray, ByteArray) - Method in class org.apache.crunch.impl.spark.SparkComparator
-
- compare(Pair<K, V>, Pair<K, V>) - Method in class org.apache.crunch.lib.Aggregate.PairValueComparator
-
- compare(AvroWrapper<T>, AvroWrapper<T>) - Method in class org.apache.crunch.lib.join.JoinUtils.AvroPairGroupingComparator
-
- compare(byte[], int, int, byte[], int, int) - Method in class org.apache.crunch.lib.join.JoinUtils.AvroPairGroupingComparator
-
- compare(TupleWritable, TupleWritable) - Method in class org.apache.crunch.lib.join.JoinUtils.TupleWritableComparator
-
- compare(byte[], int, int, byte[], int, int) - Method in class org.apache.crunch.lib.join.JoinUtils.TupleWritableComparator
-
- compare(AvroKey<T>, AvroKey<T>) - Method in class org.apache.crunch.lib.sort.ReverseAvroComparator
-
- compare(byte[], int, int, byte[], int, int) - Method in class org.apache.crunch.lib.sort.ReverseAvroComparator
-
- compare(byte[], int, int, byte[], int, int) - Method in class org.apache.crunch.lib.sort.ReverseWritableComparator
-
- compare(T, T) - Method in class org.apache.crunch.lib.sort.ReverseWritableComparator
-
- compare(WritableComparable, WritableComparable) - Method in class org.apache.crunch.lib.sort.TupleWritableComparator
-
- compare(byte[], int, int, byte[], int, int) - Method in class org.apache.crunch.types.writable.TupleWritable.Comparator
-
- compare(WritableComparable, WritableComparable) - Method in class org.apache.crunch.types.writable.TupleWritable.Comparator
-
- compareTo(ByteArray) - Method in class org.apache.crunch.impl.spark.ByteArray
-
- compareTo(Pair<K, V>) - Method in class org.apache.crunch.Pair
-
- compareTo(TupleWritable) - Method in class org.apache.crunch.types.writable.TupleWritable
-
- compareTo(UnionWritable) - Method in class org.apache.crunch.types.writable.UnionWritable
-
- CompositeMapFn<R,S,T> - Class in org.apache.crunch.fn
-
- CompositeMapFn(MapFn<R, S>, MapFn<S, T>) - Constructor for class org.apache.crunch.fn.CompositeMapFn
-
- CompositePathIterable<T> - Class in org.apache.crunch.io
-
- Compress - Class in org.apache.crunch.io
-
Helper functions for compressing output data.
- Compress() - Constructor for class org.apache.crunch.io.Compress
-
- compress(T, Class<? extends CompressionCodec>) - Static method in class org.apache.crunch.io.Compress
-
Configure the given output target to be compressed using the given codec.
- conf(String, String) - Method in class org.apache.crunch.GroupingOptions.Builder
-
- conf(String, String) - Method in class org.apache.crunch.ParallelDoOptions.Builder
-
Specifies key-value pairs that should be added to the Configuration
object associated with the
Job
that includes these options.
- conf(String, String) - Method in interface org.apache.crunch.SourceTarget
-
Adds the given key-value pair to the Configuration
instance(s) that are used to
read and write this SourceTarget<T>
.
- configure(Configuration) - Method in class org.apache.crunch.DoFn
-
Configure this DoFn.
- configure(Configuration) - Method in class org.apache.crunch.fn.CompositeMapFn
-
- configure(Configuration) - Method in class org.apache.crunch.fn.ExtractKeyFn
-
- configure(Configuration) - Method in class org.apache.crunch.fn.PairMapFn
-
- configure(Job) - Method in class org.apache.crunch.GroupingOptions
-
- configure(Configuration) - Method in class org.apache.crunch.io.FormatBundle
-
- configure(Target, PType<?>) - Method in interface org.apache.crunch.io.OutputHandler
-
- configure(Configuration) - Method in class org.apache.crunch.ParallelDoOptions
-
Applies the key-value pairs that were associated with this instance to the given Configuration
object.
- configure(Configuration) - Method in interface org.apache.crunch.ReadableData
-
Allows this instance to specify any additional configuration settings that may
be needed by the job that it is launched in.
- configure(FormatBundle) - Method in class org.apache.crunch.types.avro.AvroMode
-
Populates the bundle with mode specific settings for the specific FormatBundle.
- configure(Configuration) - Method in class org.apache.crunch.types.avro.AvroMode
-
Populates the conf
with mode specific settings.
- configure(Configuration) - Method in class org.apache.crunch.types.avro.AvroUtf8InputFormat
-
- configure(Configuration) - Method in class org.apache.crunch.types.PGroupedTableType.PairIterableMapFn
-
- configure(Configuration) - Method in class org.apache.crunch.util.DelegatingReadableData
-
- configure(Configuration) - Method in class org.apache.crunch.util.UnionReadableData
-
- configureFactory(Configuration) - Method in class org.apache.crunch.types.avro.AvroMode
-
- configureForMapReduce(Job, PType<?>, Path, String) - Method in interface org.apache.crunch.io.MapReduceTarget
-
- configureOrdering(Configuration, WritableType[], Sort.ColumnOrder[]) - Static method in class org.apache.crunch.lib.sort.TupleWritableComparator
-
- configureReflectDataFactory(Configuration) - Static method in class org.apache.crunch.types.avro.Avros
-
- configureShuffle(Configuration) - Method in class org.apache.crunch.types.avro.AvroMode
-
Populates the conf
with mode specific settings for use during the shuffle phase.
- configureShuffle(Job, GroupingOptions) - Method in class org.apache.crunch.types.PGroupedTableType
-
- configureSource(Job, int) - Method in interface org.apache.crunch.Source
-
Configure the given job to use this source as an input.
- containers(Class<T>) - Static method in class org.apache.crunch.types.avro.Avros
-
- containers(Class<T>) - Method in class org.apache.crunch.types.avro.AvroTypeFamily
-
- convert(Object, ObjectInspector, ObjectInspector) - Static method in class org.apache.crunch.types.orc.OrcUtils
-
Convert an object from / to OrcStruct
- convert(PType<T>, PTypeFamily) - Static method in class org.apache.crunch.types.PTypeUtils
-
- Converter<K,V,S,T> - Interface in org.apache.crunch.types
-
Converts the input key/value from a MapReduce task into the input to a DoFn, or takes the output of a DoFn and writes it to the output key/values.
- convertInput(K, V) - Method in interface org.apache.crunch.types.Converter
-
- convertIterableInput(K, Iterable<V>) - Method in interface org.apache.crunch.types.Converter
-
- copyResourceFile(String) - Method in class org.apache.crunch.test.TemporaryPath
-
Copy a classpath resource to a File.
- copyResourceFileName(String) - Method in class org.apache.crunch.test.TemporaryPath
-
Copy a classpath resource returning its absolute file name.
- copyResourcePath(String) - Method in class org.apache.crunch.test.TemporaryPath
-
Copy a classpath resource to a Path.
- count() - Method in class org.apache.crunch.impl.dist.collect.PCollectionImpl
-
- count() - Method in interface org.apache.crunch.lambda.LCollection
-
Count distinct values in this LCollection, yielding an LTable mapping each value to the number of occurrences in the collection.
- count(PCollection<S>) - Static method in class org.apache.crunch.lib.Aggregate
-
Returns a PTable that contains the unique elements of this collection mapped to a count of their occurrences.
- count(PCollection<S>, int) - Static method in class org.apache.crunch.lib.Aggregate
-
Returns a PTable that contains the unique elements of this collection mapped to a count of their occurrences.
- count - Variable in class org.apache.crunch.lib.Quantiles.Result
-
- count() - Method in interface org.apache.crunch.PCollection
-
Returns a PTable instance that contains the counts of each unique element of this PCollection.
- countClause - Variable in class org.apache.crunch.contrib.io.jdbc.DataBaseSource.Builder
-
- CounterAccumulatorParam - Class in org.apache.crunch.impl.spark
-
- CounterAccumulatorParam() - Constructor for class org.apache.crunch.impl.spark.CounterAccumulatorParam
-
- create(String) - Method in class org.apache.crunch.contrib.text.TokenizerFactory
-
Return a Scanner instance that wraps the input string and uses the delimiter, skip, and locale settings for this TokenizerFactory instance.
- create(Iterable<S>, PType<S>) - Method in class org.apache.crunch.impl.dist.DistributedPipeline
-
- create(Iterable<S>, PType<S>, CreateOptions) - Method in class org.apache.crunch.impl.dist.DistributedPipeline
-
- create(Iterable<Pair<K, V>>, PTableType<K, V>) - Method in class org.apache.crunch.impl.dist.DistributedPipeline
-
- create(Iterable<Pair<K, V>>, PTableType<K, V>, CreateOptions) - Method in class org.apache.crunch.impl.dist.DistributedPipeline
-
- create(Iterable<T>, PType<T>) - Method in class org.apache.crunch.impl.mem.MemPipeline
-
- create(Iterable<T>, PType<T>, CreateOptions) - Method in class org.apache.crunch.impl.mem.MemPipeline
-
- create(Iterable<Pair<K, V>>, PTableType<K, V>) - Method in class org.apache.crunch.impl.mem.MemPipeline
-
- create(Iterable<Pair<K, V>>, PTableType<K, V>, CreateOptions) - Method in class org.apache.crunch.impl.mem.MemPipeline
-
- create(PType<?>, Configuration) - Static method in class org.apache.crunch.impl.spark.serde.SerDeFactory
-
- create(Iterable<S>, PType<S>, CreateOptions) - Method in class org.apache.crunch.impl.spark.SparkPipeline
-
- create(Iterable<Pair<K, V>>, PTableType<K, V>, CreateOptions) - Method in class org.apache.crunch.impl.spark.SparkPipeline
-
- create(FileSystem, Path, FileReaderFactory<S>) - Static method in class org.apache.crunch.io.CompositePathIterable
-
- create() - Static method in class org.apache.crunch.lib.join.MapsideJoinStrategy
-
Create a new MapsideJoinStrategy instance that will load its left-side table into memory, and will materialize the contents of the left-side table to disk before running the in-memory join.
- create(boolean) - Static method in class org.apache.crunch.lib.join.MapsideJoinStrategy
-
Create a new MapsideJoinStrategy instance that will load its left-side table into memory.
- create(Iterable<T>, PType<T>) - Method in interface org.apache.crunch.Pipeline
-
Creates a PCollection containing the values found in the given Iterable using an implementation-specific distribution mechanism.
- create(Iterable<T>, PType<T>, CreateOptions) - Method in interface org.apache.crunch.Pipeline
-
Creates a PCollection containing the values found in the given Iterable using an implementation-specific distribution mechanism.
- create(Iterable<Pair<K, V>>, PTableType<K, V>) - Method in interface org.apache.crunch.Pipeline
-
Creates a PTable containing the values found in the given Iterable using an implementation-specific distribution mechanism.
- create(Iterable<Pair<K, V>>, PTableType<K, V>, CreateOptions) - Method in interface org.apache.crunch.Pipeline
-
Creates a PTable containing the values found in the given Iterable using an implementation-specific distribution mechanism.
- create() - Method in class org.apache.crunch.test.TemporaryPath
-
- create() - Static method in class org.apache.crunch.types.NoOpDeepCopier
-
Static factory method.
- create(Object...) - Method in class org.apache.crunch.types.orc.TupleObjectInspector
-
- create(Class<T>, Class...) - Static method in class org.apache.crunch.types.TupleFactory
-
- createBinarySerde(TypeInfo) - Static method in class org.apache.crunch.types.orc.OrcUtils
-
Create a binary serde for OrcStruct serialization/deserialization
- CreatedCollection<T> - Class in org.apache.crunch.impl.spark.collect
-
Represents a Spark-based PCollection that was created from a Java Iterable of values.
- CreatedCollection(SparkPipeline, Iterable<T>, PType<T>, CreateOptions) - Constructor for class org.apache.crunch.impl.spark.collect.CreatedCollection
-
- createDoCollection(String, PCollectionImpl<S>, DoFn<S, T>, PType<T>, ParallelDoOptions) - Method in interface org.apache.crunch.impl.dist.collect.PCollectionFactory
-
- createDoCollection(String, PCollectionImpl<S>, DoFn<S, T>, PType<T>, ParallelDoOptions) - Method in class org.apache.crunch.impl.spark.collect.SparkCollectFactory
-
- createDoNode() - Method in interface org.apache.crunch.impl.dist.collect.MRCollection
-
- createDoTable(String, PCollectionImpl<S>, DoFn<S, Pair<K, V>>, PTableType<K, V>, ParallelDoOptions) - Method in interface org.apache.crunch.impl.dist.collect.PCollectionFactory
-
- createDoTable(String, PCollectionImpl<S>, CombineFn<K, V>, DoFn<S, Pair<K, V>>, PTableType<K, V>) - Method in interface org.apache.crunch.impl.dist.collect.PCollectionFactory
-
- createDoTable(String, PCollectionImpl<S>, DoFn<S, Pair<K, V>>, PTableType<K, V>, ParallelDoOptions) - Method in class org.apache.crunch.impl.spark.collect.SparkCollectFactory
-
- createDoTable(String, PCollectionImpl<S>, CombineFn<K, V>, DoFn<S, Pair<K, V>>, PTableType<K, V>) - Method in class org.apache.crunch.impl.spark.collect.SparkCollectFactory
-
- CreatedTable<K,V> - Class in org.apache.crunch.impl.spark.collect
-
Represents a Spark-based PTable that was created from a Java Iterable of key-value pairs.
- CreatedTable(SparkPipeline, Iterable<Pair<K, V>>, PTableType<K, V>, CreateOptions) - Constructor for class org.apache.crunch.impl.spark.collect.CreatedTable
-
- createFilter(Path, BloomFilterFn<String>) - Static method in class org.apache.crunch.contrib.bloomfilter.BloomFilterFactory
-
Takes an input path and generates BloomFilters for all text files in that path.
- createFilter(PCollection<T>, BloomFilterFn<T>) - Static method in class org.apache.crunch.contrib.bloomfilter.BloomFilterFactory
-
- createGroupedTable(PTableBase<K, V>, GroupingOptions) - Method in interface org.apache.crunch.impl.dist.collect.PCollectionFactory
-
- createGroupedTable(PTableBase<K, V>, GroupingOptions) - Method in class org.apache.crunch.impl.spark.collect.SparkCollectFactory
-
- createInputCollection(Source<S>, String, DistributedPipeline, ParallelDoOptions) - Method in interface org.apache.crunch.impl.dist.collect.PCollectionFactory
-
- createInputCollection(Source<S>, String, DistributedPipeline, ParallelDoOptions) - Method in class org.apache.crunch.impl.spark.collect.SparkCollectFactory
-
- createInputTable(TableSource<K, V>, String, DistributedPipeline, ParallelDoOptions) - Method in interface org.apache.crunch.impl.dist.collect.PCollectionFactory
-
- createInputTable(TableSource<K, V>, String, DistributedPipeline, ParallelDoOptions) - Method in class org.apache.crunch.impl.spark.collect.SparkCollectFactory
-
- createIntermediateOutput(PType<T>) - Method in class org.apache.crunch.impl.dist.DistributedPipeline
-
- CreateOptions - Class in org.apache.crunch
-
- createOrcStruct(TypeInfo, Object...) - Static method in class org.apache.crunch.types.orc.OrcUtils
-
Create an object of OrcStruct given a type string and a list of objects
- createOrderedTupleSchema(PType<S>, Sort.ColumnOrder[]) - Static method in class org.apache.crunch.lib.sort.SortFns
-
Constructs an Avro schema for the given PType<S> that respects the given column orderings.
- createPut(PTable<String, String>) - Method in class org.apache.crunch.examples.WordAggregationHBase
-
Create puts in order to insert them in HBase.
- createRecordReader(InputSplit, TaskAttemptContext) - Method in class org.apache.crunch.types.avro.AvroInputFormat
-
- createRecordReader(InputSplit, TaskAttemptContext) - Method in class org.apache.crunch.types.avro.AvroUtf8InputFormat
-
- createSourceTarget(Configuration, Path, Iterable<T>, int) - Method in class org.apache.crunch.types.avro.AvroType
-
- createSourceTarget(Configuration, Path, Iterable<T>, int) - Method in interface org.apache.crunch.types.PType
-
Returns a ReadableSource that contains the data in the given Iterable.
- createSourceTarget(Configuration, Path, Iterable<T>, int) - Method in class org.apache.crunch.types.writable.WritableType
-
- createTempPath() - Method in class org.apache.crunch.impl.dist.DistributedPipeline
-
- createUnionCollection(List<? extends PCollectionImpl<S>>) - Method in interface org.apache.crunch.impl.dist.collect.PCollectionFactory
-
- createUnionCollection(List<? extends PCollectionImpl<S>>) - Method in class org.apache.crunch.impl.spark.collect.SparkCollectFactory
-
- createUnionTable(List<PTableBase<K, V>>) - Method in interface org.apache.crunch.impl.dist.collect.PCollectionFactory
-
- createUnionTable(List<PTableBase<K, V>>) - Method in class org.apache.crunch.impl.spark.collect.SparkCollectFactory
-
- cross(PTable<K1, U>, PTable<K2, V>) - Static method in class org.apache.crunch.lib.Cartesian
-
Performs a full cross join on the specified PTables (using the same strategy as Pig's CROSS operator).
- cross(PTable<K1, U>, PTable<K2, V>, int) - Static method in class org.apache.crunch.lib.Cartesian
-
Performs a full cross join on the specified PTables (using the same strategy as Pig's CROSS operator).
- cross(PCollection<U>, PCollection<V>) - Static method in class org.apache.crunch.lib.Cartesian
-
Performs a full cross join on the specified PCollections (using the same strategy as Pig's CROSS operator).
- cross(PCollection<U>, PCollection<V>, int) - Static method in class org.apache.crunch.lib.Cartesian
-
Performs a full cross join on the specified PCollections (using the same strategy as Pig's CROSS operator).
- CRUNCH_DISABLE_OUTPUT_COUNTERS - Static variable in class org.apache.crunch.io.CrunchOutputs
-
- CRUNCH_FILTER_NAME - Static variable in class org.apache.crunch.contrib.bloomfilter.BloomFilterFn
-
- CRUNCH_FILTER_SIZE - Static variable in class org.apache.crunch.contrib.bloomfilter.BloomFilterFn
-
- CRUNCH_INPUTS - Static variable in class org.apache.crunch.io.CrunchInputs
-
- CRUNCH_OUTPUTS - Static variable in class org.apache.crunch.io.CrunchOutputs
-
- CrunchInputs - Class in org.apache.crunch.io
-
Helper functions for configuring multiple InputFormat instances within a single Crunch MapReduce job.
- CrunchInputs() - Constructor for class org.apache.crunch.io.CrunchInputs
-
- CrunchIterable<S,T> - Class in org.apache.crunch.impl.spark.fn
-
- CrunchIterable(DoFn<S, T>, Iterator<S>) - Constructor for class org.apache.crunch.impl.spark.fn.CrunchIterable
-
- CrunchOutputs<K,V> - Class in org.apache.crunch.io
-
An analogue of CrunchInputs for handling multiple OutputFormat instances writing to multiple files within a single MapReduce job.
- CrunchOutputs(TaskInputOutputContext<?, ?, K, V>) - Constructor for class org.apache.crunch.io.CrunchOutputs
-
Creates and initializes multiple outputs support; it should be instantiated in the Mapper/Reducer setup method.
- CrunchOutputs(Configuration) - Constructor for class org.apache.crunch.io.CrunchOutputs
-
- CrunchOutputs.OutputConfig<K,V> - Class in org.apache.crunch.io
-
- CrunchPairTuple2<K,V> - Class in org.apache.crunch.impl.spark.fn
-
- CrunchPairTuple2() - Constructor for class org.apache.crunch.impl.spark.fn.CrunchPairTuple2
-
- CrunchRuntimeException - Exception in org.apache.crunch
-
A RuntimeException implementation that includes some additional options for the Crunch execution engine to track reporting status.
- CrunchRuntimeException(String) - Constructor for exception org.apache.crunch.CrunchRuntimeException
-
- CrunchRuntimeException(Exception) - Constructor for exception org.apache.crunch.CrunchRuntimeException
-
- CrunchRuntimeException(String, Exception) - Constructor for exception org.apache.crunch.CrunchRuntimeException
-
- CrunchTestSupport - Class in org.apache.crunch.test
-
A temporary workaround for Scala tests to use when working with Rule annotations until it gets fixed in JUnit 4.11.
- CrunchTestSupport() - Constructor for class org.apache.crunch.test.CrunchTestSupport
-
- CrunchTool - Class in org.apache.crunch.util
-
An extension of the Tool interface that creates a Pipeline instance and provides methods for working with the Pipeline from inside of the Tool's run method.
- CrunchTool() - Constructor for class org.apache.crunch.util.CrunchTool
-
- CrunchTool(boolean) - Constructor for class org.apache.crunch.util.CrunchTool
-
- DataBaseSource<T extends org.apache.hadoop.mapreduce.lib.db.DBWritable & org.apache.hadoop.io.Writable> - Class in org.apache.crunch.contrib.io.jdbc
-
Source for reading from a database via a JDBC connection.
- DataBaseSource.Builder<T extends org.apache.hadoop.mapreduce.lib.db.DBWritable & org.apache.hadoop.io.Writable> - Class in org.apache.crunch.contrib.io.jdbc
-
- DebugLogging - Class in org.apache.crunch.test
-
Allows direct manipulation of the Hadoop log4j settings to aid in
unit testing.
- DeepCopier<T> - Interface in org.apache.crunch.types
-
Performs deep copies of values.
- deepCopy(Object) - Method in class org.apache.crunch.types.avro.AvroDerivedValueDeepCopier
-
- deepCopy(Collection<T>) - Method in class org.apache.crunch.types.CollectionDeepCopier
-
- deepCopy(T) - Method in interface org.apache.crunch.types.DeepCopier
-
Create a deep copy of a value.
- deepCopy(Map<String, T>) - Method in class org.apache.crunch.types.MapDeepCopier
-
- deepCopy(T) - Method in class org.apache.crunch.types.NoOpDeepCopier
-
- deepCopy(T) - Method in class org.apache.crunch.types.TupleDeepCopier
-
- deepCopy(Union) - Method in class org.apache.crunch.types.UnionDeepCopier
-
- deepCopy(T) - Method in class org.apache.crunch.types.writable.WritableDeepCopier
-
- DEFAULT - Static variable in class org.apache.crunch.CachingOptions
-
An instance of CachingOptions with the default caching settings.
- DEFAULT_BYTES_PER_REDUCE_TASK - Static variable in class org.apache.crunch.util.PartitionUtils
-
- DEFAULT_MAX_REDUCERS - Static variable in class org.apache.crunch.util.PartitionUtils
-
- DEFAULT_PATH - Static variable in class org.apache.crunch.lib.sort.TotalOrderPartitioner
-
- DefaultJoinStrategy<K,U,V> - Class in org.apache.crunch.lib.join
-
Default join strategy that simply sends all data through the map, shuffle, and reduce phase.
- DefaultJoinStrategy() - Constructor for class org.apache.crunch.lib.join.DefaultJoinStrategy
-
- DefaultJoinStrategy(int) - Constructor for class org.apache.crunch.lib.join.DefaultJoinStrategy
-
- DelegatingReadableData<S,T> - Class in org.apache.crunch.util
-
Implements the ReadableData<T> interface by delegating to a ReadableData<S> instance and passing its contents through a DoFn<S, T>.
- DelegatingReadableData(ReadableData<S>, DoFn<S, T>) - Constructor for class org.apache.crunch.util.DelegatingReadableData
-
- delete() - Method in class org.apache.crunch.test.TemporaryPath
-
- delimiter(String) - Method in class org.apache.crunch.contrib.text.TokenizerFactory.Builder
-
Sets the delimiter used by the TokenizerFactory instances constructed by this instance.
- department - Variable in class org.apache.crunch.test.Employee
-
Deprecated.
- dependsOn(String, Target) - Method in class org.apache.crunch.PipelineCallable
-
Requires that the given Target exists before this instance may be executed.
- dependsOn(String, PCollection<?>) - Method in class org.apache.crunch.PipelineCallable
-
Requires that the given PCollection be materialized to disk before this instance may be executed.
- derived(PType<V1>, PType<V2>, PType<V3>) - Static method in class org.apache.crunch.Tuple3.Collect
-
- derived(PType<V1>, PType<V2>, PType<V3>, PType<V4>) - Static method in class org.apache.crunch.Tuple4.Collect
-
- derived(Class<T>, MapFn<S, T>, MapFn<T, S>, PType<S>) - Static method in class org.apache.crunch.types.avro.Avros
-
- derived(Class<T>, MapFn<S, T>, MapFn<T, S>, PType<S>) - Method in class org.apache.crunch.types.avro.AvroTypeFamily
-
- derived(Class<T>, MapFn<S, T>, MapFn<T, S>, PType<S>) - Method in interface org.apache.crunch.types.PTypeFamily
-
- derived(Class<T>, MapFn<S, T>, MapFn<T, S>, PType<S>) - Static method in class org.apache.crunch.types.writable.Writables
-
- derived(Class<T>, MapFn<S, T>, MapFn<T, S>, PType<S>) - Method in class org.apache.crunch.types.writable.WritableTypeFamily
-
- derivedImmutable(Class<T>, MapFn<S, T>, MapFn<T, S>, PType<S>) - Static method in class org.apache.crunch.types.avro.Avros
-
- derivedImmutable(Class<T>, MapFn<S, T>, MapFn<T, S>, PType<S>) - Method in class org.apache.crunch.types.avro.AvroTypeFamily
-
- derivedImmutable(Class<T>, MapFn<S, T>, MapFn<T, S>, PType<S>) - Method in interface org.apache.crunch.types.PTypeFamily
-
A derived type whose values are immutable.
- derivedImmutable(Class<T>, MapFn<S, T>, MapFn<T, S>, PType<S>) - Static method in class org.apache.crunch.types.writable.Writables
-
- derivedImmutable(Class<T>, MapFn<S, T>, MapFn<T, S>, PType<S>) - Method in class org.apache.crunch.types.writable.WritableTypeFamily
-
- deserialized(boolean) - Method in class org.apache.crunch.CachingOptions.Builder
-
- deserialized() - Method in class org.apache.crunch.CachingOptions
-
Whether the data should remain deserialized in the cache, which trades off CPU processing time
for additional storage overhead.
- detach(DoFn<Pair<K, Iterable<V>>, T>, PType<V>) - Static method in class org.apache.crunch.lib.DoFns
-
"Reduce" DoFn wrapper which detaches the values in the iterable, preventing the unexpected behaviour related to
object reuse often observed when using Avro.
- difference(PCollection<T>, PCollection<T>) - Static method in class org.apache.crunch.lib.Set
-
Compute the set difference between two sets of elements.
- disableDeepCopy() - Method in class org.apache.crunch.DoFn
-
By default, Crunch will do a defensive deep copy of the outputs of a
DoFn when there are multiple downstream consumers of that item, in order to
prevent the downstream functions from making concurrent modifications to
data objects.
- DIST_CACHE_REPLICATION - Static variable in class org.apache.crunch.util.DistCache
-
Configuration key for setting the replication factor for files distributed using the Crunch
DistCache helper class.
- DistCache - Class in org.apache.crunch.util
-
Provides functions for working with Hadoop's distributed cache.
- DistCache() - Constructor for class org.apache.crunch.util.DistCache
-
- Distinct - Class in org.apache.crunch.lib
-
Functions for computing the distinct elements of a PCollection.
- distinct(PCollection<S>) - Static method in class org.apache.crunch.lib.Distinct
-
Construct a new PCollection that contains the unique elements of a given input PCollection.
- distinct(PTable<K, V>) - Static method in class org.apache.crunch.lib.Distinct
-
A PTable<K, V> analogue of the distinct function.
- distinct(PCollection<S>, int) - Static method in class org.apache.crunch.lib.Distinct
-
A distinct operation that gives the client more control over how frequently elements are flushed to disk in order to allow control over performance or memory consumption.
- distinct(PTable<K, V>, int) - Static method in class org.apache.crunch.lib.Distinct
-
A PTable<K, V> analogue of the distinct function.
- distributed(PTable<K, V>, double, double...) - Static method in class org.apache.crunch.lib.Quantiles
-
Calculate a set of quantiles for each key in a numerically-valued table.
- DistributedPipeline - Class in org.apache.crunch.impl.dist
-
- DistributedPipeline(String, Configuration, PCollectionFactory) - Constructor for class org.apache.crunch.impl.dist.DistributedPipeline
-
Instantiate with a custom name and configuration.
- DoCollection<S> - Class in org.apache.crunch.impl.spark.collect
-
- DoFn<S,T> - Class in org.apache.crunch
-
Base class for all data processing functions in Crunch.
- DoFn() - Constructor for class org.apache.crunch.DoFn
-
- DoFnIterator<S,T> - Class in org.apache.crunch.util
-
An Iterator<T> that combines a delegate Iterator<S> and a DoFn<S, T>, generating data by passing the contents of the iterator through the function.
- DoFnIterator(Iterator<S>, DoFn<S, T>) - Constructor for class org.apache.crunch.util.DoFnIterator
-
- DoFns - Class in org.apache.crunch.lib
-
- DoFns() - Constructor for class org.apache.crunch.lib.DoFns
-
- done() - Method in class org.apache.crunch.impl.dist.DistributedPipeline
-
- done() - Method in class org.apache.crunch.impl.mem.MemPipeline
-
- done() - Method in class org.apache.crunch.impl.spark.SparkPipeline
-
- done() - Method in interface org.apache.crunch.Pipeline
-
Run any remaining jobs required to generate outputs and then clean up any intermediate data files that were created in this run or previous calls to run.
- DONE - Static variable in class org.apache.crunch.PipelineResult
-
- done() - Method in class org.apache.crunch.util.CrunchTool
-
- DoTable<K,V> - Class in org.apache.crunch.impl.spark.collect
-
- doubles() - Static method in class org.apache.crunch.types.avro.Avros
-
- doubles() - Method in class org.apache.crunch.types.avro.AvroTypeFamily
-
- doubles() - Method in interface org.apache.crunch.types.PTypeFamily
-
- doubles() - Static method in class org.apache.crunch.types.writable.Writables
-
- doubles() - Method in class org.apache.crunch.types.writable.WritableTypeFamily
-
- drop(Integer...) - Method in class org.apache.crunch.contrib.text.TokenizerFactory.Builder
-
Drop the specified fields found by the input scanner, counting from zero.
- factory() - Method in interface org.apache.crunch.lambda.LCollection
-
- FileNamingScheme - Interface in org.apache.crunch.io
-
Encapsulates rules for naming output files.
- FileReaderFactory<T> - Interface in org.apache.crunch.io
-
- filter(FilterFn<S>) - Method in class org.apache.crunch.impl.dist.collect.PCollectionImpl
-
- filter(String, FilterFn<S>) - Method in class org.apache.crunch.impl.dist.collect.PCollectionImpl
-
- filter(FilterFn<Pair<K, V>>) - Method in class org.apache.crunch.impl.dist.collect.PTableBase
-
- filter(String, FilterFn<Pair<K, V>>) - Method in class org.apache.crunch.impl.dist.collect.PTableBase
-
- filter(SPredicate<S>) - Method in interface org.apache.crunch.lambda.LCollection
-
Filter the collection using the supplied predicate.
- filter(SPredicate<Pair<K, V>>) - Method in interface org.apache.crunch.lambda.LTable
-
Filter the rows of the table using the supplied predicate.
- filter(FilterFn<S>) - Method in interface org.apache.crunch.PCollection
-
Apply the given filter function to this instance and return the resulting PCollection.
- filter(String, FilterFn<S>) - Method in interface org.apache.crunch.PCollection
-
Apply the given filter function to this instance and return the resulting PCollection.
- filter(FilterFn<Pair<K, V>>) - Method in interface org.apache.crunch.PTable
-
Apply the given filter function to this instance and return the resulting PTable.
- filter(String, FilterFn<Pair<K, V>>) - Method in interface org.apache.crunch.PTable
-
Apply the given filter function to this instance and return the resulting PTable.
- filterByKey(SPredicate<K>) - Method in interface org.apache.crunch.lambda.LTable
-
Filter the rows of the table using the supplied predicate applied to the key part of each record.
- filterByValue(SPredicate<V>) - Method in interface org.apache.crunch.lambda.LTable
-
Filter the rows of the table using the supplied predicate applied to the value part of each record.
- FilterFn<T> - Class in org.apache.crunch
-
A DoFn for the common case of filtering the members of a PCollection based on a boolean condition.
- FilterFn() - Constructor for class org.apache.crunch.FilterFn
-
- FilterFns - Class in org.apache.crunch.fn
-
A collection of pre-defined FilterFn implementations.
- filterMap(SFunction<S, Optional<T>>, PType<T>) - Method in interface org.apache.crunch.lambda.LCollection
-
Combination of a filter and map operation by using a function with an Optional return type.
- filterMap(SFunction<S, Optional<Pair<K, V>>>, PTableType<K, V>) - Method in interface org.apache.crunch.lambda.LCollection
-
Combination of a filter and map operation by using a function with an Optional return type.
- findContainingJar(Class<?>) - Static method in class org.apache.crunch.util.DistCache
-
Finds the path to a jar that contains the class provided, if any.
- findCounter(Enum<?>) - Method in class org.apache.crunch.PipelineResult.StageResult
-
- findPartition(K) - Method in class org.apache.crunch.lib.sort.TotalOrderPartitioner.BinarySearchNode
-
- findPartition(T) - Method in interface org.apache.crunch.lib.sort.TotalOrderPartitioner.Node
-
Locate partition in keyset K, such that [Ki..Ki+1) defines a partition, with implicit K0 = -inf, Kn = +inf, and |K| = #partitions - 1.
- first() - Method in class org.apache.crunch.impl.dist.collect.PCollectionImpl
-
- first() - Method in class org.apache.crunch.Pair
-
- first() - Method in interface org.apache.crunch.PCollection
-
- first() - Method in class org.apache.crunch.Tuple3
-
- first() - Method in class org.apache.crunch.Tuple4
-
- FIRST_N(int) - Static method in class org.apache.crunch.fn.Aggregators
-
Return the first n values (or fewer if there are fewer values than n).
- flatMap(SFunction<S, Stream<T>>, PType<T>) - Method in interface org.apache.crunch.lambda.LCollection
-
Map each element to zero or more output elements using the provided stream-returning function.
- flatMap(SFunction<S, Stream<Pair<K, V>>>, PTableType<K, V>) - Method in interface org.apache.crunch.lambda.LCollection
-
Map each element to zero or more output elements using the provided stream-returning function to yield an LTable.
- FlatMapIndexFn<S,T> - Class in org.apache.crunch.impl.spark.fn
-
- FlatMapIndexFn(DoFn<S, T>, boolean, SparkRuntimeContext) - Constructor for class org.apache.crunch.impl.spark.fn.FlatMapIndexFn
-
- FlatMapPairDoFn<K,V,T> - Class in org.apache.crunch.impl.spark.fn
-
- FlatMapPairDoFn(DoFn<Pair<K, V>, T>, SparkRuntimeContext) - Constructor for class org.apache.crunch.impl.spark.fn.FlatMapPairDoFn
-
- floats() - Static method in class org.apache.crunch.types.avro.Avros
-
- floats() - Method in class org.apache.crunch.types.avro.AvroTypeFamily
-
- floats() - Method in interface org.apache.crunch.types.PTypeFamily
-
- floats() - Static method in class org.apache.crunch.types.writable.Writables
-
- floats() - Method in class org.apache.crunch.types.writable.WritableTypeFamily
-
- flush() - Method in interface org.apache.crunch.Emitter
-
Flushes any values cached by this emitter.
- forAvroSchema(Schema) - Static method in class org.apache.crunch.impl.spark.ByteArrayHelper
-
- forInput(Class<T>) - Static method in class org.apache.crunch.io.FormatBundle
-
- FormatBundle<K> - Class in org.apache.crunch.io
-
A combination of an InputFormat or OutputFormat and any extra configuration information that format class needs to run.
- FormatBundle() - Constructor for class org.apache.crunch.io.FormatBundle
-
- formattedFile(String, Class<? extends FileInputFormat<K, V>>, Class<K>, Class<V>) - Static method in class org.apache.crunch.io.From
-
Creates a TableSource<K, V> for reading data from files that have custom FileInputFormat<K, V> implementations not covered by the provided TableSource and Source factory methods.
- formattedFile(Path, Class<? extends FileInputFormat<K, V>>, Class<K>, Class<V>) - Static method in class org.apache.crunch.io.From
-
Creates a TableSource<K, V> for reading data from files that have custom FileInputFormat<K, V> implementations not covered by the provided TableSource and Source factory methods.
- formattedFile(List<Path>, Class<? extends FileInputFormat<K, V>>, Class<K>, Class<V>) - Static method in class org.apache.crunch.io.From
-
Creates a TableSource<K, V> for reading data from files that have custom FileInputFormat<K, V> implementations not covered by the provided TableSource and Source factory methods.
- formattedFile(String, Class<? extends FileInputFormat<?, ?>>, PType<K>, PType<V>) - Static method in class org.apache.crunch.io.From
-
Creates a TableSource<K, V> for reading data from files that have custom FileInputFormat implementations not covered by the provided TableSource and Source factory methods.
- formattedFile(Path, Class<? extends FileInputFormat<?, ?>>, PType<K>, PType<V>) - Static method in class org.apache.crunch.io.From
-
Creates a TableSource<K, V> for reading data from files that have custom FileInputFormat implementations not covered by the provided TableSource and Source factory methods.
- formattedFile(List<Path>, Class<? extends FileInputFormat<?, ?>>, PType<K>, PType<V>) - Static method in class org.apache.crunch.io.From
-
Creates a TableSource<K, V> for reading data from files that have custom FileInputFormat implementations not covered by the provided TableSource and Source factory methods.
- formattedFile(String, Class<? extends FileOutputFormat<K, V>>) - Static method in class org.apache.crunch.io.To
-
Creates a Target at the given path name that writes data to a custom FileOutputFormat.
- formattedFile(Path, Class<? extends FileOutputFormat<K, V>>) - Static method in class org.apache.crunch.io.To
-
Creates a Target at the given Path that writes data to a custom FileOutputFormat.
- forOutput(Class<T>) - Static method in class org.apache.crunch.io.FormatBundle
-
- fourth() - Method in class org.apache.crunch.Tuple4
-
- From - Class in org.apache.crunch.io
-
Static factory methods for creating common Source types.
- From() - Constructor for class org.apache.crunch.io.From
-
- fromBytes(byte[]) - Method in class org.apache.crunch.impl.spark.serde.AvroSerDe
-
- fromBytes(byte[]) - Method in interface org.apache.crunch.impl.spark.serde.SerDe
-
- fromBytes(byte[]) - Method in class org.apache.crunch.impl.spark.serde.WritableSerDe
-
- fromBytesFunction() - Method in class org.apache.crunch.impl.spark.serde.AvroSerDe
-
- fromBytesFunction() - Method in interface org.apache.crunch.impl.spark.serde.SerDe
-
- fromBytesFunction() - Method in class org.apache.crunch.impl.spark.serde.WritableSerDe
-
- fromConfiguration(Configuration) - Static method in class org.apache.crunch.types.avro.AvroMode
-
- fromSerialized(String, Configuration) - Static method in class org.apache.crunch.io.FormatBundle
-
- fromShuffleConfiguration(Configuration) - Static method in class org.apache.crunch.types.avro.AvroMode
-
- fromType(AvroType<?>) - Static method in class org.apache.crunch.types.avro.AvroMode
-
Creates an
AvroMode
based upon the specified
type
.
- fullJoin(PTable<K, U>, PTable<K, V>) - Static method in class org.apache.crunch.lib.Join
-
Performs a full outer join on the specified
PTable
s.
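The join semantics described above can be sketched in plain Java. This is an illustration only, not the Crunch implementation: the class `FullJoinSketch` and its `fullJoin` method are hypothetical names, the inputs are ordinary maps with one value per key (a real PTable may hold several values per key), and a missing side is represented as null.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.Map;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical helper, not part of Crunch: shows full outer join semantics only.
final class FullJoinSketch {
    /** Joins two maps on their keys; a key missing on one side yields null for that side. */
    static <K extends Comparable<K>, U, V> Map<K, Map.Entry<U, V>> fullJoin(
            Map<K, U> left, Map<K, V> right) {
        // A full outer join keeps every key that appears on either side.
        TreeSet<K> keys = new TreeSet<>(left.keySet());
        keys.addAll(right.keySet());

        Map<K, Map.Entry<U, V>> joined = new TreeMap<>();
        for (K key : keys) {
            // Map.get returns null when the key is absent from that side.
            joined.put(key, new SimpleEntry<>(left.get(key), right.get(key)));
        }
        return joined;
    }
}
```

Keys present on only one side still appear in the result, which is what distinguishes a full outer join from an inner join.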
- FullOuterJoinFn<K,U,V> - Class in org.apache.crunch.lib.join
-
Used to perform the last step of a full outer join.
- FullOuterJoinFn(PType<K>, PType<U>) - Constructor for class org.apache.crunch.lib.join.FullOuterJoinFn
-
- generateKeys(S) - Method in class org.apache.crunch.contrib.bloomfilter.BloomFilterFn
-
- generateOutput(Pipeline) - Method in class org.apache.crunch.PipelineCallable
-
Called by the Pipeline
when this instance is registered with Pipeline#sequentialDo
.
- GENERIC - Static variable in class org.apache.crunch.types.avro.AvroMode
-
Default mode to use for reading and writing Generic
types.
- generics(Schema) - Static method in class org.apache.crunch.types.avro.Avros
-
- generics(Schema) - Method in class org.apache.crunch.types.avro.AvroTypeFamily
-
- get() - Method in class org.apache.crunch.impl.spark.SparkRuntime
-
- get(long, TimeUnit) - Method in class org.apache.crunch.impl.spark.SparkRuntime
-
- get(int) - Method in class org.apache.crunch.Pair
-
- get(int) - Method in class org.apache.crunch.test.Employee
-
- get(int) - Method in class org.apache.crunch.test.Person
-
- get(int) - Method in interface org.apache.crunch.Tuple
-
Returns the Object at the given index.
- get(int) - Method in class org.apache.crunch.Tuple3
-
- get(int) - Method in class org.apache.crunch.Tuple4
-
- get(int) - Method in class org.apache.crunch.TupleN
-
- get(int) - Method in class org.apache.crunch.types.writable.TupleWritable
-
Gets the i-th Writable from the tuple.
- getAge() - Method in class org.apache.crunch.test.Person.Builder
-
Gets the value of the 'age' field.
- getAge() - Method in class org.apache.crunch.test.Person
-
Gets the value of the 'age' field.
- getAllPCollections() - Method in class org.apache.crunch.PipelineCallable
-
Returns the mapping of labels to PCollection dependencies for this instance.
- getAllStructFieldRefs() - Method in class org.apache.crunch.types.orc.TupleObjectInspector
-
- getAllTargets() - Method in class org.apache.crunch.PipelineCallable
-
Returns the mapping of labels to Target dependencies for this instance.
- getByFn() - Method in class org.apache.crunch.lib.sort.SortFns.KeyExtraction
-
- getCategory() - Method in class org.apache.crunch.types.orc.TupleObjectInspector
-
- getClassSchema() - Static method in class org.apache.crunch.test.Employee
-
- getClassSchema() - Static method in class org.apache.crunch.test.Person
-
- getCombineFn() - Method in class org.apache.crunch.impl.spark.SparkRuntime
-
- getCompletionHooks() - Method in class org.apache.crunch.impl.mr.MRPipeline
-
- getConf() - Method in class org.apache.crunch.io.FormatBundle
-
- getConf() - Method in class org.apache.crunch.lib.sort.TotalOrderPartitioner
-
- getConf() - Method in class org.apache.crunch.lib.sort.TupleWritableComparator
-
- getConf() - Method in class org.apache.crunch.types.writable.TupleWritable.Comparator
-
- getConf() - Method in class org.apache.crunch.util.CrunchTool
-
- getConfiguration() - Method in class org.apache.crunch.impl.dist.DistributedPipeline
-
- getConfiguration() - Method in class org.apache.crunch.impl.mem.MemPipeline
-
- getConfiguration() - Method in class org.apache.crunch.impl.spark.SparkRuntime
-
- getConfiguration() - Method in class org.apache.crunch.impl.spark.SparkRuntimeContext
-
- getConfiguration() - Method in interface org.apache.crunch.lambda.LDoFnContext
-
Get the current Hadoop Configuration.
- getConfiguration() - Method in interface org.apache.crunch.Pipeline
-
Returns the Configuration
instance associated with this pipeline.
- getContext() - Method in interface org.apache.crunch.lambda.LDoFnContext
-
Get the underlying TaskInputOutputContext
(for special cases)
- getConverter() - Method in interface org.apache.crunch.Source
-
Returns the Converter
used for mapping the inputs from this instance
into PCollection
or PTable
values.
- getConverter(PType<?>) - Method in interface org.apache.crunch.Target
-
Returns the Converter
to use for mapping from the output PCollection
into the output values expected by this instance.
- getConverter() - Method in class org.apache.crunch.types.avro.AvroType
-
- getConverter() - Method in class org.apache.crunch.types.PGroupedTableType
-
- getConverter() - Method in interface org.apache.crunch.types.PType
-
- getConverter() - Method in class org.apache.crunch.types.writable.WritableType
-
- getCounter(Enum<?>) - Static method in class org.apache.crunch.test.TestCounters
-
- getCounter(String, String) - Static method in class org.apache.crunch.test.TestCounters
-
- getCounter() - Method in class org.apache.hadoop.mapred.SparkCounter
-
- getCounterDisplayName(String, String) - Method in class org.apache.crunch.PipelineResult.StageResult
-
- getCounterDisplayName(Enum<?>) - Method in class org.apache.crunch.PipelineResult.StageResult
-
- getCounterNames() - Method in class org.apache.crunch.PipelineResult.StageResult
-
- getCounters() - Static method in class org.apache.crunch.impl.mem.MemPipeline
-
- getCounters() - Method in class org.apache.crunch.PipelineResult.StageResult
-
- getCounterValue(String, String) - Method in class org.apache.crunch.PipelineResult.StageResult
-
- getCounterValue(Enum<?>) - Method in class org.apache.crunch.PipelineResult.StageResult
-
- getData() - Method in class org.apache.crunch.types.avro.AvroMode
-
Returns a GenericData
instance based on the mode type.
- getData() - Method in interface org.apache.crunch.types.avro.ReaderWriterFactory
-
- getData() - Method in class org.apache.crunch.types.avro.ReflectDataFactory
-
- getDataFileWriter(Path, Configuration) - Static method in class org.apache.crunch.types.avro.AvroOutputFormat
-
- getDefaultConfiguration() - Method in class org.apache.crunch.test.TemporaryPath
-
- getDefaultFileSource(Path) - Method in class org.apache.crunch.types.avro.AvroType
-
- getDefaultFileSource(Path) - Method in class org.apache.crunch.types.PGroupedTableType
-
- getDefaultFileSource(Path) - Method in interface org.apache.crunch.types.PType
-
Returns a SourceTarget
that is able to read/write data using the serialization format
specified by this PType
.
- getDefaultFileSource(Path) - Method in class org.apache.crunch.types.writable.WritableType
-
- getDefaultInstance() - Static method in class org.apache.crunch.contrib.text.TokenizerFactory
-
Returns a default TokenizerFactory
that uses whitespace as a delimiter and does
not skip any input fields.
- getDefaultInstance(Class<M>) - Static method in class org.apache.crunch.types.Protos
-
Utility function for creating a default PB Message from a Class object that
works with both protoc 2.3.0 and 2.4.x.
- getDefaultValue() - Method in class org.apache.crunch.contrib.text.AbstractSimpleExtractor
-
- getDefaultValue() - Method in interface org.apache.crunch.contrib.text.Extractor
-
Returns the default value for this Extractor
in case of an
error.
- getDepartment() - Method in class org.apache.crunch.test.Employee.Builder
-
Gets the value of the 'department' field.
- getDepartment() - Method in class org.apache.crunch.test.Employee
-
Gets the value of the 'department' field.
- getDependentJobs() - Method in interface org.apache.crunch.impl.mr.MRJob
-
- getDepth() - Method in class org.apache.crunch.impl.dist.collect.PCollectionImpl
-
- getDetachedValue(PTableType<K, V>, Pair<K, V>) - Static method in class org.apache.crunch.lib.PTables
-
Create a detached value for a table
Pair
.
- getDetachedValue(T) - Method in class org.apache.crunch.types.avro.AvroType
-
- getDetachedValue(T) - Method in interface org.apache.crunch.types.PType
-
Returns a copy of a value (or the value itself) that can safely be retained.
- getDetachedValue(T) - Method in class org.apache.crunch.types.writable.WritableType
-
- getDisplayName() - Method in class org.apache.hadoop.mapred.SparkCounter
-
- getEndTimeMsec() - Method in class org.apache.crunch.PipelineResult.StageResult
-
- getErrorCount() - Method in class org.apache.crunch.contrib.text.ExtractorStats
-
The overall number of records that had some kind of parsing error.
- getFactory() - Method in class org.apache.crunch.impl.dist.DistributedPipeline
-
- getFactory() - Method in class org.apache.crunch.types.avro.AvroMode
-
Returns the factory that will be used for the mode.
- getFamily() - Method in class org.apache.crunch.types.avro.AvroType
-
- getFamily() - Method in class org.apache.crunch.types.PGroupedTableType
-
- getFamily() - Method in interface org.apache.crunch.types.PType
-
Returns the PTypeFamily
that this PType
belongs to.
- getFamily() - Method in class org.apache.crunch.types.writable.WritableType
-
- getFieldErrors() - Method in class org.apache.crunch.contrib.text.ExtractorStats
-
Returns the number of errors that occurred when parsing the individual fields of
a composite record type, like a Pair
or TupleN
.
- getFile(String) - Method in class org.apache.crunch.test.TemporaryPath
-
Get a
File
below the temporary directory.
- getFileName(String) - Method in class org.apache.crunch.test.TemporaryPath
-
Get an absolute file name below the temporary directory.
- getFileNamingScheme() - Method in interface org.apache.crunch.io.PathTarget
-
Get the naming scheme to be used for outputs being written to an output
path.
- getFirst() - Method in class org.apache.crunch.fn.CompositeMapFn
-
- getFormatClass() - Method in class org.apache.crunch.io.FormatBundle
-
- getFormatNodeMap(JobContext) - Static method in class org.apache.crunch.io.CrunchInputs
-
- getGroupedDetachedValue(PGroupedTableType<K, V>, Pair<K, Iterable<V>>) - Static method in class org.apache.crunch.lib.PTables
-
- getGroupedTableType() - Method in class org.apache.crunch.impl.dist.collect.BaseGroupedTable
-
- getGroupedTableType() - Method in interface org.apache.crunch.PGroupedTable
-
Return the PGroupedTableType
containing serialization information for
this PGroupedTable
.
- getGroupedTableType() - Method in interface org.apache.crunch.types.PTableType
-
Returns the grouped table version of this type.
- getGroupingComparator(PTypeFamily) - Static method in class org.apache.crunch.lib.join.JoinUtils
-
- getGroupingComparatorClass() - Method in class org.apache.crunch.GroupingOptions
-
- getGroupingConverter() - Method in class org.apache.crunch.types.PGroupedTableType
-
- getIndex() - Method in class org.apache.crunch.types.writable.UnionWritable
-
- getIndex() - Method in class org.apache.crunch.Union
-
Returns the index of the original data source for this union type.
- getInputMapFn() - Method in class org.apache.crunch.types.avro.AvroType
-
- getInputMapFn() - Method in interface org.apache.crunch.types.PType
-
- getInputMapFn() - Method in class org.apache.crunch.types.writable.WritableType
-
- getInstance() - Static method in class org.apache.crunch.fn.IdentityFn
-
- getInstance() - Static method in class org.apache.crunch.impl.mem.MemPipeline
-
- getInstance() - Static method in class org.apache.crunch.io.SequentialFileNamingScheme
-
- getInstance() - Static method in class org.apache.crunch.types.avro.AvroTypeFamily
-
- getInstance() - Static method in class org.apache.crunch.types.writable.TupleWritable.Comparator
-
- getInstance() - Static method in class org.apache.crunch.types.writable.WritableTypeFamily
-
- getJavaRDDLike(SparkRuntime) - Method in class org.apache.crunch.impl.spark.collect.CreatedCollection
-
- getJavaRDDLike(SparkRuntime) - Method in class org.apache.crunch.impl.spark.collect.CreatedTable
-
- getJavaRDDLike(SparkRuntime) - Method in class org.apache.crunch.impl.spark.collect.DoCollection
-
- getJavaRDDLike(SparkRuntime) - Method in class org.apache.crunch.impl.spark.collect.DoTable
-
- getJavaRDDLike(SparkRuntime) - Method in class org.apache.crunch.impl.spark.collect.EmptyPCollection
-
- getJavaRDDLike(SparkRuntime) - Method in class org.apache.crunch.impl.spark.collect.EmptyPTable
-
- getJavaRDDLike(SparkRuntime) - Method in class org.apache.crunch.impl.spark.collect.InputCollection
-
- getJavaRDDLike(SparkRuntime) - Method in class org.apache.crunch.impl.spark.collect.InputTable
-
- getJavaRDDLike(SparkRuntime) - Method in class org.apache.crunch.impl.spark.collect.PGroupedTableImpl
-
- getJavaRDDLike(SparkRuntime) - Method in class org.apache.crunch.impl.spark.collect.UnionCollection
-
- getJavaRDDLike(SparkRuntime) - Method in class org.apache.crunch.impl.spark.collect.UnionTable
-
- getJavaRDDLike(SparkRuntime) - Method in interface org.apache.crunch.impl.spark.SparkCollection
-
- getJob() - Method in interface org.apache.crunch.impl.mr.MRJob
-
- getJobEndTimeMsec() - Method in class org.apache.crunch.PipelineResult.StageResult
-
- getJobID() - Method in interface org.apache.crunch.impl.mr.MRJob
-
- getJobs() - Method in interface org.apache.crunch.impl.mr.MRPipelineExecution
-
- getJobStartTimeMsec() - Method in class org.apache.crunch.PipelineResult.StageResult
-
- getJobState() - Method in interface org.apache.crunch.impl.mr.MRJob
-
- getJoinType() - Method in class org.apache.crunch.lib.join.FullOuterJoinFn
-
- getJoinType() - Method in class org.apache.crunch.lib.join.InnerJoinFn
-
- getJoinType() - Method in class org.apache.crunch.lib.join.JoinFn
-
- getJoinType() - Method in class org.apache.crunch.lib.join.LeftOuterJoinFn
-
- getJoinType() - Method in class org.apache.crunch.lib.join.RightOuterJoinFn
-
- getKeyClass() - Method in interface org.apache.crunch.types.Converter
-
- getKeyType() - Method in class org.apache.crunch.impl.dist.collect.PTableBase
-
- getKeyType() - Method in class org.apache.crunch.lib.sort.SortFns.KeyExtraction
-
- getKeyType() - Method in interface org.apache.crunch.PTable
-
Returns the PType
of the key.
- getKeyType() - Method in interface org.apache.crunch.types.PTableType
-
Returns the key type for the table.
- getLastModifiedAt(Configuration) - Method in class org.apache.crunch.contrib.io.jdbc.DataBaseSource
-
- getLastModifiedAt() - Method in class org.apache.crunch.impl.dist.collect.BaseDoCollection
-
- getLastModifiedAt() - Method in class org.apache.crunch.impl.dist.collect.BaseDoTable
-
- getLastModifiedAt() - Method in class org.apache.crunch.impl.dist.collect.BaseGroupedTable
-
- getLastModifiedAt() - Method in class org.apache.crunch.impl.dist.collect.BaseInputCollection
-
- getLastModifiedAt() - Method in class org.apache.crunch.impl.dist.collect.BaseInputTable
-
- getLastModifiedAt() - Method in class org.apache.crunch.impl.dist.collect.BaseUnionCollection
-
- getLastModifiedAt() - Method in class org.apache.crunch.impl.dist.collect.BaseUnionTable
-
- getLastModifiedAt() - Method in class org.apache.crunch.impl.dist.collect.EmptyPCollection
-
- getLastModifiedAt() - Method in class org.apache.crunch.impl.dist.collect.EmptyPTable
-
- getLastModifiedAt() - Method in class org.apache.crunch.impl.dist.collect.PCollectionImpl
-
The time of the most recent modification to one of the input sources to the collection.
- getLastModifiedAt() - Method in class org.apache.crunch.impl.spark.collect.CreatedCollection
-
- getLastModifiedAt() - Method in class org.apache.crunch.impl.spark.collect.CreatedTable
-
- getLastModifiedAt(FileSystem, Path) - Static method in class org.apache.crunch.io.SourceTargetHelper
-
- getLastModifiedAt(Configuration) - Method in interface org.apache.crunch.Source
-
Returns the time (in milliseconds) that this Source
was most recently
modified (e.g., because an input file was edited or new files were added to
a directory).
- getMapOutputName(Configuration, Path) - Method in interface org.apache.crunch.io.FileNamingScheme
-
Get the output file name for a map task.
- getMapOutputName(Configuration, Path) - Method in class org.apache.crunch.io.SequentialFileNamingScheme
-
- getMaterializedAt() - Method in class org.apache.crunch.impl.dist.collect.PCollectionImpl
-
- getMaterializeSourceTarget(PCollection<T>) - Method in class org.apache.crunch.impl.dist.DistributedPipeline
-
Retrieve a ReadableSourceTarget that provides access to the contents of a
PCollection
.
- getMessage() - Method in class org.apache.crunch.PipelineCallable
-
Returns a message associated with this callable's execution, especially in case of errors.
- getModeProperties() - Method in class org.apache.crunch.types.avro.AvroMode
-
Returns the entries that a Configuration
instance needs to enable
this AvroMode as a serializable map of key-value pairs.
- getName() - Method in class org.apache.crunch.CreateOptions
-
- getName() - Method in class org.apache.crunch.impl.dist.collect.PCollectionImpl
-
- getName() - Method in class org.apache.crunch.impl.dist.DistributedPipeline
-
- getName() - Method in class org.apache.crunch.impl.mem.MemPipeline
-
- getName() - Method in class org.apache.crunch.io.FormatBundle
-
- getName() - Method in interface org.apache.crunch.PCollection
-
Returns a shorthand name for this PCollection.
- getName() - Method in interface org.apache.crunch.Pipeline
-
Returns the name of this pipeline.
- getName() - Method in class org.apache.crunch.PipelineCallable
-
Returns the name of this instance.
- getName() - Method in class org.apache.crunch.test.Employee.Builder
-
Gets the value of the 'name' field.
- getName() - Method in class org.apache.crunch.test.Employee
-
Gets the value of the 'name' field.
- getName() - Method in class org.apache.crunch.test.Person.Builder
-
Gets the value of the 'name' field.
- getName() - Method in class org.apache.crunch.test.Person
-
Gets the value of the 'name' field.
- getName() - Method in class org.apache.hadoop.mapred.SparkCounter
-
- getNamedDotFiles() - Method in class org.apache.crunch.impl.spark.SparkRuntime
-
- getNamedDotFiles() - Method in interface org.apache.crunch.PipelineExecution
-
Returns all .dot files that allow a client to graph the Crunch execution plan internals.
- getNamedOutputs(Configuration) - Static method in class org.apache.crunch.io.CrunchOutputs
-
- getNextAnonymousStageId() - Method in class org.apache.crunch.impl.dist.DistributedPipeline
-
- getNumReducers() - Method in class org.apache.crunch.GroupingOptions
-
- getNumShards(K) - Method in interface org.apache.crunch.lib.join.ShardedJoinStrategy.ShardingStrategy
-
Retrieve the number of shards over which the given key should be split.
- getOnlyParent() - Method in class org.apache.crunch.impl.dist.collect.PCollectionImpl
-
- getOutputCommitter(TaskAttemptContext) - Static method in class org.apache.crunch.io.CrunchOutputs
-
- getOutputMapFn() - Method in class org.apache.crunch.types.avro.AvroType
-
- getOutputMapFn() - Method in interface org.apache.crunch.types.PType
-
- getOutputMapFn() - Method in class org.apache.crunch.types.writable.WritableType
-
- getParallelDoOptions() - Method in class org.apache.crunch.impl.dist.collect.PCollectionImpl
-
- getParallelism() - Method in class org.apache.crunch.CreateOptions
-
- getParents() - Method in class org.apache.crunch.impl.dist.collect.BaseDoCollection
-
- getParents() - Method in class org.apache.crunch.impl.dist.collect.BaseDoTable
-
- getParents() - Method in class org.apache.crunch.impl.dist.collect.BaseGroupedTable
-
- getParents() - Method in class org.apache.crunch.impl.dist.collect.BaseInputCollection
-
- getParents() - Method in class org.apache.crunch.impl.dist.collect.BaseInputTable
-
- getParents() - Method in class org.apache.crunch.impl.dist.collect.BaseUnionCollection
-
- getParents() - Method in class org.apache.crunch.impl.dist.collect.BaseUnionTable
-
- getParents() - Method in class org.apache.crunch.impl.dist.collect.EmptyPCollection
-
- getParents() - Method in class org.apache.crunch.impl.dist.collect.EmptyPTable
-
- getParents() - Method in class org.apache.crunch.impl.dist.collect.PCollectionImpl
-
- getParents() - Method in class org.apache.crunch.impl.spark.collect.CreatedCollection
-
- getParents() - Method in class org.apache.crunch.impl.spark.collect.CreatedTable
-
- getPartition(Object) - Method in class org.apache.crunch.impl.spark.SparkPartitioner
-
- getPartition(Object, Object, int) - Method in class org.apache.crunch.lib.join.JoinUtils.AvroIndexedRecordPartitioner
-
- getPartition(TupleWritable, Writable, int) - Method in class org.apache.crunch.lib.join.JoinUtils.TupleWritablePartitioner
-
- getPartition(K, V, int) - Method in class org.apache.crunch.lib.sort.TotalOrderPartitioner
-
- getPartitionerClass() - Method in class org.apache.crunch.GroupingOptions
-
- getPartitionerClass(PTypeFamily) - Static method in class org.apache.crunch.lib.join.JoinUtils
-
- getPartitionFile(Configuration) - Static method in class org.apache.crunch.lib.sort.TotalOrderPartitioner
-
- getPath() - Method in interface org.apache.crunch.io.PathTarget
-
- getPath(String) - Method in class org.apache.crunch.test.TemporaryPath
-
Get a Path
below the temporary directory.
- getPathSize(Configuration, Path) - Static method in class org.apache.crunch.io.SourceTargetHelper
-
- getPathSize(FileSystem, Path) - Static method in class org.apache.crunch.io.SourceTargetHelper
-
- getPathToCacheFile(Path, Configuration) - Static method in class org.apache.crunch.util.DistCache
-
- getPipeline() - Method in class org.apache.crunch.impl.dist.collect.PCollectionImpl
-
- getPipeline() - Method in interface org.apache.crunch.PCollection
-
Returns the Pipeline
associated with this PCollection.
- getPlanDotFile() - Method in class org.apache.crunch.impl.spark.SparkRuntime
-
- getPlanDotFile() - Method in interface org.apache.crunch.PipelineExecution
-
Returns the .dot file that allows a client to graph the Crunch execution plan for this
pipeline.
- getPrepareHooks() - Method in class org.apache.crunch.impl.mr.MRPipeline
-
- getPrimitiveType(Class<T>) - Static method in class org.apache.crunch.types.avro.Avros
-
- getPrimitiveType(Class<T>) - Static method in class org.apache.crunch.types.writable.Writables
-
- getPTableType() - Method in class org.apache.crunch.impl.dist.collect.BaseDoTable
-
- getPTableType() - Method in class org.apache.crunch.impl.dist.collect.BaseInputTable
-
- getPTableType() - Method in class org.apache.crunch.impl.dist.collect.BaseUnionTable
-
- getPTableType() - Method in class org.apache.crunch.impl.dist.collect.EmptyPTable
-
- getPTableType() - Method in class org.apache.crunch.impl.spark.collect.CreatedTable
-
- getPTableType() - Method in interface org.apache.crunch.PTable
-
Returns the PTableType
of this PTable
.
- getPType(PTypeFamily) - Method in interface org.apache.crunch.contrib.text.Extractor
-
Returns the PType
associated with this data type for the
given PTypeFamily
.
- getPType() - Method in class org.apache.crunch.impl.dist.collect.BaseDoCollection
-
- getPType() - Method in class org.apache.crunch.impl.dist.collect.BaseDoTable
-
- getPType() - Method in class org.apache.crunch.impl.dist.collect.BaseGroupedTable
-
- getPType() - Method in class org.apache.crunch.impl.dist.collect.BaseInputCollection
-
- getPType() - Method in class org.apache.crunch.impl.dist.collect.BaseInputTable
-
- getPType() - Method in class org.apache.crunch.impl.dist.collect.BaseUnionCollection
-
- getPType() - Method in class org.apache.crunch.impl.dist.collect.BaseUnionTable
-
- getPType() - Method in class org.apache.crunch.impl.dist.collect.EmptyPCollection
-
- getPType() - Method in class org.apache.crunch.impl.dist.collect.EmptyPTable
-
- getPType() - Method in class org.apache.crunch.impl.spark.collect.CreatedCollection
-
- getPType() - Method in class org.apache.crunch.impl.spark.collect.CreatedTable
-
- getPType() - Method in interface org.apache.crunch.PCollection
-
Returns the PType
of this PCollection
.
- getReader(Schema) - Method in class org.apache.crunch.types.avro.AvroMode
-
Creates a DatumReader
based on the schema
.
- getReader(Schema) - Method in interface org.apache.crunch.types.avro.ReaderWriterFactory
-
- getReader(Schema) - Method in class org.apache.crunch.types.avro.ReflectDataFactory
-
- getRecommendedPartitions(PCollection<T>) - Static method in class org.apache.crunch.util.PartitionUtils
-
- getRecommendedPartitions(PCollection<T>, Configuration) - Static method in class org.apache.crunch.util.PartitionUtils
-
- getRecordType() - Method in class org.apache.crunch.types.avro.AvroType
-
- getRecordWriter(TaskAttemptContext) - Method in class org.apache.crunch.types.avro.AvroOutputFormat
-
- getRecordWriter(TaskAttemptContext) - Method in class org.apache.crunch.types.avro.AvroPathPerKeyOutputFormat
-
- getRecordWriter(TaskAttemptContext) - Method in class org.apache.crunch.types.avro.AvroTextOutputFormat
-
- getReduceOutputName(Configuration, Path, int) - Method in interface org.apache.crunch.io.FileNamingScheme
-
Get the output file name for a reduce task.
- getReduceOutputName(Configuration, Path, int) - Method in class org.apache.crunch.io.SequentialFileNamingScheme
-
- getReflectDataFactory(Configuration) - Static method in class org.apache.crunch.types.avro.Avros
-
- getResult() - Method in class org.apache.crunch.impl.spark.SparkRuntime
-
- getResult() - Method in interface org.apache.crunch.PipelineExecution
-
Retrieve the result of a pipeline if it has been completed, otherwise null
.
- getRootFile() - Method in class org.apache.crunch.test.TemporaryPath
-
Get the root directory which will be deleted automatically.
- getRootFileName() - Method in class org.apache.crunch.test.TemporaryPath
-
Get the root directory as an absolute file name.
- getRootPath() - Method in class org.apache.crunch.test.TemporaryPath
-
Get the root directory as a Path
.
- getRuntimeContext() - Method in class org.apache.crunch.impl.spark.SparkRuntime
-
- getSalary() - Method in class org.apache.crunch.test.Employee.Builder
-
Gets the value of the 'salary' field.
- getSalary() - Method in class org.apache.crunch.test.Employee
-
Gets the value of the 'salary' field.
- getSchema() - Method in class org.apache.crunch.test.Employee
-
- getSchema() - Method in class org.apache.crunch.test.Person
-
- getSchema() - Method in class org.apache.crunch.types.avro.AvroType
-
- getSecond() - Method in class org.apache.crunch.fn.CompositeMapFn
-
- getSerializationClass() - Method in class org.apache.crunch.types.writable.WritableType
-
- getSiblingnames() - Method in class org.apache.crunch.test.Person.Builder
-
Gets the value of the 'siblingnames' field.
- getSiblingnames() - Method in class org.apache.crunch.test.Person
-
Gets the value of the 'siblingnames' field.
- getSize(Configuration) - Method in class org.apache.crunch.contrib.io.jdbc.DataBaseSource
-
- getSize() - Method in class org.apache.crunch.impl.dist.collect.PCollectionImpl
-
- getSize() - Method in interface org.apache.crunch.PCollection
-
Returns the size of the data represented by this PCollection
in
bytes.
- getSize(Configuration) - Method in interface org.apache.crunch.Source
-
Returns the number of bytes in this Source
.
- getSortComparatorClass() - Method in class org.apache.crunch.GroupingOptions
-
- getSource() - Method in class org.apache.crunch.impl.dist.collect.BaseInputCollection
-
- getSource() - Method in class org.apache.crunch.impl.dist.collect.BaseInputTable
-
- getSourceTargets() - Method in class org.apache.crunch.GroupingOptions
-
- getSourceTargets() - Method in class org.apache.crunch.ParallelDoOptions
-
Deprecated.
- getSourceTargets() - Method in interface org.apache.crunch.ReadableData
-
- getSourceTargets() - Method in class org.apache.crunch.util.DelegatingReadableData
-
- getSourceTargets() - Method in class org.apache.crunch.util.UnionReadableData
-
- getSparkContext() - Method in class org.apache.crunch.impl.spark.SparkRuntime
-
- getSpecificClassLoader() - Static method in class org.apache.crunch.types.avro.AvroMode
-
Get the configured ClassLoader
to be used for loading Avro org.apache.avro.specific.SpecificRecord
and reflection implementation classes.
- getStageId() - Method in class org.apache.crunch.PipelineResult.StageResult
-
- getStageName() - Method in class org.apache.crunch.PipelineResult.StageResult
-
- getStageResults() - Method in class org.apache.crunch.PipelineResult
-
- getStartTimeMsec() - Method in class org.apache.crunch.PipelineResult.StageResult
-
- getStats() - Method in class org.apache.crunch.contrib.text.AbstractCompositeExtractor
-
- getStats() - Method in class org.apache.crunch.contrib.text.AbstractSimpleExtractor
-
- getStats() - Method in interface org.apache.crunch.contrib.text.Extractor
-
Return statistics about how many errors this Extractor
instance
encountered while parsing input data.
- getStatus() - Method in class org.apache.crunch.impl.spark.SparkRuntime
-
- getStatus() - Method in interface org.apache.crunch.PipelineExecution
-
- getStorageLevel(PCollection<?>) - Method in class org.apache.crunch.impl.spark.SparkRuntime
-
- getStructFieldData(Object, StructField) - Method in class org.apache.crunch.types.orc.TupleObjectInspector
-
- getStructFieldRef(String) - Method in class org.apache.crunch.types.orc.TupleObjectInspector
-
- getStructFieldsDataAsList(Object) - Method in class org.apache.crunch.types.orc.TupleObjectInspector
-
- getSubTypes() - Method in class org.apache.crunch.types.avro.AvroType
-
- getSubTypes() - Method in class org.apache.crunch.types.PGroupedTableType
-
- getSubTypes() - Method in interface org.apache.crunch.types.PType
-
Returns the sub-types that make up this PType if it is a composite instance, such as a tuple.
- getSubTypes() - Method in class org.apache.crunch.types.writable.WritableType
-
- getTableType() - Method in interface org.apache.crunch.TableSource
-
- getTableType() - Method in class org.apache.crunch.types.PGroupedTableType
-
- getTargetDependencies() - Method in class org.apache.crunch.impl.dist.collect.BaseGroupedTable
-
- getTargetDependencies() - Method in class org.apache.crunch.impl.dist.collect.PCollectionImpl
-
- getTargets() - Method in class org.apache.crunch.ParallelDoOptions
-
- getTestContext(Configuration) - Static method in class org.apache.crunch.test.CrunchTestSupport
-
The method creates a TaskInputOutputContext which can be used
in unit tests.
- getTupleFactory(Class<T>) - Static method in class org.apache.crunch.types.TupleFactory
-
- getType() - Method in interface org.apache.crunch.Source
-
Returns the PType
for this source.
- getTypeClass() - Method in class org.apache.crunch.types.avro.AvroType
-
- getTypeClass() - Method in interface org.apache.crunch.types.PType
-
Returns the Java type represented by this PType
.
- getTypeClass() - Method in class org.apache.crunch.types.writable.WritableType
-
- getTypeFamily() - Method in class org.apache.crunch.impl.dist.collect.PCollectionImpl
-
- getTypeFamily() - Method in interface org.apache.crunch.PCollection
-
Returns the PTypeFamily
of this PCollection
.
- getTypeInfo(Class<?>) - Static method in class org.apache.crunch.types.orc.OrcUtils
-
Generates a TypeInfo for a given Java class using reflection.
- getTypeName() - Method in class org.apache.crunch.types.orc.TupleObjectInspector
-
- getValue() - Method in interface org.apache.crunch.PObject
-
Gets the value associated with this PObject
.
- getValue() - Method in class org.apache.crunch.types.writable.UnionWritable
-
- getValue() - Method in class org.apache.crunch.Union
-
Returns the underlying object value of the record.
- getValue() - Method in class org.apache.hadoop.mapred.SparkCounter
-
- getValueClass() - Method in interface org.apache.crunch.types.Converter
-
- getValues() - Method in class org.apache.crunch.TupleN
-
- getValueType() - Method in class org.apache.crunch.impl.dist.collect.PTableBase
-
- getValueType() - Method in interface org.apache.crunch.PTable
-
Returns the PType of the value.
- getValueType() - Method in interface org.apache.crunch.types.PTableType
-
Returns the value type for the table.
- getWriter(Schema) - Method in class org.apache.crunch.types.avro.AvroMode
-
Creates a DatumWriter based on the schema.
- getWriter(Schema) - Method in interface org.apache.crunch.types.avro.ReaderWriterFactory
-
- getWriter(Schema) - Method in class org.apache.crunch.types.avro.ReflectDataFactory
-
- globalToplist(PCollection<X>) - Static method in class org.apache.crunch.lib.TopList
-
Create a list of unique items in the input collection with their count, sorted descending by their frequency.
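The toplist idea described above can be illustrated without a cluster. The following is a minimal plain-Java sketch of the semantics only (count each distinct item, then order by descending count); the class and method names `ToplistSketch` and `toplist` are hypothetical and this is not how Crunch implements `TopList.globalToplist`.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration class; not part of Crunch.
final class ToplistSketch {
    /** Counts each distinct item and returns (item, count) entries, most frequent first. */
    static <X> List<Map.Entry<X, Long>> toplist(List<X> items) {
        Map<X, Long> counts = new LinkedHashMap<>();
        for (X item : items) {
            counts.merge(item, 1L, Long::sum);
        }
        List<Map.Entry<X, Long>> entries = new ArrayList<>(counts.entrySet());
        // List.sort is stable, so items with equal counts keep first-seen order.
        entries.sort(Map.Entry.<X, Long>comparingByValue().reversed());
        return entries;
    }

    public static void main(String[] args) {
        System.out.println(toplist(Arrays.asList("a", "b", "a", "c", "a", "b")));
        // prints [a=3, b=2, c=1]
    }
}
```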
- groupByKey() - Method in class org.apache.crunch.impl.dist.collect.PTableBase
-
- groupByKey(int) - Method in class org.apache.crunch.impl.dist.collect.PTableBase
-
- groupByKey(GroupingOptions) - Method in class org.apache.crunch.impl.dist.collect.PTableBase
-
- groupByKey() - Method in interface org.apache.crunch.lambda.LTable
-
- groupByKey(int) - Method in interface org.apache.crunch.lambda.LTable
-
- groupByKey(GroupingOptions) - Method in interface org.apache.crunch.lambda.LTable
-
- groupByKey() - Method in interface org.apache.crunch.PTable
-
Performs a grouping operation on the keys of this table.
- groupByKey(int) - Method in interface org.apache.crunch.PTable
-
Performs a grouping operation on the keys of this table, using the given
number of partitions.
- groupByKey(GroupingOptions) - Method in interface org.apache.crunch.PTable
-
Performs a grouping operation on the keys of this table, using the
additional GroupingOptions to control how the grouping is executed.
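As a rough local analogue of the grouping operation described in the groupByKey entries, the sketch below collects every value that shares a key into one list. It is an in-memory illustration under stated assumptions: `GroupByKeySketch` is a hypothetical name, a `List` of `Map.Entry` stands in for a PTable, and partitioning, sorting, and GroupingOptions are not modelled.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration class; not part of Crunch.
final class GroupByKeySketch {
    /** Groups the values of (key, value) pairs by key; keys come back in sorted order. */
    static <K extends Comparable<K>, V> Map<K, List<V>> groupByKey(List<Map.Entry<K, V>> table) {
        Map<K, List<V>> grouped = new TreeMap<>();
        for (Map.Entry<K, V> e : table) {
            grouped.computeIfAbsent(e.getKey(), k -> new ArrayList<>()).add(e.getValue());
        }
        return grouped;
    }

    public static void main(String[] args) {
        List<Map.Entry<String, Integer>> table = new ArrayList<>();
        table.add(new SimpleEntry<>("a", 1));
        table.add(new SimpleEntry<>("b", 2));
        table.add(new SimpleEntry<>("a", 3));
        System.out.println(groupByKey(table)); // prints {a=[1, 3], b=[2]}
    }
}
```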
- groupedWeightedReservoirSample(PTable<Integer, Pair<T, N>>, int[]) - Static method in class org.apache.crunch.lib.Sample
-
The most general of the weighted reservoir sampling patterns, which allows us to choose
a random sample of elements for each of N input groups.
- groupedWeightedReservoirSample(PTable<Integer, Pair<T, N>>, int[], Long) - Static method in class org.apache.crunch.lib.Sample
-
Same as the other groupedWeightedReservoirSample method, but includes a seed for testing
purposes.
- groupingComparatorClass(Class<? extends RawComparator>) - Method in class org.apache.crunch.GroupingOptions.Builder
-
- GroupingOptions - Class in org.apache.crunch
-
Options that can be passed to a groupByKey operation in order to
exercise finer control over how the partitioning, grouping, and sorting of
keys is performed.
- GroupingOptions.Builder - Class in org.apache.crunch
-
Builder class for creating GroupingOptions instances.
- GuavaUtils - Class in org.apache.crunch.impl.spark
-
- GuavaUtils() - Constructor for class org.apache.crunch.impl.spark.GuavaUtils
-
- gzip(T) - Static method in class org.apache.crunch.io.Compress
-
Configure the given output target to be compressed using Gzip.
- main(String[]) - Static method in class org.apache.crunch.examples.AverageBytesByIP
-
- main(String[]) - Static method in class org.apache.crunch.examples.SecondarySortExample
-
- main(String[]) - Static method in class org.apache.crunch.examples.SortExample
-
- main(String[]) - Static method in class org.apache.crunch.examples.TotalBytesByIP
-
- main(String[]) - Static method in class org.apache.crunch.examples.TotalWordCount
-
- main(String[]) - Static method in class org.apache.crunch.examples.WordAggregationHBase
-
- main(String[]) - Static method in class org.apache.crunch.examples.WordCount
-
- makeTuple(Object...) - Method in class org.apache.crunch.types.TupleFactory
-
- map(R) - Method in class org.apache.crunch.fn.CompositeMapFn
-
- map(V) - Method in class org.apache.crunch.fn.ExtractKeyFn
-
- map(T) - Method in class org.apache.crunch.fn.IdentityFn
-
- map(Pair<K, V>) - Method in class org.apache.crunch.fn.PairMapFn
-
- map(T) - Method in class org.apache.crunch.fn.SDoubleFunction
-
- map(T) - Method in class org.apache.crunch.fn.SFunction
-
- map(Pair<K, V>) - Method in class org.apache.crunch.fn.SFunction2
-
- map(T) - Method in class org.apache.crunch.fn.SPairFunction
-
- map(Pair<V1, V2>) - Method in class org.apache.crunch.fn.SwapFn
-
- map(SFunction<S, T>, PType<T>) - Method in interface org.apache.crunch.lambda.LCollection
-
Map the elements of this collection 1-1 through the supplied function.
- map(SFunction<S, Pair<K, V>>, PTableType<K, V>) - Method in interface org.apache.crunch.lambda.LCollection
-
Map the elements of this collection 1-1 through the supplied function to yield an LTable.
- map(PTable<K1, V1>, Class<? extends Mapper<K1, V1, K2, V2>>, Class<K2>, Class<V2>) - Static method in class org.apache.crunch.lib.Mapred
-
- map(PTable<K1, V1>, Class<? extends Mapper<K1, V1, K2, V2>>, Class<K2>, Class<V2>) - Static method in class org.apache.crunch.lib.Mapreduce
-
- map(V) - Method in class org.apache.crunch.lib.sort.SortFns.AvroGenericFn
-
- map(V) - Method in class org.apache.crunch.lib.sort.SortFns.SingleKeyFn
-
- map(V) - Method in class org.apache.crunch.lib.sort.SortFns.TupleKeyFn
-
- map(S) - Method in class org.apache.crunch.MapFn
-
Maps the given input into an instance of the output type.
- map(Pair<Object, Iterable<Object>>) - Method in class org.apache.crunch.types.PGroupedTableType.PairIterableMapFn
-
- MapDeepCopier<T> - Class in org.apache.crunch.types
-
- MapDeepCopier(PType<T>) - Constructor for class org.apache.crunch.types.MapDeepCopier
-
- MapFn<S,T> - Class in org.apache.crunch
-
A DoFn for the common case of emitting exactly one value for each
input record.
- MapFn() - Constructor for class org.apache.crunch.MapFn
-
- MapFunction - Class in org.apache.crunch.impl.spark.fn
-
- MapFunction(MapFn, SparkRuntimeContext) - Constructor for class org.apache.crunch.impl.spark.fn.MapFunction
-
- mapKeys(MapFn<K, K2>, PType<K2>) - Method in class org.apache.crunch.impl.dist.collect.PTableBase
-
- mapKeys(String, MapFn<K, K2>, PType<K2>) - Method in class org.apache.crunch.impl.dist.collect.PTableBase
-
- mapKeys(SFunction<K, T>, PType<T>) - Method in interface org.apache.crunch.lambda.LTable
-
Transform the keys of this table using the given function
- mapKeys(PTable<K1, V>, MapFn<K1, K2>, PType<K2>) - Static method in class org.apache.crunch.lib.PTables
-
Maps a PTable<K1, V> to a PTable<K2, V> using the given MapFn<K1, K2> on
the keys of the PTable.
- mapKeys(String, PTable<K1, V>, MapFn<K1, K2>, PType<K2>) - Static method in class org.apache.crunch.lib.PTables
-
Maps a PTable<K1, V> to a PTable<K2, V> using the given MapFn<K1, K2> on
the keys of the PTable.
- mapKeys(MapFn<K, K2>, PType<K2>) - Method in interface org.apache.crunch.PTable
-
Returns a PTable that has the same values as this instance, but
uses the given function to map the keys.
- mapKeys(String, MapFn<K, K2>, PType<K2>) - Method in interface org.apache.crunch.PTable
-
Returns a PTable that has the same values as this instance, but
uses the given function to map the keys.
- MapOutputFunction<K,V> - Class in org.apache.crunch.impl.spark.fn
-
- MapOutputFunction(SerDe, SerDe) - Constructor for class org.apache.crunch.impl.spark.fn.MapOutputFunction
-
- Mapred - Class in org.apache.crunch.lib
-
Static functions for working with legacy Mappers and Reducers that live under the org.apache.hadoop.mapred.*
package as part of Crunch pipelines.
- Mapred() - Constructor for class org.apache.crunch.lib.Mapred
-
- Mapreduce - Class in org.apache.crunch.lib
-
Static functions for working with legacy Mappers and Reducers that live under the org.apache.hadoop.mapreduce.*
package as part of Crunch pipelines.
- Mapreduce() - Constructor for class org.apache.crunch.lib.Mapreduce
-
- MapReduceTarget - Interface in org.apache.crunch.io
-
- maps(PType<T>) - Static method in class org.apache.crunch.types.avro.Avros
-
- maps(PType<T>) - Method in class org.apache.crunch.types.avro.AvroTypeFamily
-
- maps(PType<T>) - Method in interface org.apache.crunch.types.PTypeFamily
-
- maps(PType<T>) - Static method in class org.apache.crunch.types.writable.Writables
-
- maps(PType<T>) - Method in class org.apache.crunch.types.writable.WritableTypeFamily
-
- MapsideJoinStrategy<K,U,V> - Class in org.apache.crunch.lib.join
-
Utility for doing map side joins on a common key between two PTables.
- MapsideJoinStrategy() - Constructor for class org.apache.crunch.lib.join.MapsideJoinStrategy
-
- MapsideJoinStrategy(boolean) - Constructor for class org.apache.crunch.lib.join.MapsideJoinStrategy
-
- mapValues(MapFn<Iterable<V>, U>, PType<U>) - Method in class org.apache.crunch.impl.dist.collect.BaseGroupedTable
-
- mapValues(String, MapFn<Iterable<V>, U>, PType<U>) - Method in class org.apache.crunch.impl.dist.collect.BaseGroupedTable
-
- mapValues(MapFn<V, U>, PType<U>) - Method in class org.apache.crunch.impl.dist.collect.PTableBase
-
- mapValues(String, MapFn<V, U>, PType<U>) - Method in class org.apache.crunch.impl.dist.collect.PTableBase
-
- mapValues(SFunction<Stream<V>, T>, PType<T>) - Method in interface org.apache.crunch.lambda.LGroupedTable
-
Map the values in this LGroupedTable using a custom function.
- mapValues(SFunction<V, T>, PType<T>) - Method in interface org.apache.crunch.lambda.LTable
-
Transform the values of this table using the given function
- mapValues(PTable<K, U>, MapFn<U, V>, PType<V>) - Static method in class org.apache.crunch.lib.PTables
-
Maps a PTable<K, U> to a PTable<K, V> using the given MapFn<U, V> on
the values of the PTable.
- mapValues(String, PTable<K, U>, MapFn<U, V>, PType<V>) - Static method in class org.apache.crunch.lib.PTables
-
Maps a PTable<K, U> to a PTable<K, V> using the given MapFn<U, V> on
the values of the PTable.
- mapValues(PGroupedTable<K, U>, MapFn<Iterable<U>, V>, PType<V>) - Static method in class org.apache.crunch.lib.PTables
-
An analogue of the mapValues function for PGroupedTable<K, U> collections.
- mapValues(String, PGroupedTable<K, U>, MapFn<Iterable<U>, V>, PType<V>) - Static method in class org.apache.crunch.lib.PTables
-
An analogue of the mapValues function for PGroupedTable<K, U> collections.
- mapValues(MapFn<Iterable<V>, U>, PType<U>) - Method in interface org.apache.crunch.PGroupedTable
-
Maps the Iterable<V> elements of each record to a new type.
- mapValues(String, MapFn<Iterable<V>, U>, PType<U>) - Method in interface org.apache.crunch.PGroupedTable
-
Maps the Iterable<V> elements of each record to a new type.
- mapValues(MapFn<V, U>, PType<U>) - Method in interface org.apache.crunch.PTable
-
Returns a PTable that has the same keys as this instance, but
uses the given function to map the values.
- mapValues(String, MapFn<V, U>, PType<U>) - Method in interface org.apache.crunch.PTable
-
Returns a PTable that has the same keys as this instance, but
uses the given function to map the values.
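The "same keys, mapped values" behaviour described in the mapValues entries can be pictured with a small in-memory sketch. This is only an illustration: `MapValuesSketch` is a hypothetical name, a `List` of `Map.Entry` stands in for a PTable, and a `java.util.function.Function` stands in for a MapFn.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical illustration class; not part of Crunch.
final class MapValuesSketch {
    /** Returns a new table with every key unchanged and every value passed through fn. */
    static <K, V, U> List<Map.Entry<K, U>> mapValues(List<Map.Entry<K, V>> table, Function<V, U> fn) {
        List<Map.Entry<K, U>> out = new ArrayList<>(table.size());
        for (Map.Entry<K, V> e : table) {
            out.add(new SimpleEntry<>(e.getKey(), fn.apply(e.getValue())));
        }
        return out;
    }

    public static void main(String[] args) {
        List<Map.Entry<String, Integer>> table = new ArrayList<>();
        table.add(new SimpleEntry<>("a", 1));
        table.add(new SimpleEntry<>("b", 2));
        List<Map.Entry<String, Integer>> scaled = mapValues(table, v -> v * 10);
        System.out.println(scaled); // prints [a=10, b=20]
    }
}
```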
- markLogged() - Method in exception org.apache.crunch.CrunchRuntimeException
-
Indicate that this exception has been written to the debug logs.
- materialize() - Method in class org.apache.crunch.impl.dist.collect.PCollectionImpl
-
- materialize(PCollection<T>) - Method in class org.apache.crunch.impl.mem.MemPipeline
-
- materialize(PCollection<T>) - Method in class org.apache.crunch.impl.mr.MRPipeline
-
- materialize(PCollection<T>) - Method in class org.apache.crunch.impl.spark.SparkPipeline
-
- materialize() - Method in interface org.apache.crunch.lambda.LCollection
-
Obtain the contents of this LCollection as a Stream that can be processed locally.
- materialize() - Method in interface org.apache.crunch.PCollection
-
Returns a reference to the data set represented by this PCollection that
may be used by the client to read the data locally.
- materialize(PCollection<T>) - Method in interface org.apache.crunch.Pipeline
-
Create the given PCollection and read the data it contains into the
returned Collection instance for client use.
- materialize(PCollection<T>) - Method in class org.apache.crunch.util.CrunchTool
-
- materializeAt(SourceTarget<S>) - Method in class org.apache.crunch.impl.dist.collect.PCollectionImpl
-
- materializeToMap() - Method in class org.apache.crunch.impl.dist.collect.PTableBase
-
Returns a Map made up of the keys and values in this PTable.
- materializeToMap() - Method in interface org.apache.crunch.PTable
-
Returns a Map made up of the keys and values in this PTable.
- max() - Method in class org.apache.crunch.impl.dist.collect.PCollectionImpl
-
- max(PCollection<S>) - Static method in class org.apache.crunch.lib.Aggregate
-
Returns the largest numerical element from the input collection.
- max() - Method in interface org.apache.crunch.PCollection
-
Returns a PObject of the maximum element of this instance.
- MAX_BIGDECIMALS() - Static method in class org.apache.crunch.fn.Aggregators
-
Return the maximum of all given BigDecimal values.
- MAX_BIGDECIMALS(int) - Static method in class org.apache.crunch.fn.Aggregators
-
Return the n largest BigDecimal values (or fewer if there are fewer
values than n).
- MAX_BIGINTS() - Static method in class org.apache.crunch.fn.Aggregators
-
Return the maximum of all given BigInteger values.
- MAX_BIGINTS(int) - Static method in class org.apache.crunch.fn.Aggregators
-
Return the n largest BigInteger values (or fewer if there are fewer
values than n).
- MAX_COMPARABLES() - Static method in class org.apache.crunch.fn.Aggregators
-
Return the maximum of all given Comparable values.
- MAX_DOUBLES() - Static method in class org.apache.crunch.fn.Aggregators
-
Return the maximum of all given double values.
- MAX_DOUBLES(int) - Static method in class org.apache.crunch.fn.Aggregators
-
Return the n largest double values (or fewer if there are fewer
values than n).
- MAX_FLOATS() - Static method in class org.apache.crunch.fn.Aggregators
-
Return the maximum of all given float values.
- MAX_FLOATS(int) - Static method in class org.apache.crunch.fn.Aggregators
-
Return the n largest float values (or fewer if there are fewer
values than n).
- MAX_INTS() - Static method in class org.apache.crunch.fn.Aggregators
-
Return the maximum of all given int values.
- MAX_INTS(int) - Static method in class org.apache.crunch.fn.Aggregators
-
Return the n largest int values (or fewer if there are fewer
values than n).
- MAX_LONGS() - Static method in class org.apache.crunch.fn.Aggregators
-
Return the maximum of all given long values.
- MAX_LONGS(int) - Static method in class org.apache.crunch.fn.Aggregators
-
Return the n largest long values (or fewer if there are fewer
values than n).
- MAX_N(int, Class<V>) - Static method in class org.apache.crunch.fn.Aggregators
-
Return the n largest values (or fewer if there are fewer
values than n).
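The "n largest values (or fewer)" contract shared by the MAX_* entries can be sketched in plain Java with a bounded min-heap. This illustrates the contract only: `MaxNSketch` and `maxN` are hypothetical names, and keeping duplicates is an assumption suggested by the separate MAX_UNIQUE_N entry, not a statement about how the Crunch aggregators are written.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical illustration class; not part of Crunch.
final class MaxNSketch {
    /** Returns up to n of the largest values, largest first; duplicates are kept. */
    static <V extends Comparable<V>> List<V> maxN(int n, Iterable<V> values) {
        PriorityQueue<V> heap = new PriorityQueue<>(); // min-heap: smallest retained value on top
        for (V v : values) {
            heap.add(v);
            if (heap.size() > n) {
                heap.poll(); // evict the smallest so at most n values remain
            }
        }
        List<V> result = new ArrayList<>(heap);
        result.sort(Comparator.reverseOrder());
        return result;
    }

    public static void main(String[] args) {
        System.out.println(maxN(3, Arrays.asList(5, 1, 9, 7, 9, 2))); // prints [9, 9, 7]
        System.out.println(maxN(10, Arrays.asList(2, 1)));            // prints [2, 1]
    }
}
```

The heap never holds more than n + 1 elements, so memory stays bounded by n regardless of how many values are seen.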
- MAX_REDUCERS - Static variable in class org.apache.crunch.util.PartitionUtils
-
Set an upper limit on the number of reducers the Crunch planner will set for an MR
job when it tries to determine how many reducers to use based on the input size.
- MAX_UNIQUE_N(int, Class<V>) - Static method in class org.apache.crunch.fn.Aggregators
-
Return the n largest unique values (or fewer if there are fewer
values than n).
- meanValue(PTable<K, V>) - Static method in class org.apache.crunch.lib.Average
-
Calculate the mean average value by key for a table with numeric values.
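A per-key mean like the one described for meanValue amounts to tracking a running sum and count for each key. The sketch below shows that arithmetic in memory; `MeanValueSketch` is a hypothetical name, a `List` of `Map.Entry` stands in for a PTable, and returning `Double` for every numeric input is an assumption of this sketch.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration class; not part of Crunch.
final class MeanValueSketch {
    /** Computes the arithmetic mean of the values for each key. */
    static <K, V extends Number> Map<K, Double> meanValue(List<Map.Entry<K, V>> table) {
        Map<K, double[]> acc = new LinkedHashMap<>(); // key -> {sum, count}
        for (Map.Entry<K, V> e : table) {
            double[] sumAndCount = acc.computeIfAbsent(e.getKey(), k -> new double[2]);
            sumAndCount[0] += e.getValue().doubleValue();
            sumAndCount[1] += 1;
        }
        Map<K, Double> means = new LinkedHashMap<>();
        for (Map.Entry<K, double[]> e : acc.entrySet()) {
            means.put(e.getKey(), e.getValue()[0] / e.getValue()[1]);
        }
        return means;
    }

    public static void main(String[] args) {
        List<Map.Entry<String, Integer>> table = new ArrayList<>();
        table.add(new SimpleEntry<>("a", 1));
        table.add(new SimpleEntry<>("a", 3));
        table.add(new SimpleEntry<>("b", 10));
        System.out.println(meanValue(table)); // prints {a=2.0, b=10.0}
    }
}
```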
- MemPipeline - Class in org.apache.crunch.impl.mem
-
- min() - Method in class org.apache.crunch.impl.dist.collect.PCollectionImpl
-
- min(PCollection<S>) - Static method in class org.apache.crunch.lib.Aggregate
-
Returns the smallest numerical element from the input collection.
- min() - Method in interface org.apache.crunch.PCollection
-
Returns a PObject of the minimum element of this instance.
- MIN_BIGDECIMALS() - Static method in class org.apache.crunch.fn.Aggregators
-
Return the minimum of all given BigDecimal values.
- MIN_BIGDECIMALS(int) - Static method in class org.apache.crunch.fn.Aggregators
-
Return the n smallest BigDecimal values (or fewer if there are fewer
values than n).
- MIN_BIGINTS() - Static method in class org.apache.crunch.fn.Aggregators
-
Return the minimum of all given BigInteger values.
- MIN_BIGINTS(int) - Static method in class org.apache.crunch.fn.Aggregators
-
Return the n smallest BigInteger values (or fewer if there are fewer
values than n).
- MIN_COMPARABLES() - Static method in class org.apache.crunch.fn.Aggregators
-
Return the minimum of all given Comparable values.
- MIN_DOUBLES() - Static method in class org.apache.crunch.fn.Aggregators
-
Return the minimum of all given double values.
- MIN_DOUBLES(int) - Static method in class org.apache.crunch.fn.Aggregators
-
Return the n smallest double values (or fewer if there are fewer
values than n).
- MIN_FLOATS() - Static method in class org.apache.crunch.fn.Aggregators
-
Return the minimum of all given float values.
- MIN_FLOATS(int) - Static method in class org.apache.crunch.fn.Aggregators
-
Return the n smallest float values (or fewer if there are fewer
values than n).
- MIN_INTS() - Static method in class org.apache.crunch.fn.Aggregators
-
Return the minimum of all given int values.
- MIN_INTS(int) - Static method in class org.apache.crunch.fn.Aggregators
-
Return the n smallest int values (or fewer if there are fewer
values than n).
- MIN_LONGS() - Static method in class org.apache.crunch.fn.Aggregators
-
Return the minimum of all given long values.
- MIN_LONGS(int) - Static method in class org.apache.crunch.fn.Aggregators
-
Return the n smallest long values (or fewer if there are fewer
values than n).
- MIN_N(int, Class<V>) - Static method in class org.apache.crunch.fn.Aggregators
-
Return the n smallest values (or fewer if there are fewer
values than n).
- MIN_UNIQUE_N(int, Class<V>) - Static method in class org.apache.crunch.fn.Aggregators
-
Returns the n smallest unique values (or fewer if there are fewer unique values than n).
- MRCollection - Interface in org.apache.crunch.impl.dist.collect
-
- MRJob - Interface in org.apache.crunch.impl.mr
-
A Hadoop MapReduce job managed by Crunch.
- MRJob.State - Enum in org.apache.crunch.impl.mr
-
A job will be in one of the following states.
- MRPipeline - Class in org.apache.crunch.impl.mr
-
Pipeline implementation that is executed within Hadoop MapReduce.
- MRPipeline(Class<?>) - Constructor for class org.apache.crunch.impl.mr.MRPipeline
-
Instantiate with a default Configuration and name.
- MRPipeline(Class<?>, String) - Constructor for class org.apache.crunch.impl.mr.MRPipeline
-
Instantiate with a custom pipeline name.
- MRPipeline(Class<?>, Configuration) - Constructor for class org.apache.crunch.impl.mr.MRPipeline
-
Instantiate with a custom configuration and default naming.
- MRPipeline(Class<?>, String, Configuration) - Constructor for class org.apache.crunch.impl.mr.MRPipeline
-
Instantiate with a custom name and configuration.
- MRPipelineExecution - Interface in org.apache.crunch.impl.mr
-
- of(T, U) - Static method in class org.apache.crunch.Pair
-
- of(A, B, C) - Static method in class org.apache.crunch.Tuple3
-
- of(A, B, C, D) - Static method in class org.apache.crunch.Tuple4
-
- of(Object...) - Static method in class org.apache.crunch.TupleN
-
- OneToManyJoin - Class in org.apache.crunch.lib.join
-
Optimized join for situations where exactly one value is being joined with
any other number of values based on a common key.
- OneToManyJoin() - Constructor for class org.apache.crunch.lib.join.OneToManyJoin
-
- oneToManyJoin(PTable<K, U>, PTable<K, V>, DoFn<Pair<U, Iterable<V>>, T>, PType<T>) - Static method in class org.apache.crunch.lib.join.OneToManyJoin
-
Performs a join on two tables, where the left table only contains a single
value per key.
- oneToManyJoin(PTable<K, U>, PTable<K, V>, DoFn<Pair<U, Iterable<V>>, T>, PType<T>, int) - Static method in class org.apache.crunch.lib.join.OneToManyJoin
-
Supports a user-specified number of reducers for the one-to-many join.
- or(FilterFn<S>, FilterFn<S>) - Static method in class org.apache.crunch.fn.FilterFns
-
Accept an entry if at least one of the given filters accepts it, using short-circuit evaluation.
- or(FilterFn<S>...) - Static method in class org.apache.crunch.fn.FilterFns
-
Accept an entry if at least one of the given filters accepts it, using short-circuit evaluation.
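Short-circuit evaluation here means that filters after the first accepting one are never consulted. A minimal sketch of that behaviour, using `java.util.function.Predicate` as a stand-in for FilterFn (the class name `OrSketch` is hypothetical, and this is not the Crunch implementation):

```java
import java.util.function.Predicate;

// Hypothetical illustration class; not part of Crunch.
final class OrSketch {
    /** Accepts a value as soon as any filter accepts it; later filters are not evaluated. */
    @SafeVarargs
    static <S> Predicate<S> or(Predicate<S>... filters) {
        return value -> {
            for (Predicate<S> filter : filters) {
                if (filter.test(value)) {
                    return true; // short-circuit: skip the remaining filters
                }
            }
            return false;
        };
    }

    public static void main(String[] args) {
        Predicate<String> emptyOrLong = OrSketch.<String>or(String::isEmpty, s -> s.length() > 5);
        System.out.println(emptyOrLong.test(""));       // prints true
        System.out.println(emptyOrLong.test("crunch")); // prints true
        System.out.println(emptyOrLong.test("abc"));    // prints false
    }
}
```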
- Orcs - Class in org.apache.crunch.types.orc
-
Utilities to create PTypes for ORC serialization / deserialization
- Orcs() - Constructor for class org.apache.crunch.types.orc.Orcs
-
- orcs(TypeInfo) - Static method in class org.apache.crunch.types.orc.Orcs
-
Create a PType to directly use OrcStruct as the deserialized format.
- OrcUtils - Class in org.apache.crunch.types.orc
-
- OrcUtils() - Constructor for class org.apache.crunch.types.orc.OrcUtils
-
- order() - Method in class org.apache.crunch.lib.Sort.ColumnOrder
-
- org.apache.crunch - package org.apache.crunch
-
Client-facing API and core abstractions.
- org.apache.crunch.contrib - package org.apache.crunch.contrib
-
User contributions that may be interesting for special applications.
- org.apache.crunch.contrib.bloomfilter - package org.apache.crunch.contrib.bloomfilter
-
Support for creating Bloom Filters.
- org.apache.crunch.contrib.io.jdbc - package org.apache.crunch.contrib.io.jdbc
-
Support for reading data from RDBMS using JDBC
- org.apache.crunch.contrib.text - package org.apache.crunch.contrib.text
-
- org.apache.crunch.examples - package org.apache.crunch.examples
-
Example applications demonstrating various aspects of Crunch.
- org.apache.crunch.fn - package org.apache.crunch.fn
-
Commonly used functions for manipulating collections.
- org.apache.crunch.impl - package org.apache.crunch.impl
-
- org.apache.crunch.impl.dist - package org.apache.crunch.impl.dist
-
- org.apache.crunch.impl.dist.collect - package org.apache.crunch.impl.dist.collect
-
- org.apache.crunch.impl.mem - package org.apache.crunch.impl.mem
-
In-memory Pipeline implementation for rapid prototyping and testing.
- org.apache.crunch.impl.mr - package org.apache.crunch.impl.mr
-
A Pipeline implementation that runs on Hadoop MapReduce.
- org.apache.crunch.impl.spark - package org.apache.crunch.impl.spark
-
- org.apache.crunch.impl.spark.collect - package org.apache.crunch.impl.spark.collect
-
- org.apache.crunch.impl.spark.fn - package org.apache.crunch.impl.spark.fn
-
- org.apache.crunch.impl.spark.serde - package org.apache.crunch.impl.spark.serde
-
- org.apache.crunch.io - package org.apache.crunch.io
-
Data input and output for Pipelines.
- org.apache.crunch.lambda - package org.apache.crunch.lambda
-
Alternative Crunch API using Java 8 features to allow construction of pipelines using lambda functions and method
references.
- org.apache.crunch.lambda.fn - package org.apache.crunch.lambda.fn
-
Serializable versions of the functional interfaces that ship with Java 8
- org.apache.crunch.lib - package org.apache.crunch.lib
-
Joining, sorting, aggregating, and other commonly used functionality.
- org.apache.crunch.lib.join - package org.apache.crunch.lib.join
-
Inner and outer joins on collections.
- org.apache.crunch.lib.sort - package org.apache.crunch.lib.sort
-
- org.apache.crunch.test - package org.apache.crunch.test
-
Utilities for testing Crunch-based applications.
- org.apache.crunch.types - package org.apache.crunch.types
-
Common functionality for business object serialization.
- org.apache.crunch.types.avro - package org.apache.crunch.types.avro
-
Business object serialization using Apache Avro.
- org.apache.crunch.types.orc - package org.apache.crunch.types.orc
-
- org.apache.crunch.types.writable - package org.apache.crunch.types.writable
-
Business object serialization using Hadoop's Writables framework.
- org.apache.crunch.util - package org.apache.crunch.util
-
An assorted set of utilities.
- org.apache.hadoop.mapred - package org.apache.hadoop.mapred
-
- outputConf(String, String) - Method in interface org.apache.crunch.Target
-
Adds the given key-value pair to the Configuration instance that is used to write
this Target.
- OutputConfig(FormatBundle<OutputFormat<K, V>>, Class<K>, Class<V>) - Constructor for class org.apache.crunch.io.CrunchOutputs.OutputConfig
-
- OutputConverterFunction<K,V,S> - Class in org.apache.crunch.impl.spark.fn
-
- OutputConverterFunction(Converter<K, V, S, ?>) - Constructor for class org.apache.crunch.impl.spark.fn.OutputConverterFunction
-
- OutputHandler - Interface in org.apache.crunch.io
-
- outputKey(S) - Method in interface org.apache.crunch.types.Converter
-
- outputValue(S) - Method in interface org.apache.crunch.types.Converter
-
- override(ReaderWriterFactory) - Method in class org.apache.crunch.types.avro.AvroMode
-
- overridePathProperties(Configuration) - Method in class org.apache.crunch.test.TemporaryPath
-
Set all keys specified in the constructor to temporary directories.
- Pair<K,V> - Class in org.apache.crunch
-
A convenience class for two-element Tuples.
- Pair(K, V) - Constructor for class org.apache.crunch.Pair
-
- PAIR - Static variable in class org.apache.crunch.types.TupleFactory
-
- pair2tupleFunc() - Static method in class org.apache.crunch.impl.spark.GuavaUtils
-
- pairAggregator(Aggregator<V1>, Aggregator<V2>) - Static method in class org.apache.crunch.fn.Aggregators
-
Apply separate aggregators to each component of a Pair.
- PairFlatMapDoFn<T,K,V> - Class in org.apache.crunch.impl.spark.fn
-
- PairFlatMapDoFn(DoFn<T, Pair<K, V>>, SparkRuntimeContext) - Constructor for class org.apache.crunch.impl.spark.fn.PairFlatMapDoFn
-
- PairIterable(Iterable<S>, Iterable<T>) - Constructor for class org.apache.crunch.util.Tuples.PairIterable
-
- PairIterableMapFn(MapFn<Object, K>, MapFn<Object, V>) - Constructor for class org.apache.crunch.types.PGroupedTableType.PairIterableMapFn
-
- PairMapFn<K,V,S,T> - Class in org.apache.crunch.fn
-
- PairMapFn(MapFn<K, S>, MapFn<V, T>) - Constructor for class org.apache.crunch.fn.PairMapFn
-
- PairMapFunction<K,V,S> - Class in org.apache.crunch.impl.spark.fn
-
- PairMapFunction(MapFn<Pair<K, V>, S>, SparkRuntimeContext) - Constructor for class org.apache.crunch.impl.spark.fn.PairMapFunction
-
- PairMapIterableFunction<K,V,S,T> - Class in org.apache.crunch.impl.spark.fn
-
- PairMapIterableFunction(MapFn<Pair<K, List<V>>, Pair<S, Iterable<T>>>, SparkRuntimeContext) - Constructor for class org.apache.crunch.impl.spark.fn.PairMapIterableFunction
-
- pairs(PType<V1>, PType<V2>) - Static method in class org.apache.crunch.types.avro.Avros
-
- pairs(PType<V1>, PType<V2>) - Method in class org.apache.crunch.types.avro.AvroTypeFamily
-
- pairs(PType<V1>, PType<V2>) - Method in interface org.apache.crunch.types.PTypeFamily
-
- pairs(PType<V1>, PType<V2>) - Static method in class org.apache.crunch.types.writable.Writables
-
- pairs(PType<V1>, PType<V2>) - Method in class org.apache.crunch.types.writable.WritableTypeFamily
-
- PairValueComparator(boolean) - Constructor for class org.apache.crunch.lib.Aggregate.PairValueComparator
-
- parallelDo(DoFn<S, T>, PType<T>) - Method in class org.apache.crunch.impl.dist.collect.PCollectionImpl
-
- parallelDo(String, DoFn<S, T>, PType<T>) - Method in class org.apache.crunch.impl.dist.collect.PCollectionImpl
-
- parallelDo(String, DoFn<S, T>, PType<T>, ParallelDoOptions) - Method in class org.apache.crunch.impl.dist.collect.PCollectionImpl
-
- parallelDo(DoFn<S, Pair<K, V>>, PTableType<K, V>) - Method in class org.apache.crunch.impl.dist.collect.PCollectionImpl
-
- parallelDo(String, DoFn<S, Pair<K, V>>, PTableType<K, V>) - Method in class org.apache.crunch.impl.dist.collect.PCollectionImpl
-
- parallelDo(String, DoFn<S, Pair<K, V>>, PTableType<K, V>, ParallelDoOptions) - Method in class org.apache.crunch.impl.dist.collect.PCollectionImpl
-
- parallelDo(DoFn<S, T>, PType<T>) - Method in interface org.apache.crunch.lambda.LCollection
-
Transform this LCollection using a standard Crunch DoFn.
- parallelDo(DoFn<S, Pair<K, V>>, PTableType<K, V>) - Method in interface org.apache.crunch.lambda.LCollection
-
Transform this LCollection to an LTable using a standard Crunch DoFn.
- parallelDo(LDoFn<S, T>, PType<T>) - Method in interface org.apache.crunch.lambda.LCollection
-
Transform this LCollection using a Lambda-friendly LDoFn.
- parallelDo(LDoFn<S, Pair<K, V>>, PTableType<K, V>) - Method in interface org.apache.crunch.lambda.LCollection
-
Transform this LCollection using a Lambda-friendly LDoFn.
- parallelDo(DoFn<S, T>, PType<T>) - Method in interface org.apache.crunch.PCollection
-
Applies the given doFn to the elements of this PCollection and
returns a new PCollection that is the output of this processing.
- parallelDo(String, DoFn<S, T>, PType<T>) - Method in interface org.apache.crunch.PCollection
-
Applies the given doFn to the elements of this PCollection and
returns a new PCollection that is the output of this processing.
- parallelDo(String, DoFn<S, T>, PType<T>, ParallelDoOptions) - Method in interface org.apache.crunch.PCollection
-
Applies the given doFn to the elements of this PCollection and
returns a new PCollection that is the output of this processing.
- parallelDo(DoFn<S, Pair<K, V>>, PTableType<K, V>) - Method in interface org.apache.crunch.PCollection
-
Similar to the other parallelDo instance, but returns a
PTable instance instead of a PCollection.
- parallelDo(String, DoFn<S, Pair<K, V>>, PTableType<K, V>) - Method in interface org.apache.crunch.PCollection
-
Similar to the other parallelDo instance, but returns a
PTable instance instead of a PCollection.
- parallelDo(String, DoFn<S, Pair<K, V>>, PTableType<K, V>, ParallelDoOptions) - Method in interface org.apache.crunch.PCollection
-
Similar to the other parallelDo instance, but returns a
PTable instance instead of a PCollection.
- ParallelDoOptions - Class in org.apache.crunch
-
Container class that includes optional information about a parallelDo operation
applied to a PCollection.
- ParallelDoOptions.Builder - Class in org.apache.crunch
-
- parallelism(int) - Static method in class org.apache.crunch.CreateOptions
-
- Parse - Class in org.apache.crunch.contrib.text
-
Methods for parsing instances of PCollection<String> into PCollections of strongly-typed
tuples.
- parse(String, PCollection<String>, Extractor<T>) - Static method in class org.apache.crunch.contrib.text.Parse
-
Parses the lines of the input PCollection<String> and returns a PCollection<T> using
the given Extractor<T>.
- parse(String, PCollection<String>, PTypeFamily, Extractor<T>) - Static method in class org.apache.crunch.contrib.text.Parse
-
Parses the lines of the input PCollection<String> and returns a PCollection<T> using
the given Extractor<T> that uses the given PTypeFamily.
- parseTable(String, PCollection<String>, Extractor<Pair<K, V>>) - Static method in class org.apache.crunch.contrib.text.Parse
-
Parses the lines of the input PCollection<String> and returns a PTable<K, V> using
the given Extractor<Pair<K, V>>.
- parseTable(String, PCollection<String>, PTypeFamily, Extractor<Pair<K, V>>) - Static method in class org.apache.crunch.contrib.text.Parse
-
Parses the lines of the input PCollection<String> and returns a PTable<K, V> using
the given Extractor<Pair<K, V>> that uses the given PTypeFamily.
- partition - Variable in class org.apache.crunch.impl.spark.IntByteArray
-
- PartitionedMapOutputFunction<K,V> - Class in org.apache.crunch.impl.spark.fn
-
- PartitionedMapOutputFunction(SerDe<K>, SerDe<V>, PGroupedTableType<K, V>, int, GroupingOptions, SparkRuntimeContext) - Constructor for class org.apache.crunch.impl.spark.fn.PartitionedMapOutputFunction
-
- PARTITIONER_PATH - Static variable in class org.apache.crunch.lib.sort.TotalOrderPartitioner
-
- partitionerClass(Class<? extends Partitioner>) - Method in class org.apache.crunch.GroupingOptions.Builder
-
- PartitionUtils - Class in org.apache.crunch.util
-
Helper functions and settings for determining the number of reducers to use in a pipeline
job created by the Crunch planner.
- PartitionUtils() - Constructor for class org.apache.crunch.util.PartitionUtils
-
- PathTarget - Interface in org.apache.crunch.io
-
A target whose output goes to a given path on a file system.
- PCollection<S> - Interface in org.apache.crunch
-
A representation of an immutable, distributed collection of elements that is
the fundamental target of computations in Crunch.
- PCollectionFactory - Interface in org.apache.crunch.impl.dist.collect
-
- PCollectionImpl<S> - Class in org.apache.crunch.impl.dist.collect
-
- PCollectionImpl(String, DistributedPipeline) - Constructor for class org.apache.crunch.impl.dist.collect.PCollectionImpl
-
- PCollectionImpl(String, DistributedPipeline, ParallelDoOptions) - Constructor for class org.apache.crunch.impl.dist.collect.PCollectionImpl
-
- PCollectionImpl.Visitor - Interface in org.apache.crunch.impl.dist.collect
-
- Person - Class in org.apache.crunch.test
-
- Person() - Constructor for class org.apache.crunch.test.Person
-
Default constructor.
- Person(CharSequence, Integer, List<CharSequence>) - Constructor for class org.apache.crunch.test.Person
-
All-args constructor.
- Person.Builder - Class in org.apache.crunch.test
-
RecordBuilder for Person instances.
- PGroupedTable<K,V> - Interface in org.apache.crunch
-
The Crunch representation of a grouped PTable, which corresponds to the output of
the shuffle phase of a MapReduce job.
- PGroupedTableImpl<K,V> - Class in org.apache.crunch.impl.spark.collect
-
- PGroupedTableType<K,V> - Class in org.apache.crunch.types
-
- PGroupedTableType(PTableType<K, V>) - Constructor for class org.apache.crunch.types.PGroupedTableType
-
- PGroupedTableType.PairIterableMapFn<K,V> - Class in org.apache.crunch.types
-
- Pipeline - Interface in org.apache.crunch
-
Manages the state of a pipeline execution.
- PipelineCallable<Output> - Class in org.apache.crunch
-
A specialization of Callable that executes some sequential logic on the client machine as
part of an overall Crunch pipeline in order to generate zero or more outputs, some of
which may be PCollection instances that are processed by other jobs in the
pipeline.
- PipelineCallable() - Constructor for class org.apache.crunch.PipelineCallable
-
- PipelineCallable.Status - Enum in org.apache.crunch
-
- PipelineExecution - Interface in org.apache.crunch
-
A handle to allow clients to control a Crunch pipeline as it runs.
- PipelineExecution.Status - Enum in org.apache.crunch
-
- PipelineResult - Class in org.apache.crunch
-
Container for the results of a call to run or done on the
Pipeline interface that includes details and statistics about the component
stages of the data pipeline.
- PipelineResult(List<PipelineResult.StageResult>, PipelineExecution.Status) - Constructor for class org.apache.crunch.PipelineResult
-
- PipelineResult.StageResult - Class in org.apache.crunch
-
- plan() - Method in class org.apache.crunch.impl.mr.MRPipeline
-
- PObject<T> - Interface in org.apache.crunch
-
A PObject represents a singleton object value that results from a distributed
computation.
- process(S, Emitter<Pair<String, BloomFilter>>) - Method in class org.apache.crunch.contrib.bloomfilter.BloomFilterFn
-
- process(S, Emitter<T>) - Method in class org.apache.crunch.DoFn
-
- process(T, Emitter<T>) - Method in class org.apache.crunch.FilterFn
-
- process(T, Emitter<Double>) - Method in class org.apache.crunch.fn.SDoubleFlatMapFunction
-
- process(T, Emitter<R>) - Method in class org.apache.crunch.fn.SFlatMapFunction
-
- process(Pair<K, V>, Emitter<R>) - Method in class org.apache.crunch.fn.SFlatMapFunction2
-
- process(T, Emitter<Pair<K, V>>) - Method in class org.apache.crunch.fn.SPairFlatMapFunction
-
- process(LDoFnContext<S, T>) - Method in interface org.apache.crunch.lambda.LDoFn
-
- process(Pair<Integer, Iterable<Pair<K, V>>>, Emitter<Pair<Integer, Pair<K, V>>>) - Method in class org.apache.crunch.lib.Aggregate.TopKCombineFn
-
- process(Pair<K, V>, Emitter<Pair<Integer, Pair<K, V>>>) - Method in class org.apache.crunch.lib.Aggregate.TopKFn
-
- process(Pair<Pair<K, Integer>, Iterable<Pair<U, V>>>, Emitter<Pair<K, Pair<U, V>>>) - Method in class org.apache.crunch.lib.join.JoinFn
-
Split up the input record to make coding a bit more manageable.
- process(S, Emitter<T>) - Method in class org.apache.crunch.MapFn
-
- Protos - Class in org.apache.crunch.types
-
Utility functions for working with protocol buffers in Crunch.
- Protos() - Constructor for class org.apache.crunch.types.Protos
-
- protos(Class<T>, PTypeFamily) - Static method in class org.apache.crunch.types.PTypes
-
Constructs a PType for the given protocol buffer.
- protos(Class<T>, PTypeFamily, SerializableSupplier<ExtensionRegistry>) - Static method in class org.apache.crunch.types.PTypes
-
Constructs a PType for a protocol buffer, using the given SerializableSupplier to provide
an ExtensionRegistry to use in reading the given protobuf.
- PTable<K,V> - Interface in org.apache.crunch
-
A sub-interface of PCollection that represents an immutable,
distributed multi-map of keys and values.
- PTableBase<K,V> - Class in org.apache.crunch.impl.dist.collect
-
- PTableBase(String, DistributedPipeline) - Constructor for class org.apache.crunch.impl.dist.collect.PTableBase
-
- PTableBase(String, DistributedPipeline, ParallelDoOptions) - Constructor for class org.apache.crunch.impl.dist.collect.PTableBase
-
- PTables - Class in org.apache.crunch.lib
-
Methods for performing common operations on PTables.
- PTables() - Constructor for class org.apache.crunch.lib.PTables
-
- PTableType<K,V> - Interface in org.apache.crunch.types
-
An extension of PType specifically for PTable objects.
- ptf() - Method in interface org.apache.crunch.lambda.LCollection
-
Get the PTypeFamily representing how elements of this collection may be serialized.
- ptype(PType<Pair<V1, V2>>) - Static method in class org.apache.crunch.fn.SwapFn
-
- pType() - Method in interface org.apache.crunch.lambda.LCollection
-
Get the PType representing how elements of this collection may be serialized.
- pType() - Method in interface org.apache.crunch.lambda.LTable
-
Get the underlying PTableType used to serialize key/value pairs in this table.
- pType(PType<V>) - Static method in class org.apache.crunch.lib.Quantiles.Result
-
Create a PType for the result type, to be stored as a derived type from Crunch primitives
- PType<T> - Interface in org.apache.crunch.types
-
A PType defines a mapping between a data type that is used in a Crunch pipeline and a
serialization and storage format that is used to read/write data from/to HDFS.
- PTypeFamily - Interface in org.apache.crunch.types
-
An abstract factory for creating PType instances that have the same
serialization/storage backing format.
- PTypes - Class in org.apache.crunch.types
-
Utility functions for creating common types of derived PTypes, e.g., for JSON
data, protocol buffers, and Thrift records.
- PTypes() - Constructor for class org.apache.crunch.types.PTypes
-
- PTypeUtils - Class in org.apache.crunch.types
-
Utilities for converting between PTypes from different PTypeFamily implementations.
- put(int, Object) - Method in class org.apache.crunch.test.Employee
-
- put(int, Object) - Method in class org.apache.crunch.test.Person
-
- read(Source<S>) - Method in class org.apache.crunch.impl.dist.DistributedPipeline
-
- read(Source<S>, String) - Method in class org.apache.crunch.impl.dist.DistributedPipeline
-
- read(TableSource<K, V>) - Method in class org.apache.crunch.impl.dist.DistributedPipeline
-
- read(TableSource<K, V>, String) - Method in class org.apache.crunch.impl.dist.DistributedPipeline
-
- read(Source<T>) - Method in class org.apache.crunch.impl.mem.MemPipeline
-
- read(Source<T>, String) - Method in class org.apache.crunch.impl.mem.MemPipeline
-
- read(TableSource<K, V>) - Method in class org.apache.crunch.impl.mem.MemPipeline
-
- read(TableSource<K, V>, String) - Method in class org.apache.crunch.impl.mem.MemPipeline
-
- read(FileSystem, Path) - Method in interface org.apache.crunch.io.FileReaderFactory
-
- read(Configuration) - Method in interface org.apache.crunch.io.ReadableSource
-
Returns an Iterable that contains the contents of this source.
- read(Source<T>) - Method in interface org.apache.crunch.Pipeline
-
Converts the given Source into a PCollection that is
available to jobs run using this Pipeline instance.
- read(Source<T>, String) - Method in interface org.apache.crunch.Pipeline
-
Converts the given Source into a PCollection that is
available to jobs run using this Pipeline instance.
- read(TableSource<K, V>) - Method in interface org.apache.crunch.Pipeline
-
A version of the read method for TableSource instances that map to PTables.
- read(TableSource<K, V>, String) - Method in interface org.apache.crunch.Pipeline
-
A version of the read method for TableSource instances that map to PTables.
- read(TaskInputOutputContext<?, ?, ?, ?>) - Method in interface org.apache.crunch.ReadableData
-
Read the data referenced by this instance within the given context.
- read(Source<T>) - Method in class org.apache.crunch.util.CrunchTool
-
- read(TableSource<K, V>) - Method in class org.apache.crunch.util.CrunchTool
-
- read(TaskInputOutputContext<?, ?, ?, ?>) - Method in class org.apache.crunch.util.DelegatingReadableData
-
- read(Configuration, Path) - Static method in class org.apache.crunch.util.DistCache
-
- read(TaskInputOutputContext<?, ?, ?, ?>) - Method in class org.apache.crunch.util.UnionReadableData
-
- ReadableData<T> - Interface in org.apache.crunch
-
Represents the contents of a data source that can be read on the cluster from within one
of the tasks running as part of a Crunch pipeline.
- ReadableSource<T> - Interface in org.apache.crunch.io
-
An extension of the Source interface that indicates that a
Source instance may be read as a series of records by the client
code.
- ReadableSourceTarget<T> - Interface in org.apache.crunch.io
-
An interface that indicates that a SourceTarget instance can be read
into the local client.
- ReaderWriterFactory - Interface in org.apache.crunch.types.avro
-
Interface for accessing DatumReader, DatumWriter, and Data classes.
- readFields(DataInput) - Method in class org.apache.crunch.contrib.io.jdbc.IdentifiableName
-
- readFields(ResultSet) - Method in class org.apache.crunch.contrib.io.jdbc.IdentifiableName
-
- readFields(DataInput) - Method in class org.apache.crunch.io.FormatBundle
-
- readFields(DataInput) - Method in class org.apache.crunch.types.writable.TupleWritable
-
- readFields(DataInput) - Method in class org.apache.crunch.types.writable.UnionWritable
-
- readTextFile(String) - Method in class org.apache.crunch.impl.dist.DistributedPipeline
-
- readTextFile(String) - Method in class org.apache.crunch.impl.mem.MemPipeline
-
- readTextFile(String) - Method in interface org.apache.crunch.Pipeline
-
A convenience method for reading a text file.
- readTextFile(String) - Method in class org.apache.crunch.util.CrunchTool
-
- records(Class<T>) - Static method in class org.apache.crunch.types.avro.Avros
-
- records(Class<T>) - Method in class org.apache.crunch.types.avro.AvroTypeFamily
-
- records(Class<T>) - Method in interface org.apache.crunch.types.PTypeFamily
-
- records(Class<T>) - Static method in class org.apache.crunch.types.writable.Writables
-
- records(Class<T>) - Method in class org.apache.crunch.types.writable.WritableTypeFamily
-
- reduce(PGroupedTable<K1, V1>, Class<? extends Reducer<K1, V1, K2, V2>>, Class<K2>, Class<V2>) - Static method in class org.apache.crunch.lib.Mapred
-
- reduce(PGroupedTable<K1, V1>, Class<? extends Reducer<K1, V1, K2, V2>>, Class<K2>, Class<V2>) - Static method in class org.apache.crunch.lib.Mapreduce
-
- ReduceGroupingFunction - Class in org.apache.crunch.impl.spark.fn
-
- ReduceGroupingFunction(GroupingOptions, PGroupedTableType, SparkRuntimeContext) - Constructor for class org.apache.crunch.impl.spark.fn.ReduceGroupingFunction
-
- ReduceInputFunction<K,V> - Class in org.apache.crunch.impl.spark.fn
-
- ReduceInputFunction(SerDe<K>, SerDe<V>) - Constructor for class org.apache.crunch.impl.spark.fn.ReduceInputFunction
-
- reduceValues(SBinaryOperator<V>) - Method in interface org.apache.crunch.lambda.LGroupedTable
-
Reduce the values for each key using an associative binary operator.
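To make the reduceValues description concrete, the following is a minimal single-machine sketch of folding each key's values with an associative binary operator. It uses only the JDK; the class ReduceValuesSketch and its method are hypothetical illustrations and are not part of Crunch, and java.util.function.BinaryOperator stands in for SBinaryOperator.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BinaryOperator;

// Hypothetical illustration only; not the Crunch implementation.
final class ReduceValuesSketch {
  // Folds the values of every key into a single value using the operator.
  // Because the operator is assumed to be associative, the grouping of
  // partial results does not affect the final value for a key.
  static <K, V> Map<K, V> reduceValues(Map<K, List<V>> grouped, BinaryOperator<V> operator) {
    Map<K, V> reduced = new LinkedHashMap<>();
    for (Map.Entry<K, List<V>> entry : grouped.entrySet()) {
      // Keys with no values produce no output entry.
      entry.getValue().stream()
          .reduce(operator)
          .ifPresent(value -> reduced.put(entry.getKey(), value));
    }
    return reduced;
  }
}
```

Associativity matters in a distributed setting because partial results for one key may be combined in any grouping before the final merge.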
- REFLECT - Static variable in class org.apache.crunch.types.avro.AvroMode
-
Default mode to use for reading and writing Reflect types.
- REFLECT_DATA_FACTORY - Static variable in class org.apache.crunch.types.avro.Avros
-
- REFLECT_DATA_FACTORY_CLASS - Static variable in class org.apache.crunch.types.avro.Avros
-
The name of the configuration parameter that tracks which reflection
factory to use.
- ReflectDataFactory - Class in org.apache.crunch.types.avro
-
A Factory class for constructing Avro reflection-related objects.
- ReflectDataFactory() - Constructor for class org.apache.crunch.types.avro.ReflectDataFactory
-
- reflects(Class<T>) - Static method in class org.apache.crunch.types.avro.Avros
-
- reflects(Class<T>, Schema) - Static method in class org.apache.crunch.types.avro.Avros
-
- reflects(Class<T>) - Static method in class org.apache.crunch.types.orc.Orcs
-
Create a PType which uses reflection to serialize/deserialize java POJOs
to/from ORC.
- register(Class<T>, AvroType<T>) - Static method in class org.apache.crunch.types.avro.Avros
-
- register(Class<T>, WritableType<T, ? extends Writable>) - Static method in class org.apache.crunch.types.writable.Writables
-
- registerComparable(Class<? extends WritableComparable>) - Static method in class org.apache.crunch.types.writable.Writables
-
Registers a WritableComparable class so that it can be used for comparing the fields inside of
tuple types (e.g., pairs, trips, tupleN, etc.) for use in sorts and
secondary sorts.
- registerComparable(Class<? extends WritableComparable>, int) - Static method in class org.apache.crunch.types.writable.Writables
-
Registers a WritableComparable class with a given integer code to use for serializing
and deserializing instances of this class that are defined inside of tuple types (e.g., pairs,
trips, tupleN, etc.). Unregistered Writables are always serialized to bytes and
cannot be used in comparisons (e.g., sorts and secondary sorts) according to their underlying types.
- REJECT_ALL() - Static method in class org.apache.crunch.fn.FilterFns
-
Reject everything.
- remove() - Method in class org.apache.crunch.util.DoFnIterator
-
- replicas(int) - Method in class org.apache.crunch.CachingOptions.Builder
-
- replicas() - Method in class org.apache.crunch.CachingOptions
-
Returns the number of replicas of the data that should be maintained in the cache.
- requireSortedKeys() - Method in class org.apache.crunch.GroupingOptions.Builder
-
- requireSortedKeys() - Method in class org.apache.crunch.GroupingOptions
-
- reservoirSample(PCollection<T>, int) - Static method in class org.apache.crunch.lib.Sample
-
Select a fixed number of elements from the given PCollection with each element
equally likely to be included in the sample.
- reservoirSample(PCollection<T>, int, Long) - Static method in class org.apache.crunch.lib.Sample
-
A version of the reservoir sampling algorithm that uses a given seed, primarily for
testing purposes.
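As a rough illustration of the reservoir sampling idea named in the two entries above, here is a single-machine sketch of the classic "Algorithm R" with an explicit seed. This is an assumption-laden stand-in: the class ReservoirSketch is hypothetical, it operates on a plain Iterable rather than a PCollection, and it does not claim to match how Sample.reservoirSample is implemented across a cluster.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Hypothetical illustration only; not the Crunch implementation.
final class ReservoirSketch {
  // Returns up to sampleSize elements, each input element being equally
  // likely to end up in the result. The seed makes runs reproducible.
  static <T> List<T> sample(Iterable<T> input, int sampleSize, long seed) {
    Random random = new Random(seed);
    List<T> reservoir = new ArrayList<>(sampleSize);
    long seen = 0;
    for (T element : input) {
      seen++;
      if (reservoir.size() < sampleSize) {
        // Fill the reservoir with the first sampleSize elements.
        reservoir.add(element);
      } else {
        // Keep the new element with probability sampleSize / seen by
        // drawing a slot in [0, seen) and replacing only if it is in range.
        long slot = (long) (random.nextDouble() * seen);
        if (slot < sampleSize) {
          reservoir.set((int) slot, element);
        }
      }
    }
    return reservoir;
  }
}
```

The input size does not need to be known in advance, which is why this technique suits collections of unknown size.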
- reset() - Method in interface org.apache.crunch.Aggregator
-
Clears the internal state of this Aggregator and prepares it for the
values associated with the next key.
- reset() - Method in class org.apache.crunch.lambda.LAggregator
-
- Result(long, Iterable<Pair<Double, V>>) - Constructor for class org.apache.crunch.lib.Quantiles.Result
-
- results() - Method in interface org.apache.crunch.Aggregator
-
Returns the current aggregated state of this instance.
- results() - Method in class org.apache.crunch.lambda.LAggregator
-
- ReverseAvroComparator<T> - Class in org.apache.crunch.lib.sort
-
- ReverseAvroComparator() - Constructor for class org.apache.crunch.lib.sort.ReverseAvroComparator
-
- ReverseWritableComparator<T> - Class in org.apache.crunch.lib.sort
-
- ReverseWritableComparator() - Constructor for class org.apache.crunch.lib.sort.ReverseWritableComparator
-
- rightJoin(PTable<K, U>, PTable<K, V>) - Static method in class org.apache.crunch.lib.Join
-
Performs a right outer join on the specified PTables.
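For readers unfamiliar with the term, the sketch below shows right outer join semantics on in-memory lists: every right-side record appears in the output, paired with each matching left-side value, or with a missing left value when no match exists. The class RightJoinSketch is hypothetical, Map.Entry stands in for Crunch's Pair, and using null for the missing left value is an assumption of this sketch rather than a statement about Join.rightJoin.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; not the Crunch implementation.
final class RightJoinSketch {
  static <K, U, V> List<Map.Entry<K, Map.Entry<U, V>>> rightJoin(
      List<Map.Entry<K, U>> left, List<Map.Entry<K, V>> right) {
    // Index the left side by key so each right record can find its matches.
    Map<K, List<U>> leftByKey = new HashMap<>();
    for (Map.Entry<K, U> l : left) {
      leftByKey.computeIfAbsent(l.getKey(), k -> new ArrayList<>()).add(l.getValue());
    }
    List<Map.Entry<K, Map.Entry<U, V>>> joined = new ArrayList<>();
    for (Map.Entry<K, V> r : right) {
      List<U> matches = leftByKey.get(r.getKey());
      if (matches == null) {
        // No left match: keep the right record with a null left value (assumption).
        Map.Entry<U, V> pair = new SimpleEntry<>(null, r.getValue());
        joined.add(new SimpleEntry<>(r.getKey(), pair));
      } else {
        for (U u : matches) {
          Map.Entry<U, V> pair = new SimpleEntry<>(u, r.getValue());
          joined.add(new SimpleEntry<>(r.getKey(), pair));
        }
      }
    }
    return joined;
  }
}
```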
- RightOuterJoinFn<K,U,V> - Class in org.apache.crunch.lib.join
-
Used to perform the last step of a right outer join.
- RightOuterJoinFn(PType<K>, PType<U>) - Constructor for class org.apache.crunch.lib.join.RightOuterJoinFn
-
- run(String[]) - Method in class org.apache.crunch.examples.AverageBytesByIP
-
- run(String[]) - Method in class org.apache.crunch.examples.SecondarySortExample
-
- run(String[]) - Method in class org.apache.crunch.examples.SortExample
-
- run(String[]) - Method in class org.apache.crunch.examples.TotalBytesByIP
-
- run(String[]) - Method in class org.apache.crunch.examples.TotalWordCount
-
- run(String[]) - Method in class org.apache.crunch.examples.WordAggregationHBase
-
- run(String[]) - Method in class org.apache.crunch.examples.WordCount
-
- run() - Method in class org.apache.crunch.impl.mem.MemPipeline
-
- run() - Method in class org.apache.crunch.impl.mr.MRPipeline
-
- run() - Method in class org.apache.crunch.impl.spark.SparkPipeline
-
- run() - Method in interface org.apache.crunch.Pipeline
-
Constructs and executes a series of MapReduce jobs in order to write data
to the output targets.
- run() - Method in class org.apache.crunch.util.CrunchTool
-
- runAsync() - Method in class org.apache.crunch.impl.mem.MemPipeline
-
- runAsync() - Method in class org.apache.crunch.impl.mr.MRPipeline
-
- runAsync() - Method in class org.apache.crunch.impl.spark.SparkPipeline
-
- runAsync() - Method in interface org.apache.crunch.Pipeline
-
Constructs and starts a series of MapReduce jobs in order to write data to
the output targets, but returns a ListenableFuture to allow clients to control
job execution.
- runAsync() - Method in class org.apache.crunch.util.CrunchTool
-
- runSingleThreaded() - Method in class org.apache.crunch.PipelineCallable
-
Override this method to indicate to the planner that this instance should not be run at the
same time as any other PipelineCallable instances.
- salary - Variable in class org.apache.crunch.test.Employee
-
Deprecated.
- Sample - Class in org.apache.crunch.lib
-
Methods for performing random sampling in a distributed fashion, either by accepting each
record in a PCollection with an independent probability in order to sample some
fraction of the overall data set, or by using reservoir sampling in order to pull a uniform
or weighted sample of fixed size from a PCollection of an unknown size.
- Sample() - Constructor for class org.apache.crunch.lib.Sample
-
- sample(PCollection<S>, double) - Static method in class org.apache.crunch.lib.Sample
-
Output records from the given PCollection with the given probability.
- sample(PCollection<S>, Long, double) - Static method in class org.apache.crunch.lib.Sample
-
Output records from the given PCollection using a given seed.
- sample(PTable<K, V>, double) - Static method in class org.apache.crunch.lib.Sample
-
A PTable<K, V> analogue of the sample function.
- sample(PTable<K, V>, Long, double) - Static method in class org.apache.crunch.lib.Sample
-
A PTable<K, V> analogue of the sample function, with the seed argument
exposed for testing purposes.
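The probability-based sample entries above describe accepting each record independently with a fixed probability. A minimal single-machine sketch of that idea follows; the class BernoulliSampleSketch is hypothetical, it works on a plain Iterable, and it is not a claim about how Sample.sample behaves internally.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Hypothetical illustration only; not the Crunch implementation.
final class BernoulliSampleSketch {
  // Keeps each element independently with the given probability.
  // A fixed seed makes the selection reproducible, which helps in tests.
  static <T> List<T> sample(Iterable<T> input, long seed, double probability) {
    Random random = new Random(seed);
    List<T> kept = new ArrayList<>();
    for (T element : input) {
      // nextDouble() is uniform in [0.0, 1.0), so this accepts with the
      // requested probability (always for 1.0, never for 0.0).
      if (random.nextDouble() < probability) {
        kept.add(element);
      }
    }
    return kept;
  }
}
```

Unlike reservoir sampling, the output size here is not fixed; it varies around probability times the input size.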
- SAMPLE_UNIQUE_ELEMENTS(int) - Static method in class org.apache.crunch.fn.Aggregators
-
Collect a sample of unique elements from the input, where 'unique' is defined by
the equals method for the input objects.
- SBiConsumer<K,V> - Interface in org.apache.crunch.lambda.fn
-
Serializable version of the Java BiConsumer functional interface.
- SBiFunction<K,V,T> - Interface in org.apache.crunch.lambda.fn
-
Serializable version of the Java BiFunction functional interface.
- SBinaryOperator<T> - Interface in org.apache.crunch.lambda.fn
-
Serializable version of the Java BinaryOperator functional interface.
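The "Serializable version of" entries above can be understood through the following JDK-only sketch: declaring a functional interface that also extends Serializable lets a lambda assigned to it survive Java serialization. The names SerializableLambdaSketch and SerializableBinaryOperator are hypothetical stand-ins for illustration, and the assumption that Crunch's S-prefixed interfaces follow the same pattern is not confirmed by this index.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.function.BinaryOperator;

// Hypothetical illustration only; not part of Crunch.
final class SerializableLambdaSketch {
  // A functional interface whose lambdas can be written to a byte stream.
  interface SerializableBinaryOperator<T> extends BinaryOperator<T>, Serializable {}

  // Serializes the value to bytes and reads it back, as would happen when
  // a function object is shipped to another process.
  static <T> T roundTrip(T value) throws IOException, ClassNotFoundException {
    ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
      out.writeObject(value);
    }
    try (ObjectInputStream in =
        new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
      @SuppressWarnings("unchecked")
      T copy = (T) in.readObject();
      return copy;
    }
  }

  // Applies an addition lambda after it has been serialized and restored.
  static int sumAfterRoundTrip(int a, int b) {
    SerializableBinaryOperator<Integer> add = (x, y) -> x + y;
    try {
      return roundTrip(add).apply(a, b);
    } catch (IOException | ClassNotFoundException e) {
      throw new IllegalStateException(e);
    }
  }
}
```

A lambda assigned to plain java.util.function.BinaryOperator would fail at writeObject with a NotSerializableException, which is the gap such interfaces close.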
- scaleFactor() - Method in class org.apache.crunch.DoFn
-
Returns an estimate of how applying this function to a PCollection
will cause it to change in size.
- scaleFactor() - Method in class org.apache.crunch.FilterFn
-
- scaleFactor() - Method in class org.apache.crunch.fn.CompositeMapFn
-
- scaleFactor() - Method in class org.apache.crunch.fn.ExtractKeyFn
-
- scaleFactor() - Method in class org.apache.crunch.fn.PairMapFn
-
- scaleFactor() - Method in class org.apache.crunch.MapFn
-
- SCHEMA$ - Static variable in class org.apache.crunch.test.Employee
-
- SCHEMA$ - Static variable in class org.apache.crunch.test.Person
-
- SConsumer<T> - Interface in org.apache.crunch.lambda.fn
-
Serializable version of the Java Consumer functional interface.
- SDoubleFlatMapFunction<T> - Class in org.apache.crunch.fn
-
A Crunch-compatible abstract base class for Spark's DoubleFlatMapFunction.
- SDoubleFlatMapFunction() - Constructor for class org.apache.crunch.fn.SDoubleFlatMapFunction
-
- SDoubleFunction<T> - Class in org.apache.crunch.fn
-
A Crunch-compatible abstract base class for Spark's DoubleFunction.
- SDoubleFunction() - Constructor for class org.apache.crunch.fn.SDoubleFunction
-
- second() - Method in class org.apache.crunch.Pair
-
- second() - Method in class org.apache.crunch.Tuple3
-
- second() - Method in class org.apache.crunch.Tuple4
-
- SecondarySort - Class in org.apache.crunch.lib
-
Utilities for performing a secondary sort on a PTable<K, Pair<V1, V2>> collection.
- SecondarySort() - Constructor for class org.apache.crunch.lib.SecondarySort
-
- SecondarySortExample - Class in org.apache.crunch.examples
-
- SecondarySortExample() - Constructor for class org.apache.crunch.examples.SecondarySortExample
-
- sequenceFile(String, Class<T>) - Static method in class org.apache.crunch.io.At
-
Creates a SourceTarget<T> instance from the SequenceFile(s) at the given path name
from the value field of each key-value pair in the SequenceFile(s).
- sequenceFile(Path, Class<T>) - Static method in class org.apache.crunch.io.At
-
Creates a SourceTarget<T> instance from the SequenceFile(s) at the given Path
from the value field of each key-value pair in the SequenceFile(s).
- sequenceFile(String, PType<T>) - Static method in class org.apache.crunch.io.At
-
Creates a SourceTarget<T> instance from the SequenceFile(s) at the given path name
from the value field of each key-value pair in the SequenceFile(s).
- sequenceFile(Path, PType<T>) - Static method in class org.apache.crunch.io.At
-
Creates a SourceTarget<T> instance from the SequenceFile(s) at the given Path
from the value field of each key-value pair in the SequenceFile(s).
- sequenceFile(String, Class<K>, Class<V>) - Static method in class org.apache.crunch.io.At
-
Creates a TableSourceTarget<K, V> instance from the SequenceFile(s) at the given path name
from the key-value pairs in the SequenceFile(s).
- sequenceFile(Path, Class<K>, Class<V>) - Static method in class org.apache.crunch.io.At
-
Creates a TableSourceTarget<K, V> instance from the SequenceFile(s) at the given Path
from the key-value pairs in the SequenceFile(s).
- sequenceFile(String, PType<K>, PType<V>) - Static method in class org.apache.crunch.io.At
-
Creates a TableSourceTarget<K, V> instance from the SequenceFile(s) at the given path name
from the key-value pairs in the SequenceFile(s).
- sequenceFile(Path, PType<K>, PType<V>) - Static method in class org.apache.crunch.io.At
-
Creates a TableSourceTarget<K, V> instance from the SequenceFile(s) at the given Path
from the key-value pairs in the SequenceFile(s).
- sequenceFile(String, Class<T>) - Static method in class org.apache.crunch.io.From
-
Creates a Source<T> instance from the SequenceFile(s) at the given path name
from the value field of each key-value pair in the SequenceFile(s).
- sequenceFile(Path, Class<T>) - Static method in class org.apache.crunch.io.From
-
Creates a Source<T> instance from the SequenceFile(s) at the given Path
from the value field of each key-value pair in the SequenceFile(s).
- sequenceFile(List<Path>, Class<T>) - Static method in class org.apache.crunch.io.From
-
Creates a Source<T> instance from the SequenceFile(s) at the given Paths
from the value field of each key-value pair in the SequenceFile(s).
- sequenceFile(String, PType<T>) - Static method in class org.apache.crunch.io.From
-
Creates a Source<T> instance from the SequenceFile(s) at the given path name
from the value field of each key-value pair in the SequenceFile(s).
- sequenceFile(Path, PType<T>) - Static method in class org.apache.crunch.io.From
-
Creates a Source<T> instance from the SequenceFile(s) at the given Path
from the value field of each key-value pair in the SequenceFile(s).
- sequenceFile(List<Path>, PType<T>) - Static method in class org.apache.crunch.io.From
-
Creates a Source<T> instance from the SequenceFile(s) at the given Paths
from the value field of each key-value pair in the SequenceFile(s).
- sequenceFile(String, Class<K>, Class<V>) - Static method in class org.apache.crunch.io.From
-
Creates a TableSource<K, V> instance for the SequenceFile(s) at the given path name.
- sequenceFile(Path, Class<K>, Class<V>) - Static method in class org.apache.crunch.io.From
-
Creates a TableSource<K, V> instance for the SequenceFile(s) at the given Path.
- sequenceFile(List<Path>, Class<K>, Class<V>) - Static method in class org.apache.crunch.io.From
-
Creates a TableSource<K, V> instance for the SequenceFile(s) at the given Paths.
- sequenceFile(String, PType<K>, PType<V>) - Static method in class org.apache.crunch.io.From
-
Creates a TableSource<K, V> instance for the SequenceFile(s) at the given path name.
- sequenceFile(Path, PType<K>, PType<V>) - Static method in class org.apache.crunch.io.From
-
Creates a TableSource<K, V> instance for the SequenceFile(s) at the given Path.
- sequenceFile(List<Path>, PType<K>, PType<V>) - Static method in class org.apache.crunch.io.From
-
Creates a TableSource<K, V> instance for the SequenceFile(s) at the given Paths.
- sequenceFile(String) - Static method in class org.apache.crunch.io.To
-
Creates a Target at the given path name that writes data to
SequenceFiles.
- sequenceFile(Path) - Static method in class org.apache.crunch.io.To
-
Creates a Target at the given Path that writes data to
SequenceFiles.
- sequentialDo(String, PipelineCallable<Output>) - Method in class org.apache.crunch.impl.dist.collect.PCollectionImpl
-
- sequentialDo(PipelineCallable<Output>) - Method in class org.apache.crunch.impl.dist.DistributedPipeline
-
- sequentialDo(PipelineCallable<Output>) - Method in class org.apache.crunch.impl.mem.MemPipeline
-
- sequentialDo(String, PipelineCallable<Output>) - Method in interface org.apache.crunch.PCollection
-
Adds the materialized data in this PCollection as a dependency to the given
PipelineCallable and registers it with the Pipeline associated with this
instance.
- sequentialDo(PipelineCallable<Output>) - Method in interface org.apache.crunch.Pipeline
-
Executes the given PipelineCallable on the client after the Targets
that the PipelineCallable depends on (if any) have been created by other pipeline
processing steps.
- SequentialFileNamingScheme - Class in org.apache.crunch.io
-
Default FileNamingScheme that uses an incrementing sequence number in
order to generate unique file names.
- SerDe<T> - Interface in org.apache.crunch.impl.spark.serde
-
- SerDeFactory - Class in org.apache.crunch.impl.spark.serde
-
- SerDeFactory() - Constructor for class org.apache.crunch.impl.spark.serde.SerDeFactory
-
- SerializableSupplier<T> - Interface in org.apache.crunch.util
-
An extension of Guava's Supplier interface that indicates that an instance
will also implement Serializable, which makes this object suitable for use
with Crunch's DoFns when we need to construct an instance of a non-serializable
type for use in processing.
- serialize() - Method in class org.apache.crunch.io.FormatBundle
-
- set(String, String) - Method in class org.apache.crunch.io.FormatBundle
-
- Set - Class in org.apache.crunch.lib
-
Utilities for performing set operations (difference, intersection, etc.) on
PCollection instances.
- Set() - Constructor for class org.apache.crunch.lib.Set
-
- set(int, Writable) - Method in class org.apache.crunch.types.writable.TupleWritable
-
- setAge(int) - Method in class org.apache.crunch.test.Person.Builder
-
Sets the value of the 'age' field
- setAge(Integer) - Method in class org.apache.crunch.test.Person
-
Sets the value of the 'age' field.
- setBreakpoint() - Method in class org.apache.crunch.impl.dist.collect.BaseUnionCollection
-
- setBreakpoint() - Method in class org.apache.crunch.impl.dist.collect.BaseUnionTable
-
- setBreakpoint() - Method in class org.apache.crunch.impl.dist.collect.PCollectionImpl
-
- setCombineFn(CombineFn) - Method in class org.apache.crunch.impl.spark.SparkRuntime
-
- setConf(Broadcast<byte[]>) - Method in class org.apache.crunch.impl.spark.SparkRuntimeContext
-
- setConf(Configuration) - Method in class org.apache.crunch.io.FormatBundle
-
- setConf(Configuration) - Method in class org.apache.crunch.lib.join.JoinUtils.AvroPairGroupingComparator
-
- setConf(Configuration) - Method in class org.apache.crunch.lib.sort.ReverseAvroComparator
-
- setConf(Configuration) - Method in class org.apache.crunch.lib.sort.ReverseWritableComparator
-
- setConf(Configuration) - Method in class org.apache.crunch.lib.sort.TotalOrderPartitioner
-
- setConf(Configuration) - Method in class org.apache.crunch.lib.sort.TupleWritableComparator
-
- setConf(Configuration) - Method in class org.apache.crunch.types.writable.TupleWritable.Comparator
-
- setConf(Configuration) - Method in class org.apache.crunch.types.writable.TupleWritable
-
- setConf(Configuration) - Method in class org.apache.crunch.util.CrunchTool
-
- setConfiguration(Configuration) - Method in class org.apache.crunch.DoFn
-
Called during the setup of an initialized PType that
relies on this instance.
- setConfiguration(Configuration) - Method in class org.apache.crunch.fn.CompositeMapFn
-
- setConfiguration(Configuration) - Method in class org.apache.crunch.fn.ExtractKeyFn
-
- setConfiguration(Configuration) - Method in class org.apache.crunch.fn.PairMapFn
-
- setConfiguration(Configuration) - Method in class org.apache.crunch.impl.dist.DistributedPipeline
-
- setConfiguration(Configuration) - Method in class org.apache.crunch.impl.mem.MemPipeline
-
- setConfiguration(Configuration) - Method in interface org.apache.crunch.Pipeline
-
Set the Configuration to use with this pipeline.
- setContext(TaskInputOutputContext<?, ?, ?, ?>) - Method in class org.apache.crunch.DoFn
-
Called during setup to pass the TaskInputOutputContext to this
DoFn instance.
- setContext(TaskInputOutputContext<?, ?, ?, ?>) - Method in class org.apache.crunch.fn.CompositeMapFn
-
- setContext(TaskInputOutputContext<?, ?, ?, ?>) - Method in class org.apache.crunch.fn.ExtractKeyFn
-
- setContext(TaskInputOutputContext<?, ?, ?, ?>) - Method in class org.apache.crunch.fn.PairMapFn
-
- setContext(TaskInputOutputContext<?, ?, ?, ?>) - Method in class org.apache.crunch.types.PGroupedTableType.PairIterableMapFn
-
- setDepartment(CharSequence) - Method in class org.apache.crunch.test.Employee.Builder
-
Sets the value of the 'department' field
- setDepartment(CharSequence) - Method in class org.apache.crunch.test.Employee
-
Sets the value of the 'department' field.
- setMessage(String) - Method in class org.apache.crunch.PipelineCallable
-
Sets a message associated with this callable's execution, especially in case of errors.
- setName(CharSequence) - Method in class org.apache.crunch.test.Employee.Builder
-
Sets the value of the 'name' field
- setName(CharSequence) - Method in class org.apache.crunch.test.Employee
-
Sets the value of the 'name' field.
- setName(CharSequence) - Method in class org.apache.crunch.test.Person.Builder
-
Sets the value of the 'name' field
- setName(CharSequence) - Method in class org.apache.crunch.test.Person
-
Sets the value of the 'name' field.
- setPartitionFile(Configuration, Path) - Static method in class org.apache.crunch.lib.sort.TotalOrderPartitioner
-
- setSalary(int) - Method in class org.apache.crunch.test.Employee.Builder
-
Sets the value of the 'salary' field.
- setSalary(Integer) - Method in class org.apache.crunch.test.Employee
-
Sets the value of the 'salary' field.
- setSiblingnames(List<CharSequence>) - Method in class org.apache.crunch.test.Person.Builder
-
Sets the value of the 'siblingnames' field.
- setSiblingnames(List<CharSequence>) - Method in class org.apache.crunch.test.Person
-
Sets the value of the 'siblingnames' field.
- setSpecificClassLoader(ClassLoader) - Static method in class org.apache.crunch.types.avro.AvroMode
-
Set the ClassLoader
that will be used for loading Avro org.apache.avro.specific.SpecificRecord
and reflection implementation classes.
- setValue(long) - Method in class org.apache.hadoop.mapred.SparkCounter
-
- SFlatMapFunction<T,R> - Class in org.apache.crunch.fn
-
A Crunch-compatible abstract base class for Spark's FlatMapFunction
.
- SFlatMapFunction() - Constructor for class org.apache.crunch.fn.SFlatMapFunction
-
- SFlatMapFunction2<K,V,R> - Class in org.apache.crunch.fn
-
A Crunch-compatible abstract base class for Spark's FlatMapFunction2
.
- SFlatMapFunction2() - Constructor for class org.apache.crunch.fn.SFlatMapFunction2
-
- SFunction<T,R> - Class in org.apache.crunch.fn
-
A Crunch-compatible abstract base class for Spark's Function
.
- SFunction() - Constructor for class org.apache.crunch.fn.SFunction
-
- SFunction<S,T> - Interface in org.apache.crunch.lambda.fn
-
Serializable version of the Java Function functional interface.
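As a minimal sketch of what a serializable function interface makes possible, the following uses only the JDK. `SerFunction` and `SerFunctionDemo` are hypothetical stand-ins, not Crunch types; the sketch assumes the intent is to let lambdas and method references survive Java serialization so they can be shipped to remote tasks.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.function.Function;

// Hypothetical stand-in for a serializable function interface (not the Crunch type).
interface SerFunction<S, T> extends Function<S, T>, Serializable {}

class SerFunctionDemo {
  // A method reference assigned to a Serializable functional interface is itself serializable.
  static final SerFunction<String, Integer> LENGTH = String::length;

  // Writes the function to a byte array and reads it back, roughly what a
  // framework has to do before running user code in another process.
  static <S, T> SerFunction<S, T> roundTrip(SerFunction<S, T> fn) {
    try {
      ByteArrayOutputStream bytes = new ByteArrayOutputStream();
      try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
        out.writeObject(fn);
      }
      try (ObjectInputStream in =
          new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
        @SuppressWarnings("unchecked")
        SerFunction<S, T> copy = (SerFunction<S, T>) in.readObject();
        return copy;
      }
    } catch (IOException | ClassNotFoundException e) {
      throw new IllegalStateException("Function could not be serialized", e);
    }
  }

  public static void main(String[] args) {
    System.out.println(roundTrip(LENGTH).apply("crunch"));
  }
}
```

A plain `java.util.function.Function` lambda would fail at `writeObject` with a `NotSerializableException`, which is the gap an interface of this kind closes.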
- SFunction2<K,V,R> - Class in org.apache.crunch.fn
-
A Crunch-compatible abstract base class for Spark's Function2
.
- SFunction2() - Constructor for class org.apache.crunch.fn.SFunction2
-
- SFunctions - Class in org.apache.crunch.fn
-
Utility methods for wrapping existing Spark Java API Functions for
Crunch compatibility.
- Shard - Class in org.apache.crunch.lib
-
Utilities for controlling how the data in a PCollection
is balanced across reducers
and output files.
- Shard() - Constructor for class org.apache.crunch.lib.Shard
-
- shard(PCollection<T>, int) - Static method in class org.apache.crunch.lib.Shard
-
Creates a PCollection<T>
that has the same contents as its input argument but will
be written to a fixed number of output files.
- ShardedJoinStrategy<K,U,V> - Class in org.apache.crunch.lib.join
-
JoinStrategy that splits the key space up into shards.
- ShardedJoinStrategy(int) - Constructor for class org.apache.crunch.lib.join.ShardedJoinStrategy
-
Instantiate with a constant number of shards to use for all keys.
- ShardedJoinStrategy(int, int) - Constructor for class org.apache.crunch.lib.join.ShardedJoinStrategy
-
Instantiate with a constant number of shards to use for all keys.
- ShardedJoinStrategy(ShardedJoinStrategy.ShardingStrategy<K>) - Constructor for class org.apache.crunch.lib.join.ShardedJoinStrategy
-
Instantiate with a custom sharding strategy.
- ShardedJoinStrategy(ShardedJoinStrategy.ShardingStrategy<K>, int) - Constructor for class org.apache.crunch.lib.join.ShardedJoinStrategy
-
Instantiate with a custom sharding strategy and a specified number of reducers.
- ShardedJoinStrategy.ShardingStrategy<K> - Interface in org.apache.crunch.lib.join
-
Determines over how many shards a key will be split in a sharded join.
- siblingnames - Variable in class org.apache.crunch.test.Person
-
Deprecated.
- SimpleAggregator() - Constructor for class org.apache.crunch.fn.Aggregators.SimpleAggregator
-
- SingleKeyFn(int) - Constructor for class org.apache.crunch.lib.sort.SortFns.SingleKeyFn
-
- SingleUseIterable<T> - Class in org.apache.crunch.impl
-
Wrapper around a Reducer's input Iterable.
- SingleUseIterable(Iterable<T>) - Constructor for class org.apache.crunch.impl.SingleUseIterable
-
Instantiate around an Iterable that may only be used once.
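The single-use contract described here can be sketched with the JDK alone. `OneShotIterable` below is a hypothetical illustration, not the Crunch class; throwing `IllegalStateException` on a second pass is an assumption made for the example.

```java
import java.util.Arrays;
import java.util.Iterator;

// Hypothetical wrapper that allows exactly one pass over the underlying data.
class OneShotIterable<T> implements Iterable<T> {
  private final Iterable<T> delegate;
  private boolean used = false;

  OneShotIterable(Iterable<T> delegate) {
    this.delegate = delegate;
  }

  @Override
  public Iterator<T> iterator() {
    // Fail loudly instead of silently returning an exhausted iterator.
    if (used) {
      throw new IllegalStateException("This Iterable can only be iterated once");
    }
    used = true;
    return delegate.iterator();
  }
}

class OneShotDemo {
  public static void main(String[] args) {
    OneShotIterable<Integer> values = new OneShotIterable<>(Arrays.asList(1, 2, 3));
    for (int v : values) {
      System.out.println(v);
    }
    try {
      values.iterator(); // second pass is rejected
    } catch (IllegalStateException expected) {
      System.out.println("second pass rejected: " + expected.getMessage());
    }
  }
}
```

Guarding the second call matters for reducer input, where the values are streamed and a repeat traversal would otherwise appear to be empty.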
- size() - Method in class org.apache.crunch.Pair
-
- size() - Method in interface org.apache.crunch.Tuple
-
Returns the number of elements in this Tuple.
- size() - Method in class org.apache.crunch.Tuple3
-
- size() - Method in class org.apache.crunch.Tuple4
-
- size() - Method in class org.apache.crunch.TupleN
-
- size() - Method in class org.apache.crunch.types.writable.TupleWritable
-
The number of children in this Tuple.
- skip(String) - Method in class org.apache.crunch.contrib.text.TokenizerFactory.Builder
-
Sets the regular expression that determines which input characters should be
ignored by the Scanner
that is returned by the constructed
TokenizerFactory
.
- smearHash(int) - Static method in class org.apache.crunch.util.HashUtil
-
Applies a supplemental hashing function to an integer, increasing variability in lower-order bits.
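To illustrate what a supplemental hash does, here is a sketch using the bit-spreading function from the JDK 6 `HashMap`. It is an assumption that `smearHash` behaves similarly; `SmearDemo` and `smear` are hypothetical names.

```java
class SmearDemo {
  // Folds higher-order bits down into the lower-order bits so that values
  // differing only in their high bits still land in different buckets.
  // This is the JDK 6 HashMap supplemental hash, shown for illustration only.
  static int smear(int h) {
    h ^= (h >>> 20) ^ (h >>> 12);
    return h ^ (h >>> 7) ^ (h >>> 4);
  }

  public static void main(String[] args) {
    int a = 0x10000;
    int b = 0x20000;
    // Without smearing, both values have identical low four bits (all zero).
    System.out.println((a & 15) + " " + (b & 15));
    // After smearing, the low four bits differ.
    System.out.println((smear(a) & 15) + " " + (smear(b) & 15));
  }
}
```

This matters whenever a partition or bucket index is taken from the low bits of a hash code, for example `hash & (n - 1)`.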
- snappy(T) - Static method in class org.apache.crunch.io.Compress
-
Configure the given output target to be compressed using Snappy.
- Sort - Class in org.apache.crunch.lib
-
Utilities for sorting PCollection
instances.
- Sort() - Constructor for class org.apache.crunch.lib.Sort
-
- sort(PCollection<T>) - Static method in class org.apache.crunch.lib.Sort
-
Sorts the PCollection
using the natural ordering of its elements in ascending order.
- sort(PCollection<T>, Sort.Order) - Static method in class org.apache.crunch.lib.Sort
-
Sorts the PCollection
using the natural order of its elements with the given Order
.
- sort(PCollection<T>, int, Sort.Order) - Static method in class org.apache.crunch.lib.Sort
-
Sorts the PCollection
using the natural ordering of its elements in
the order specified using the given number of reducers.
- sort(PTable<K, V>) - Static method in class org.apache.crunch.lib.Sort
-
Sorts the PTable
using the natural ordering of its keys in ascending order.
- sort(PTable<K, V>, Sort.Order) - Static method in class org.apache.crunch.lib.Sort
-
Sorts the PTable
using the natural ordering of its keys with the given Order
.
- sort(PTable<K, V>, int, Sort.Order) - Static method in class org.apache.crunch.lib.Sort
-
Sorts the PTable
using the natural ordering of its keys in the
order specified with a client-specified number of reducers.
- Sort.ColumnOrder - Class in org.apache.crunch.lib
-
To sort by column 2 ascending then column 1 descending, you would use:
sortPairs(coll, by(2, ASCENDING), by(1, DESCENDING))
Column numbering is 1-based.
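The ordering in that example (column 2 ascending, then column 1 descending, with 1-based columns) can be sketched with plain JDK comparators. `ColumnSortDemo`, `by`, and `sortRows` are hypothetical helpers that mirror the semantics only; they are not the Crunch API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

class ColumnSortDemo {
  // Builds a comparator on a 1-based column index, optionally reversed.
  static Comparator<String[]> by(int column, boolean ascending) {
    Comparator<String[]> c = Comparator.comparing(row -> row[column - 1]);
    return ascending ? c : c.reversed();
  }

  // Sorts by column 2 ascending, breaking ties by column 1 descending.
  static List<String[]> sortRows(List<String[]> rows) {
    List<String[]> sorted = new ArrayList<>(rows);
    sorted.sort(by(2, true).thenComparing(by(1, false)));
    return sorted;
  }

  public static void main(String[] args) {
    List<String[]> rows = Arrays.asList(
        new String[] {"b", "1"},
        new String[] {"a", "1"},
        new String[] {"c", "0"});
    for (String[] row : sortRows(rows)) {
      System.out.println(Arrays.toString(row));
    }
  }
}
```

The column orders are applied left to right, so later orders only decide between rows that tie on every earlier column.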
- Sort.Order - Enum in org.apache.crunch.lib
-
For signaling the order in which a sort should be done.
- sortAndApply(PTable<K, Pair<V1, V2>>, DoFn<Pair<K, Iterable<Pair<V1, V2>>>, T>, PType<T>) - Static method in class org.apache.crunch.lib.SecondarySort
-
Perform a secondary sort on the given PTable
instance and then apply a
DoFn
to the resulting sorted data to yield an output PCollection<T>
.
- sortAndApply(PTable<K, Pair<V1, V2>>, DoFn<Pair<K, Iterable<Pair<V1, V2>>>, T>, PType<T>, int) - Static method in class org.apache.crunch.lib.SecondarySort
-
Perform a secondary sort on the given PTable
instance and then apply a
DoFn
to the resulting sorted data to yield an output PCollection<T>
, using
the given number of reducers.
- sortAndApply(PTable<K, Pair<V1, V2>>, DoFn<Pair<K, Iterable<Pair<V1, V2>>>, Pair<U, V>>, PTableType<U, V>) - Static method in class org.apache.crunch.lib.SecondarySort
-
Perform a secondary sort on the given PTable
instance and then apply a
DoFn
to the resulting sorted data to yield an output PTable<U, V>
.
- sortAndApply(PTable<K, Pair<V1, V2>>, DoFn<Pair<K, Iterable<Pair<V1, V2>>>, Pair<U, V>>, PTableType<U, V>, int) - Static method in class org.apache.crunch.lib.SecondarySort
-
Perform a secondary sort on the given PTable
instance and then apply a
DoFn
to the resulting sorted data to yield an output PTable<U, V>
, using
the given number of reducers.
- sortComparatorClass(Class<? extends RawComparator>) - Method in class org.apache.crunch.GroupingOptions.Builder
-
- SortExample - Class in org.apache.crunch.examples
-
Simple Crunch tool for running sorting examples from the command line.
- SortExample() - Constructor for class org.apache.crunch.examples.SortExample
-
- SortFns - Class in org.apache.crunch.lib.sort
-
A set of DoFn
s that are used by Crunch's Sort
library.
- SortFns() - Constructor for class org.apache.crunch.lib.sort.SortFns
-
- SortFns.AvroGenericFn<V extends Tuple> - Class in org.apache.crunch.lib.sort
-
Pulls a composite set of keys from an Avro GenericRecord
instance.
- SortFns.KeyExtraction<V extends Tuple> - Class in org.apache.crunch.lib.sort
-
Utility class for encapsulating key extraction logic and serialization information about
key extraction.
- SortFns.SingleKeyFn<V extends Tuple,K> - Class in org.apache.crunch.lib.sort
-
Extracts a single indexed key from a Tuple
instance.
- SortFns.TupleKeyFn<V extends Tuple,K extends Tuple> - Class in org.apache.crunch.lib.sort
-
Extracts a composite key from a Tuple
instance.
- sortPairs(PCollection<Pair<U, V>>, Sort.ColumnOrder...) - Static method in class org.apache.crunch.lib.Sort
-
Sorts the PCollection
of Pair
s using the specified column
ordering.
- sortQuads(PCollection<Tuple4<V1, V2, V3, V4>>, Sort.ColumnOrder...) - Static method in class org.apache.crunch.lib.Sort
-
Sorts the PCollection
of Tuple4
s using the specified column
ordering.
- sortTriples(PCollection<Tuple3<V1, V2, V3>>, Sort.ColumnOrder...) - Static method in class org.apache.crunch.lib.Sort
-
Sorts the PCollection
of Tuple3
s using the specified column
ordering.
- sortTuples(PCollection<T>, Sort.ColumnOrder...) - Static method in class org.apache.crunch.lib.Sort
-
Sorts the PCollection
of tuples using the specified column ordering.
- sortTuples(PCollection<T>, int, Sort.ColumnOrder...) - Static method in class org.apache.crunch.lib.Sort
-
Sorts the
PCollection
of
TupleN
s using the specified column
ordering and a client-specified number of reducers.
- Source<T> - Interface in org.apache.crunch
-
A Source
represents an input data set that is an input to one or more
MapReduce jobs.
- sources(Source<?>...) - Method in class org.apache.crunch.ParallelDoOptions.Builder
-
- sources(Collection<Source<?>>) - Method in class org.apache.crunch.ParallelDoOptions.Builder
-
- sourceTarget(SourceTarget<?>) - Method in class org.apache.crunch.GroupingOptions.Builder
-
Deprecated.
- SourceTarget<T> - Interface in org.apache.crunch
-
An interface for classes that implement both the Source
and the
Target
interfaces.
- SourceTargetHelper - Class in org.apache.crunch.io
-
Functions for configuring the inputs/outputs of MapReduce jobs.
- SourceTargetHelper() - Constructor for class org.apache.crunch.io.SourceTargetHelper
-
- sourceTargets(SourceTarget<?>...) - Method in class org.apache.crunch.GroupingOptions.Builder
-
- sourceTargets(Collection<SourceTarget<?>>) - Method in class org.apache.crunch.GroupingOptions.Builder
-
- sourceTargets(SourceTarget<?>...) - Method in class org.apache.crunch.ParallelDoOptions.Builder
-
- sourceTargets(Collection<SourceTarget<?>>) - Method in class org.apache.crunch.ParallelDoOptions.Builder
-
- SPairFlatMapFunction<T,K,V> - Class in org.apache.crunch.fn
-
A Crunch-compatible abstract base class for Spark's PairFlatMapFunction
.
- SPairFlatMapFunction() - Constructor for class org.apache.crunch.fn.SPairFlatMapFunction
-
- SPairFunction<T,K,V> - Class in org.apache.crunch.fn
-
A Crunch-compatible abstract base class for Spark's PairFunction
.
- SPairFunction() - Constructor for class org.apache.crunch.fn.SPairFunction
-
- SparkCollectFactory - Class in org.apache.crunch.impl.spark.collect
-
- SparkCollectFactory() - Constructor for class org.apache.crunch.impl.spark.collect.SparkCollectFactory
-
- SparkCollection - Interface in org.apache.crunch.impl.spark
-
- SparkComparator - Class in org.apache.crunch.impl.spark
-
- SparkComparator(GroupingOptions, PGroupedTableType, SparkRuntimeContext) - Constructor for class org.apache.crunch.impl.spark.SparkComparator
-
- SparkCounter - Class in org.apache.hadoop.mapred
-
- SparkCounter(String, String, Accumulator<Map<String, Map<String, Long>>>) - Constructor for class org.apache.hadoop.mapred.SparkCounter
-
- SparkCounter(String, String, long) - Constructor for class org.apache.hadoop.mapred.SparkCounter
-
- SparkPartitioner - Class in org.apache.crunch.impl.spark
-
- SparkPartitioner(int) - Constructor for class org.apache.crunch.impl.spark.SparkPartitioner
-
- SparkPipeline - Class in org.apache.crunch.impl.spark
-
- SparkPipeline(String, String) - Constructor for class org.apache.crunch.impl.spark.SparkPipeline
-
- SparkPipeline(String, String, Class<?>) - Constructor for class org.apache.crunch.impl.spark.SparkPipeline
-
- SparkPipeline(String, String, Class<?>, Configuration) - Constructor for class org.apache.crunch.impl.spark.SparkPipeline
-
- SparkPipeline(JavaSparkContext, String) - Constructor for class org.apache.crunch.impl.spark.SparkPipeline
-
- SparkPipeline(JavaSparkContext, String, Class<?>, Configuration) - Constructor for class org.apache.crunch.impl.spark.SparkPipeline
-
- SparkRuntime - Class in org.apache.crunch.impl.spark
-
- SparkRuntime(SparkPipeline, JavaSparkContext, Configuration, Map<PCollectionImpl<?>, Set<Target>>, Map<PCollectionImpl<?>, MaterializableIterable>, Map<PCollection<?>, StorageLevel>, Map<PipelineCallable<?>, Set<Target>>) - Constructor for class org.apache.crunch.impl.spark.SparkRuntime
-
- SparkRuntimeContext - Class in org.apache.crunch.impl.spark
-
- SparkRuntimeContext(String, Accumulator<Map<String, Map<String, Long>>>, Broadcast<byte[]>) - Constructor for class org.apache.crunch.impl.spark.SparkRuntimeContext
-
- SPECIFIC - Static variable in class org.apache.crunch.types.avro.AvroMode
-
Default mode to use for reading and writing Specific
types.
- specifics(Class<T>) - Static method in class org.apache.crunch.types.avro.Avros
-
- split(PCollection<Pair<T, U>>) - Static method in class org.apache.crunch.lib.Channels
-
Splits a
PCollection
of any
Pair
of objects into a Pair of
PCollections, to allow for the output of a DoFn to be handled using
separate channels.
- split(PCollection<Pair<T, U>>, PType<T>, PType<U>) - Static method in class org.apache.crunch.lib.Channels
-
Splits a
PCollection
of any
Pair
of objects into a Pair of
PCollections, to allow for the output of a DoFn to be handled using
separate channels.
- SPredicate<T> - Interface in org.apache.crunch.lambda.fn
-
Serializable version of the Java Predicate functional interface.
- SSupplier<T> - Interface in org.apache.crunch.lambda.fn
-
Serializable version of the Java Supplier functional interface.
- StageResult(String, Counters) - Constructor for class org.apache.crunch.PipelineResult.StageResult
-
- StageResult(String, Counters, long, long) - Constructor for class org.apache.crunch.PipelineResult.StageResult
-
- StageResult(String, String, Counters, long, long, long, long) - Constructor for class org.apache.crunch.PipelineResult.StageResult
-
- status - Variable in class org.apache.crunch.PipelineResult
-
- STRING_CONCAT(String, boolean) - Static method in class org.apache.crunch.fn.Aggregators
-
Concatenate strings, with a separator between strings.
- STRING_CONCAT(String, boolean, long, long) - Static method in class org.apache.crunch.fn.Aggregators
-
Concatenate strings, with a separator between strings.
- STRING_TO_UTF8 - Static variable in class org.apache.crunch.types.avro.Avros
-
- strings() - Static method in class org.apache.crunch.types.avro.Avros
-
- strings() - Method in class org.apache.crunch.types.avro.AvroTypeFamily
-
- strings() - Method in interface org.apache.crunch.types.PTypeFamily
-
- strings() - Static method in class org.apache.crunch.types.writable.Writables
-
- strings() - Method in class org.apache.crunch.types.writable.WritableTypeFamily
-
- succeeded() - Method in class org.apache.crunch.PipelineResult
-
- SUM_BIGDECIMALS() - Static method in class org.apache.crunch.fn.Aggregators
-
- SUM_BIGINTS() - Static method in class org.apache.crunch.fn.Aggregators
-
- SUM_DOUBLES() - Static method in class org.apache.crunch.fn.Aggregators
-
Sum up all double
values.
- SUM_FLOATS() - Static method in class org.apache.crunch.fn.Aggregators
-
Sum up all float
values.
- SUM_INTS() - Static method in class org.apache.crunch.fn.Aggregators
-
Sum up all int
values.
- SUM_LONGS() - Static method in class org.apache.crunch.fn.Aggregators
-
Sum up all long
values.
- SwapFn<V1,V2> - Class in org.apache.crunch.fn
-
Swap the elements of a Pair
type.
- SwapFn() - Constructor for class org.apache.crunch.fn.SwapFn
-
- swapKeyValue(PTable<K, V>) - Static method in class org.apache.crunch.lib.PTables
-
Swap the key and value part of a table.
- tableOf(S, T, Object...) - Static method in class org.apache.crunch.impl.mem.MemPipeline
-
- tableOf(Iterable<Pair<S, T>>) - Static method in class org.apache.crunch.impl.mem.MemPipeline
-
- tableOf(PType<K>, PType<V>) - Static method in class org.apache.crunch.types.avro.Avros
-
A table type with an Avro type as key and as value.
- tableOf(PType<K>, PType<V>) - Method in class org.apache.crunch.types.avro.AvroTypeFamily
-
- tableOf(PType<K>, PType<V>) - Method in interface org.apache.crunch.types.PTypeFamily
-
- tableOf(PType<K>, PType<V>) - Static method in class org.apache.crunch.types.writable.Writables
-
- tableOf(PType<K>, PType<V>) - Method in class org.apache.crunch.types.writable.WritableTypeFamily
-
- TableSource<K,V> - Interface in org.apache.crunch
-
The interface for
Source
implementations that return a
PTable
.
- TableSourceTarget<K,V> - Interface in org.apache.crunch
-
An interface for classes that implement both the TableSource
and the
Target
interfaces.
- tableType(PTableType<K, V>) - Static method in class org.apache.crunch.fn.SwapFn
-
- Target - Interface in org.apache.crunch
-
A Target
represents the output destination of a Crunch PCollection
in the context of a Crunch job.
- Target.WriteMode - Enum in org.apache.crunch
-
An enum to represent different options the client may specify
for handling the case where the output path, table, etc. already exists.
- targets(Target...) - Method in class org.apache.crunch.ParallelDoOptions.Builder
-
- targets(Collection<Target>) - Method in class org.apache.crunch.ParallelDoOptions.Builder
-
- tempDir - Variable in class org.apache.crunch.test.CrunchTestSupport
-
- TemporaryPath - Class in org.apache.crunch.test
-
Creates a temporary directory for a test case and destroys it afterwards.
- TemporaryPath(String...) - Constructor for class org.apache.crunch.test.TemporaryPath
-
Construct TemporaryPath
.
- TestCounters - Class in org.apache.crunch.test
-
A utility class used during unit testing to update and read counters.
- TestCounters() - Constructor for class org.apache.crunch.test.TestCounters
-
- textFile(String) - Static method in class org.apache.crunch.io.At
-
Creates a SourceTarget<String>
instance for the text file(s) at the given path name.
- textFile(Path) - Static method in class org.apache.crunch.io.At
-
Creates a SourceTarget<String>
instance for the text file(s) at the given Path
.
- textFile(String, PType<T>) - Static method in class org.apache.crunch.io.At
-
Creates a SourceTarget<T>
instance for the text file(s) at the given path name using
the provided PType<T>
to convert the input text.
- textFile(Path, PType<T>) - Static method in class org.apache.crunch.io.At
-
Creates a SourceTarget<T>
instance for the text file(s) at the given Path
using
the provided PType<T>
to convert the input text.
- textFile(String) - Static method in class org.apache.crunch.io.From
-
Creates a Source<String>
instance for the text file(s) at the given path name.
- textFile(Path) - Static method in class org.apache.crunch.io.From
-
Creates a Source<String>
instance for the text file(s) at the given Path
.
- textFile(List<Path>) - Static method in class org.apache.crunch.io.From
-
Creates a Source<String>
instance for the text file(s) at the given Path
s.
- textFile(String, PType<T>) - Static method in class org.apache.crunch.io.From
-
Creates a Source<T>
instance for the text file(s) at the given path name using
the provided PType<T>
to convert the input text.
- textFile(Path, PType<T>) - Static method in class org.apache.crunch.io.From
-
Creates a Source<T>
instance for the text file(s) at the given Path
using
the provided PType<T>
to convert the input text.
- textFile(List<Path>, PType<T>) - Static method in class org.apache.crunch.io.From
-
Creates a Source<T>
instance for the text file(s) at the given Path
s using
the provided PType<T>
to convert the input text.
- textFile(String) - Static method in class org.apache.crunch.io.To
-
Creates a Target
at the given path name that writes data to
text files.
- textFile(Path) - Static method in class org.apache.crunch.io.To
-
Creates a Target
at the given Path
that writes data to
text files.
- third() - Method in class org.apache.crunch.Tuple3
-
- third() - Method in class org.apache.crunch.Tuple4
-
- thrifts(Class<T>, PTypeFamily) - Static method in class org.apache.crunch.types.PTypes
-
Constructs a PType for a Thrift record.
- To - Class in org.apache.crunch.io
-
Static factory methods for creating common
Target
types.
- To() - Constructor for class org.apache.crunch.io.To
-
- ToByteArrayFunction - Class in org.apache.crunch.impl.spark.collect
-
- ToByteArrayFunction() - Constructor for class org.apache.crunch.impl.spark.collect.ToByteArrayFunction
-
- toBytes(T) - Method in class org.apache.crunch.impl.spark.serde.AvroSerDe
-
- toBytes(T) - Method in interface org.apache.crunch.impl.spark.serde.SerDe
-
- toBytes(Writable) - Method in class org.apache.crunch.impl.spark.serde.WritableSerDe
-
- toCombineFn(Aggregator<V>) - Static method in class org.apache.crunch.fn.Aggregators
-
- toCombineFn(Aggregator<V>, PType<V>) - Static method in class org.apache.crunch.fn.Aggregators
-
Wrap a
CombineFn
adapter around the given aggregator.
- Tokenizer - Class in org.apache.crunch.contrib.text
-
Manages a
Scanner
instance and provides support for returning only a subset
of the fields returned by the underlying
Scanner
.
- Tokenizer(Scanner, Set<Integer>, boolean) - Constructor for class org.apache.crunch.contrib.text.Tokenizer
-
Create a new Tokenizer
instance.
- TokenizerFactory - Class in org.apache.crunch.contrib.text
-
Factory class that constructs
Tokenizer
instances for input strings that use a fixed
set of delimiters, skip patterns, locales, and sets of indices to keep or drop.
- TokenizerFactory.Builder - Class in org.apache.crunch.contrib.text
-
A class for constructing new TokenizerFactory
instances using the Builder pattern.
- top(int) - Method in class org.apache.crunch.impl.dist.collect.PTableBase
-
- top(PTable<K, V>, int, boolean) - Static method in class org.apache.crunch.lib.Aggregate
-
Selects the top N pairs from the given table, with sorting being performed on the values.
- top(int) - Method in interface org.apache.crunch.PTable
-
Returns a PTable made up of the pairs in this PTable with the largest value
field.
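Selecting the pairs with the largest values is commonly done with a bounded min-heap. The sketch below shows that technique over an in-memory `Map` using only the JDK; `TopNDemo` is hypothetical, and no claim is made that Crunch implements `top` this way.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.PriorityQueue;

class TopNDemo {
  // Keeps at most n entries in a min-heap ordered by value, so the smallest
  // retained value is evicted whenever a larger candidate arrives.
  static <K> List<Map.Entry<K, Long>> top(Map<K, Long> table, int n) {
    PriorityQueue<Map.Entry<K, Long>> heap =
        new PriorityQueue<>(Map.Entry.<K, Long>comparingByValue());
    for (Map.Entry<K, Long> entry : table.entrySet()) {
      heap.add(entry);
      if (heap.size() > n) {
        heap.poll();
      }
    }
    // Return the survivors with the largest value first.
    List<Map.Entry<K, Long>> result = new ArrayList<>(heap);
    result.sort(Map.Entry.<K, Long>comparingByValue().reversed());
    return result;
  }

  public static void main(String[] args) {
    Map<String, Long> counts = new LinkedHashMap<>();
    counts.put("a", 3L);
    counts.put("b", 10L);
    counts.put("c", 7L);
    System.out.println(top(counts, 2));
  }
}
```

The heap never holds more than n + 1 entries, so memory stays bounded regardless of how many pairs are scanned.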
- TopKCombineFn(int, boolean, PType<Pair<K, V>>) - Constructor for class org.apache.crunch.lib.Aggregate.TopKCombineFn
-
- TopKFn(int, boolean, PType<Pair<K, V>>) - Constructor for class org.apache.crunch.lib.Aggregate.TopKFn
-
- TopList - Class in org.apache.crunch.lib
-
Tools for creating top lists of items in PTables and PCollections.
- TopList() - Constructor for class org.apache.crunch.lib.TopList
-
- topNYbyX(PTable<X, Y>, int) - Static method in class org.apache.crunch.lib.TopList
-
Create a top-list of elements in the provided PTable, categorised by the key of the input table and using the count
of the value part of the input table.
- toString() - Method in class org.apache.crunch.impl.dist.collect.PCollectionImpl
-
- toString() - Method in class org.apache.crunch.lib.Sort.ColumnOrder
-
- toString() - Method in class org.apache.crunch.Pair
-
- toString() - Method in class org.apache.crunch.Tuple3
-
- toString() - Method in class org.apache.crunch.Tuple4
-
- toString() - Method in class org.apache.crunch.TupleN
-
- toString() - Method in class org.apache.crunch.types.writable.TupleWritable
-
Convert Tuple to String as in the following.
- TotalBytesByIP - Class in org.apache.crunch.examples
-
- TotalBytesByIP() - Constructor for class org.apache.crunch.examples.TotalBytesByIP
-
- TotalOrderPartitioner<K,V> - Class in org.apache.crunch.lib.sort
-
A partition-aware Partitioner
instance that can work with either Avro or Writable-formatted
keys.
- TotalOrderPartitioner() - Constructor for class org.apache.crunch.lib.sort.TotalOrderPartitioner
-
- TotalOrderPartitioner.BinarySearchNode<K> - Class in org.apache.crunch.lib.sort
-
- TotalOrderPartitioner.Node<T> - Interface in org.apache.crunch.lib.sort
-
Interface to the partitioner to locate a key in the partition keyset.
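Locating a key in a sorted set of partition boundaries is typically a binary search. The sketch below shows the idea with `Arrays.binarySearch` over `String` split points; `PartitionLookupDemo` and `findPartition` are hypothetical, and sending a key equal to a split point to the upper partition is an assumption of this example.

```java
import java.util.Arrays;

class PartitionLookupDemo {
  // Given k sorted split points, returns a partition index in [0, k].
  // Keys below the first split point map to partition 0; keys equal to or
  // above a split point map to the partition that starts at that point.
  static int findPartition(String[] splitPoints, String key) {
    int pos = Arrays.binarySearch(splitPoints, key);
    // When the key is absent, binarySearch returns -(insertionPoint) - 1.
    return pos >= 0 ? pos + 1 : -pos - 1;
  }

  public static void main(String[] args) {
    String[] splitPoints = {"g", "p"}; // two split points, three partitions
    for (String key : new String[] {"a", "g", "k", "z"}) {
      System.out.println(key + " -> partition " + findPartition(splitPoints, key));
    }
  }
}
```

Because the split points are sorted, every key in partition i compares below every key in partition i + 1, which is what lets the concatenated reducer outputs form one totally ordered result.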
- TotalWordCount - Class in org.apache.crunch.examples
-
- TotalWordCount() - Constructor for class org.apache.crunch.examples.TotalWordCount
-
- tripAggregator(Aggregator<V1>, Aggregator<V2>, Aggregator<V3>) - Static method in class org.apache.crunch.fn.Aggregators
-
Apply separate aggregators to each component of a
Tuple3
.
- TripIterable(Iterable<A>, Iterable<B>, Iterable<C>) - Constructor for class org.apache.crunch.util.Tuples.TripIterable
-
- triples(PType<V1>, PType<V2>, PType<V3>) - Static method in class org.apache.crunch.types.avro.Avros
-
- triples(PType<V1>, PType<V2>, PType<V3>) - Method in class org.apache.crunch.types.avro.AvroTypeFamily
-
- triples(PType<V1>, PType<V2>, PType<V3>) - Method in interface org.apache.crunch.types.PTypeFamily
-
- triples(PType<V1>, PType<V2>, PType<V3>) - Static method in class org.apache.crunch.types.writable.Writables
-
- triples(PType<V1>, PType<V2>, PType<V3>) - Method in class org.apache.crunch.types.writable.WritableTypeFamily
-
- Tuple - Interface in org.apache.crunch
-
A fixed-size collection of Objects, used in Crunch for representing joins
between PCollection
s.
- Tuple2MapFunction<K,V> - Class in org.apache.crunch.impl.spark.fn
-
- Tuple2MapFunction(MapFn<Pair<K, V>, Pair<K, V>>, SparkRuntimeContext) - Constructor for class org.apache.crunch.impl.spark.fn.Tuple2MapFunction
-
- tuple2PairFunc() - Static method in class org.apache.crunch.impl.spark.GuavaUtils
-
- Tuple3<V1,V2,V3> - Class in org.apache.crunch
-
A convenience class for three-element
Tuple
s.
- Tuple3(V1, V2, V3) - Constructor for class org.apache.crunch.Tuple3
-
- TUPLE3 - Static variable in class org.apache.crunch.types.TupleFactory
-
- Tuple3.Collect<V1,V2,V3> - Class in org.apache.crunch
-
- Tuple4<V1,V2,V3,V4> - Class in org.apache.crunch
-
A convenience class for four-element
Tuple
s.
- Tuple4(V1, V2, V3, V4) - Constructor for class org.apache.crunch.Tuple4
-
- TUPLE4 - Static variable in class org.apache.crunch.types.TupleFactory
-
- Tuple4.Collect<V1,V2,V3,V4> - Class in org.apache.crunch
-
- tupleAggregator(Aggregator<?>...) - Static method in class org.apache.crunch.fn.Aggregators
-
Apply separate aggregators to each component of a
Tuple
.
- TupleDeepCopier<T extends Tuple> - Class in org.apache.crunch.types
-
Performs deep copies (based on underlying PType deep copying) of Tuple-based objects.
- TupleDeepCopier(Class<T>, PType...) - Constructor for class org.apache.crunch.types.TupleDeepCopier
-
- TupleFactory<T extends Tuple> - Class in org.apache.crunch.types
-
- TupleFactory() - Constructor for class org.apache.crunch.types.TupleFactory
-
- TupleKeyFn(int[], TupleFactory) - Constructor for class org.apache.crunch.lib.sort.SortFns.TupleKeyFn
-
- TupleN - Class in org.apache.crunch
-
A
Tuple
instance for an arbitrary number of values.
- TupleN(Object...) - Constructor for class org.apache.crunch.TupleN
-
- TUPLEN - Static variable in class org.apache.crunch.types.TupleFactory
-
- TupleNIterable(Iterable<?>...) - Constructor for class org.apache.crunch.util.Tuples.TupleNIterable
-
- TupleObjectInspector<T extends Tuple> - Class in org.apache.crunch.types.orc
-
An object inspector to define the structure of Crunch Tuples.
- TupleObjectInspector(TupleFactory<T>, PType...) - Constructor for class org.apache.crunch.types.orc.TupleObjectInspector
-
- tuples(PType...) - Static method in class org.apache.crunch.types.avro.Avros
-
- tuples(Class<T>, PType...) - Static method in class org.apache.crunch.types.avro.Avros
-
- tuples(PType<?>...) - Method in class org.apache.crunch.types.avro.AvroTypeFamily
-
- tuples(Class<T>, PType<?>...) - Method in class org.apache.crunch.types.avro.AvroTypeFamily
-
- tuples(PType...) - Static method in class org.apache.crunch.types.orc.Orcs
-
Create a tuple-based PType.
- tuples(PType<?>...) - Method in interface org.apache.crunch.types.PTypeFamily
-
- tuples(Class<T>, PType<?>...) - Method in interface org.apache.crunch.types.PTypeFamily
-
- tuples(PType...) - Static method in class org.apache.crunch.types.writable.Writables
-
- tuples(Class<T>, PType...) - Static method in class org.apache.crunch.types.writable.Writables
-
- tuples(PType<?>...) - Method in class org.apache.crunch.types.writable.WritableTypeFamily
-
- tuples(Class<T>, PType<?>...) - Method in class org.apache.crunch.types.writable.WritableTypeFamily
-
- Tuples - Class in org.apache.crunch.util
-
Utilities for working with subclasses of the Tuple
interface.
- Tuples() - Constructor for class org.apache.crunch.util.Tuples
-
- Tuples.PairIterable<S,T> - Class in org.apache.crunch.util
-
- Tuples.QuadIterable<A,B,C,D> - Class in org.apache.crunch.util
-
- Tuples.TripIterable<A,B,C> - Class in org.apache.crunch.util
-
- Tuples.TupleNIterable - Class in org.apache.crunch.util
-
- TupleWritable - Class in org.apache.crunch.types.writable
-
A serialization format for
Tuple
.
- TupleWritable() - Constructor for class org.apache.crunch.types.writable.TupleWritable
-
Create an empty tuple with no allocated storage for writables.
- TupleWritable(Writable[]) - Constructor for class org.apache.crunch.types.writable.TupleWritable
-
- TupleWritable(Writable[], int[]) - Constructor for class org.apache.crunch.types.writable.TupleWritable
-
Initialize tuple with storage; unknown whether any of them contain
"written" values.
- TupleWritable.Comparator - Class in org.apache.crunch.types.writable
-
- TupleWritableComparator() - Constructor for class org.apache.crunch.lib.join.JoinUtils.TupleWritableComparator
-
- TupleWritableComparator - Class in org.apache.crunch.lib.sort
-
- TupleWritableComparator() - Constructor for class org.apache.crunch.lib.sort.TupleWritableComparator
-
- TupleWritablePartitioner() - Constructor for class org.apache.crunch.lib.join.JoinUtils.TupleWritablePartitioner
-
- typedCollectionOf(PType<T>, T...) - Static method in class org.apache.crunch.impl.mem.MemPipeline
-
- typedCollectionOf(PType<T>, Iterable<T>) - Static method in class org.apache.crunch.impl.mem.MemPipeline
-
- typedTableOf(PTableType<S, T>, S, T, Object...) - Static method in class org.apache.crunch.impl.mem.MemPipeline
-
- typedTableOf(PTableType<S, T>, Iterable<Pair<S, T>>) - Static method in class org.apache.crunch.impl.mem.MemPipeline
-