Package | Description |
---|---|
org.apache.crunch |
Client-facing API and core abstractions.
|
org.apache.crunch.contrib.bloomfilter |
Support for creating Bloom Filters.
|
org.apache.crunch.fn |
Commonly used functions for manipulating collections.
|
org.apache.crunch.impl.dist.collect | |
org.apache.crunch.impl.spark | |
org.apache.crunch.impl.spark.collect | |
org.apache.crunch.impl.spark.fn | |
org.apache.crunch.lambda |
Alternative Crunch API using Java 8 features to allow construction of pipelines using lambda functions and method
references.
|
org.apache.crunch.lib |
Joining, sorting, aggregating, and other commonly used functionality.
|
org.apache.crunch.lib.join |
Inner and outer joins on collections.
|
org.apache.crunch.lib.sort | |
org.apache.crunch.types |
Common functionality for business object serialization.
|
org.apache.crunch.util |
An assorted set of utilities.
|
Modifier and Type | Class and Description |
---|---|
class |
CombineFn<S,T>
|
class |
FilterFn<T>
A
DoFn for the common case of filtering the members of a
PCollection based on a boolean condition. |
class |
MapFn<S,T>
A
DoFn for the common case of emitting exactly one value for each
input record. |
Modifier and Type | Method and Description |
---|---|
<K,V> PTable<K,V> |
PCollection.parallelDo(DoFn<S,Pair<K,V>> doFn,
PTableType<K,V> type)
Similar to the other
parallelDo instance, but returns a
PTable instance instead of a PCollection . |
<T> PCollection<T> |
PCollection.parallelDo(DoFn<S,T> doFn,
PType<T> type)
Applies the given doFn to the elements of this
PCollection and
returns a new PCollection that is the output of this processing. |
<K,V> PTable<K,V> |
PCollection.parallelDo(String name,
DoFn<S,Pair<K,V>> doFn,
PTableType<K,V> type)
Similar to the other
parallelDo instance, but returns a
PTable instance instead of a PCollection . |
<K,V> PTable<K,V> |
PCollection.parallelDo(String name,
DoFn<S,Pair<K,V>> doFn,
PTableType<K,V> type,
ParallelDoOptions options)
Similar to the other
parallelDo instance, but returns a
PTable instance instead of a PCollection . |
<T> PCollection<T> |
PCollection.parallelDo(String name,
DoFn<S,T> doFn,
PType<T> type)
Applies the given doFn to the elements of this
PCollection and
returns a new PCollection that is the output of this processing. |
<T> PCollection<T> |
PCollection.parallelDo(String name,
DoFn<S,T> doFn,
PType<T> type,
ParallelDoOptions options)
Applies the given doFn to the elements of this
PCollection and
returns a new PCollection that is the output of this processing. |
Modifier and Type | Class and Description |
---|---|
class |
BloomFilterFn<S>
The class is responsible for generating keys that are used in a BloomFilter
|
Modifier and Type | Class and Description |
---|---|
class |
CompositeMapFn<R,S,T> |
class |
ExtractKeyFn<K,V>
Wrapper function for converting a key-from-value extractor
MapFn<V, K> into a
key-value pair extractor that is used to convert from a PCollection<V> to a
PTable<K, V> . |
class |
IdentityFn<T> |
class |
PairMapFn<K,V,S,T> |
class |
SDoubleFlatMapFunction<T>
A Crunch-compatible abstract base class for Spark's
DoubleFlatMapFunction . |
class |
SDoubleFunction<T>
A Crunch-compatible abstract base class for Spark's
DoubleFunction . |
class |
SFlatMapFunction<T,R>
A Crunch-compatible abstract base class for Spark's
FlatMapFunction . |
class |
SFlatMapFunction2<K,V,R>
A Crunch-compatible abstract base class for Spark's
FlatMapFunction2 . |
class |
SFunction<T,R>
A Crunch-compatible abstract base class for Spark's
Function . |
class |
SFunction2<K,V,R>
A Crunch-compatible abstract base class for Spark's
Function2 . |
class |
SPairFlatMapFunction<T,K,V>
A Crunch-compatible abstract base class for Spark's
PairFlatMapFunction . |
class |
SPairFunction<T,K,V>
A Crunch-compatible abstract base class for Spark's
PairFunction . |
class |
SwapFn<V1,V2>
Swap the elements of a
Pair type. |
Modifier and Type | Method and Description |
---|---|
<S,T> BaseDoCollection<T> |
PCollectionFactory.createDoCollection(String name,
PCollectionImpl<S> chainingCollection,
DoFn<S,T> fn,
PType<T> type,
ParallelDoOptions options) |
<S,K,V> BaseDoTable<K,V> |
PCollectionFactory.createDoTable(String name,
PCollectionImpl<S> chainingCollection,
CombineFn<K,V> combineFn,
DoFn<S,Pair<K,V>> fn,
PTableType<K,V> type) |
<S,K,V> BaseDoTable<K,V> |
PCollectionFactory.createDoTable(String name,
PCollectionImpl<S> chainingCollection,
DoFn<S,Pair<K,V>> fn,
PTableType<K,V> type,
ParallelDoOptions options) |
<K,V> PTable<K,V> |
PCollectionImpl.parallelDo(DoFn<S,Pair<K,V>> fn,
PTableType<K,V> type) |
<T> PCollection<T> |
PCollectionImpl.parallelDo(DoFn<S,T> fn,
PType<T> type) |
<K,V> PTable<K,V> |
PCollectionImpl.parallelDo(String name,
DoFn<S,Pair<K,V>> fn,
PTableType<K,V> type) |
<K,V> PTable<K,V> |
PCollectionImpl.parallelDo(String name,
DoFn<S,Pair<K,V>> fn,
PTableType<K,V> type,
ParallelDoOptions options) |
<T> PCollection<T> |
PCollectionImpl.parallelDo(String name,
DoFn<S,T> fn,
PType<T> type) |
<T> PCollection<T> |
PCollectionImpl.parallelDo(String name,
DoFn<S,T> fn,
PType<T> type,
ParallelDoOptions options) |
Modifier and Type | Method and Description |
---|---|
void |
SparkRuntimeContext.initialize(DoFn<?,?> fn,
Integer tid) |
Modifier and Type | Method and Description |
---|---|
<S,T> BaseDoCollection<T> |
SparkCollectFactory.createDoCollection(String name,
PCollectionImpl<S> parent,
DoFn<S,T> fn,
PType<T> type,
ParallelDoOptions options) |
<S,K,V> BaseDoTable<K,V> |
SparkCollectFactory.createDoTable(String name,
PCollectionImpl<S> parent,
CombineFn<K,V> combineFn,
DoFn<S,Pair<K,V>> fn,
PTableType<K,V> type) |
<S,K,V> BaseDoTable<K,V> |
SparkCollectFactory.createDoTable(String name,
PCollectionImpl<S> parent,
DoFn<S,Pair<K,V>> fn,
PTableType<K,V> type,
ParallelDoOptions options) |
Constructor and Description |
---|
CrunchIterable(DoFn<S,T> fn,
Iterator<S> input) |
FlatMapIndexFn(DoFn<S,T> fn,
boolean convertInput,
SparkRuntimeContext ctxt) |
FlatMapPairDoFn(DoFn<Pair<K,V>,T> fn,
SparkRuntimeContext ctxt) |
PairFlatMapDoFn(DoFn<T,Pair<K,V>> fn,
SparkRuntimeContext ctxt) |
Modifier and Type | Method and Description |
---|---|
default <K,V> LTable<K,V> |
LCollection.parallelDo(DoFn<S,Pair<K,V>> fn,
PTableType<K,V> pType)
|
default <T> LCollection<T> |
LCollection.parallelDo(DoFn<S,T> fn,
PType<T> pType)
Transform this LCollection using a standard Crunch
DoFn |
Modifier and Type | Class and Description |
---|---|
static class |
Aggregate.TopKCombineFn<K,V> |
static class |
Aggregate.TopKFn<K,V> |
Modifier and Type | Method and Description |
---|---|
static <K,V,T> DoFn<Pair<K,Iterable<V>>,T> |
DoFns.detach(DoFn<Pair<K,Iterable<V>>,T> reduceFn,
PType<V> valueType)
"Reduce" DoFn wrapper which detaches the values in the iterable, preventing the unexpected behaviour related to
object reuse often observed when using Avro.
|
Modifier and Type | Method and Description |
---|---|
static <K,V,T> DoFn<Pair<K,Iterable<V>>,T> |
DoFns.detach(DoFn<Pair<K,Iterable<V>>,T> reduceFn,
PType<V> valueType)
"Reduce" DoFn wrapper which detaches the values in the iterable, preventing the unexpected behaviour related to
object reuse often observed when using Avro.
|
static <K,V1,V2,U,V> |
SecondarySort.sortAndApply(PTable<K,Pair<V1,V2>> input,
DoFn<Pair<K,Iterable<Pair<V1,V2>>>,Pair<U,V>> doFn,
PTableType<U,V> ptype)
Perform a secondary sort on the given
PTable instance and then apply a
DoFn to the resulting sorted data to yield an output PTable<U, V> . |
static <K,V1,V2,U,V> |
SecondarySort.sortAndApply(PTable<K,Pair<V1,V2>> input,
DoFn<Pair<K,Iterable<Pair<V1,V2>>>,Pair<U,V>> doFn,
PTableType<U,V> ptype,
int numReducers)
Perform a secondary sort on the given
PTable instance and then apply a
DoFn to the resulting sorted data to yield an output PTable<U, V> , using
the given number of reducers. |
static <K,V1,V2,T> |
SecondarySort.sortAndApply(PTable<K,Pair<V1,V2>> input,
DoFn<Pair<K,Iterable<Pair<V1,V2>>>,T> doFn,
PType<T> ptype)
Perform a secondary sort on the given
PTable instance and then apply a
DoFn to the resulting sorted data to yield an output PCollection<T> . |
static <K,V1,V2,T> |
SecondarySort.sortAndApply(PTable<K,Pair<V1,V2>> input,
DoFn<Pair<K,Iterable<Pair<V1,V2>>>,T> doFn,
PType<T> ptype,
int numReducers)
Perform a secondary sort on the given
PTable instance and then apply a
DoFn to the resulting sorted data to yield an output PCollection<T> , using
the given number of reducers. |
Modifier and Type | Class and Description |
---|---|
class |
FullOuterJoinFn<K,U,V>
Used to perform the last step of an full outer join.
|
class |
InnerJoinFn<K,U,V>
Used to perform the last step of an inner join.
|
class |
JoinFn<K,U,V>
Represents a
DoFn for performing joins. |
class |
LeftOuterJoinFn<K,U,V>
Used to perform the last step of an left outer join.
|
class |
RightOuterJoinFn<K,U,V>
Used to perform the last step of an right outer join.
|
Modifier and Type | Method and Description |
---|---|
static <K,U,V,T> PCollection<T> |
OneToManyJoin.oneToManyJoin(PTable<K,U> left,
PTable<K,V> right,
DoFn<Pair<U,Iterable<V>>,T> postProcessFn,
PType<T> ptype)
Performs a join on two tables, where the left table only contains a single
value per key.
|
static <K,U,V,T> PCollection<T> |
OneToManyJoin.oneToManyJoin(PTable<K,U> left,
PTable<K,V> right,
DoFn<Pair<U,Iterable<V>>,T> postProcessFn,
PType<T> ptype,
int numReducers)
Supports a user-specified number of reducers for the one-to-many join.
|
Modifier and Type | Class and Description |
---|---|
static class |
SortFns.AvroGenericFn<V extends Tuple>
Pulls a composite set of keys from an Avro
GenericRecord instance. |
static class |
SortFns.SingleKeyFn<V extends Tuple,K>
Extracts a single indexed key from a
Tuple instance. |
static class |
SortFns.TupleKeyFn<V extends Tuple,K extends Tuple>
Extracts a composite key from a
Tuple instance. |
Modifier and Type | Class and Description |
---|---|
static class |
PGroupedTableType.PairIterableMapFn<K,V> |
Modifier and Type | Method and Description |
---|---|
static <M extends com.google.protobuf.Message> |
Protos.lineParser(String sep,
Class<M> msgClass) |
Constructor and Description |
---|
DelegatingReadableData(ReadableData<S> delegate,
DoFn<S,T> fn) |
DoFnIterator(Iterator<S> iter,
DoFn<S,T> fn) |
Copyright © 2016 The Apache Software Foundation. All rights reserved.