| Package | Description | 
|---|---|
| org.apache.crunch | Client-facing API and core abstractions. | 
| org.apache.crunch.contrib.bloomfilter | Support for creating Bloom Filters. | 
| org.apache.crunch.fn | Commonly used functions for manipulating collections. | 
| org.apache.crunch.impl.dist.collect | |
| org.apache.crunch.impl.spark | |
| org.apache.crunch.impl.spark.collect | |
| org.apache.crunch.impl.spark.fn | |
| org.apache.crunch.lib | Joining, sorting, aggregating, and other commonly used functionality. | 
| org.apache.crunch.lib.join | Inner and outer joins on collections. | 
| org.apache.crunch.lib.sort | |
| org.apache.crunch.types | Common functionality for business object serialization. | 
| org.apache.crunch.util | An assorted set of utilities. | 
| Modifier and Type | Class and Description | 
|---|---|
| class  | CombineFn<S,T> | 
| class  | FilterFn<T>A  DoFnfor the common case of filtering the members of aPCollectionbased on a boolean condition. | 
| class  | MapFn<S,T>A  DoFnfor the common case of emitting exactly one value for each
 input record. | 
| Modifier and Type | Method and Description | 
|---|---|
| <K,V> PTable<K,V> | PCollection. parallelDo(DoFn<S,Pair<K,V>> doFn,
          PTableType<K,V> type)Similar to the other  parallelDoinstance, but returns aPTableinstance instead of aPCollection. | 
| <T> PCollection<T> | PCollection. parallelDo(DoFn<S,T> doFn,
          PType<T> type)Applies the given doFn to the elements of this  PCollectionand
 returns a newPCollectionthat is the output of this processing. | 
| <K,V> PTable<K,V> | PCollection. parallelDo(String name,
          DoFn<S,Pair<K,V>> doFn,
          PTableType<K,V> type)Similar to the other  parallelDoinstance, but returns aPTableinstance instead of aPCollection. | 
| <K,V> PTable<K,V> | PCollection. parallelDo(String name,
          DoFn<S,Pair<K,V>> doFn,
          PTableType<K,V> type,
          ParallelDoOptions options)Similar to the other  parallelDoinstance, but returns aPTableinstance instead of aPCollection. | 
| <T> PCollection<T> | PCollection. parallelDo(String name,
          DoFn<S,T> doFn,
          PType<T> type)Applies the given doFn to the elements of this  PCollectionand
 returns a newPCollectionthat is the output of this processing. | 
| <T> PCollection<T> | PCollection. parallelDo(String name,
          DoFn<S,T> doFn,
          PType<T> type,
          ParallelDoOptions options)Applies the given doFn to the elements of this  PCollectionand
 returns a newPCollectionthat is the output of this processing. | 
| Modifier and Type | Class and Description | 
|---|---|
| class  | BloomFilterFn<S>The class is responsible for generating keys that are used in a BloomFilter | 
| Modifier and Type | Class and Description | 
|---|---|
| class  | CompositeMapFn<R,S,T> | 
| class  | ExtractKeyFn<K,V>Wrapper function for converting a  MapFninto a key-value pair that is
 used to convert from aPCollection<V>to aPTable<K, V>. | 
| class  | IdentityFn<T> | 
| class  | PairMapFn<K,V,S,T> | 
| class  | SDoubleFlatMapFunction<T>A Crunch-compatible abstract base class for Spark's  DoubleFlatMapFunction. | 
| class  | SDoubleFunction<T>A Crunch-compatible abstract base class for Spark's  DoubleFunction. | 
| class  | SFlatMapFunction<T,R>A Crunch-compatible abstract base class for Spark's  FlatMapFunction. | 
| class  | SFlatMapFunction2<K,V,R>A Crunch-compatible abstract base class for Spark's  FlatMapFunction2. | 
| class  | SFunction<T,R>A Crunch-compatible abstract base class for Spark's  Function. | 
| class  | SFunction2<K,V,R>A Crunch-compatible abstract base class for Spark's  Function2. | 
| class  | SPairFlatMapFunction<T,K,V>A Crunch-compatible abstract base class for Spark's  PairFlatMapFunction. | 
| class  | SPairFunction<T,K,V>A Crunch-compatible abstract base class for Spark's  PairFunction. | 
| class  | SwapFn<V1,V2>Swap the elements of a  Pairtype. | 
| Modifier and Type | Method and Description | 
|---|---|
| <S,T> BaseDoCollection<T> | PCollectionFactory. createDoCollection(String name,
                  PCollectionImpl<S> chainingCollection,
                  DoFn<S,T> fn,
                  PType<T> type,
                  ParallelDoOptions options) | 
| <S,K,V> BaseDoTable<K,V> | PCollectionFactory. createDoTable(String name,
             PCollectionImpl<S> chainingCollection,
             CombineFn<K,V> combineFn,
             DoFn<S,Pair<K,V>> fn,
             PTableType<K,V> type) | 
| <S,K,V> BaseDoTable<K,V> | PCollectionFactory. createDoTable(String name,
             PCollectionImpl<S> chainingCollection,
             DoFn<S,Pair<K,V>> fn,
             PTableType<K,V> type,
             ParallelDoOptions options) | 
| <K,V> PTable<K,V> | PCollectionImpl. parallelDo(DoFn<S,Pair<K,V>> fn,
          PTableType<K,V> type) | 
| <T> PCollection<T> | PCollectionImpl. parallelDo(DoFn<S,T> fn,
          PType<T> type) | 
| <K,V> PTable<K,V> | PCollectionImpl. parallelDo(String name,
          DoFn<S,Pair<K,V>> fn,
          PTableType<K,V> type) | 
| <K,V> PTable<K,V> | PCollectionImpl. parallelDo(String name,
          DoFn<S,Pair<K,V>> fn,
          PTableType<K,V> type,
          ParallelDoOptions options) | 
| <T> PCollection<T> | PCollectionImpl. parallelDo(String name,
          DoFn<S,T> fn,
          PType<T> type) | 
| <T> PCollection<T> | PCollectionImpl. parallelDo(String name,
          DoFn<S,T> fn,
          PType<T> type,
          ParallelDoOptions options) | 
| Modifier and Type | Method and Description | 
|---|---|
| void | SparkRuntimeContext. initialize(DoFn<?,?> fn,
          Integer tid) | 
| Modifier and Type | Method and Description | 
|---|---|
| <S,T> BaseDoCollection<T> | SparkCollectFactory. createDoCollection(String name,
                  PCollectionImpl<S> parent,
                  DoFn<S,T> fn,
                  PType<T> type,
                  ParallelDoOptions options) | 
| <S,K,V> BaseDoTable<K,V> | SparkCollectFactory. createDoTable(String name,
             PCollectionImpl<S> parent,
             CombineFn<K,V> combineFn,
             DoFn<S,Pair<K,V>> fn,
             PTableType<K,V> type) | 
| <S,K,V> BaseDoTable<K,V> | SparkCollectFactory. createDoTable(String name,
             PCollectionImpl<S> parent,
             DoFn<S,Pair<K,V>> fn,
             PTableType<K,V> type,
             ParallelDoOptions options) | 
| Constructor and Description | 
|---|
| CrunchIterable(DoFn<S,T> fn,
              Iterator<S> input) | 
| FlatMapIndexFn(DoFn<S,T> fn,
              boolean convertInput,
              SparkRuntimeContext ctxt) | 
| FlatMapPairDoFn(DoFn<Pair<K,V>,T> fn,
               SparkRuntimeContext ctxt) | 
| PairFlatMapDoFn(DoFn<T,Pair<K,V>> fn,
               SparkRuntimeContext ctxt) | 
| Modifier and Type | Class and Description | 
|---|---|
| static class  | Aggregate.TopKCombineFn<K,V> | 
| static class  | Aggregate.TopKFn<K,V> | 
| Modifier and Type | Method and Description | 
|---|---|
| static <K,V,T> DoFn<Pair<K,Iterable<V>>,T> | DoFns. detach(DoFn<Pair<K,Iterable<V>>,T> reduceFn,
      PType<V> valueType)"Reduce" DoFn wrapper which detaches the values in the iterable, preventing the unexpected behaviour related to
 object reuse often observed when using Avro. | 
| Modifier and Type | Method and Description | 
|---|---|
| static <K,V,T> DoFn<Pair<K,Iterable<V>>,T> | DoFns. detach(DoFn<Pair<K,Iterable<V>>,T> reduceFn,
      PType<V> valueType)"Reduce" DoFn wrapper which detaches the values in the iterable, preventing the unexpected behaviour related to
 object reuse often observed when using Avro. | 
| static <K,V1,V2,U,V>  | SecondarySort. sortAndApply(PTable<K,Pair<V1,V2>> input,
            DoFn<Pair<K,Iterable<Pair<V1,V2>>>,Pair<U,V>> doFn,
            PTableType<U,V> ptype)Perform a secondary sort on the given  PTableinstance and then apply aDoFnto the resulting sorted data to yield an outputPTable<U, V>. | 
| static <K,V1,V2,U,V>  | SecondarySort. sortAndApply(PTable<K,Pair<V1,V2>> input,
            DoFn<Pair<K,Iterable<Pair<V1,V2>>>,Pair<U,V>> doFn,
            PTableType<U,V> ptype,
            int numReducers)Perform a secondary sort on the given  PTableinstance and then apply aDoFnto the resulting sorted data to yield an outputPTable<U, V>, using
 the given number of reducers. | 
| static <K,V1,V2,T>  | SecondarySort. sortAndApply(PTable<K,Pair<V1,V2>> input,
            DoFn<Pair<K,Iterable<Pair<V1,V2>>>,T> doFn,
            PType<T> ptype)Perform a secondary sort on the given  PTableinstance and then apply aDoFnto the resulting sorted data to yield an outputPCollection<T>. | 
| static <K,V1,V2,T>  | SecondarySort. sortAndApply(PTable<K,Pair<V1,V2>> input,
            DoFn<Pair<K,Iterable<Pair<V1,V2>>>,T> doFn,
            PType<T> ptype,
            int numReducers)Perform a secondary sort on the given  PTableinstance and then apply aDoFnto the resulting sorted data to yield an outputPCollection<T>, using
 the given number of reducers. | 
| Modifier and Type | Class and Description | 
|---|---|
| class  | FullOuterJoinFn<K,U,V>Used to perform the last step of an full outer join. | 
| class  | InnerJoinFn<K,U,V>Used to perform the last step of an inner join. | 
| class  | JoinFn<K,U,V>Represents a  DoFnfor performing joins. | 
| class  | LeftOuterJoinFn<K,U,V>Used to perform the last step of an left outer join. | 
| class  | RightOuterJoinFn<K,U,V>Used to perform the last step of an right outer join. | 
| Modifier and Type | Method and Description | 
|---|---|
| static <K,U,V,T> PCollection<T> | OneToManyJoin. oneToManyJoin(PTable<K,U> left,
             PTable<K,V> right,
             DoFn<Pair<U,Iterable<V>>,T> postProcessFn,
             PType<T> ptype)Performs a join on two tables, where the left table only contains a single
 value per key. | 
| static <K,U,V,T> PCollection<T> | OneToManyJoin. oneToManyJoin(PTable<K,U> left,
             PTable<K,V> right,
             DoFn<Pair<U,Iterable<V>>,T> postProcessFn,
             PType<T> ptype,
             int numReducers)Supports a user-specified number of reducers for the one-to-many join. | 
| Modifier and Type | Class and Description | 
|---|---|
| static class  | SortFns.AvroGenericFn<V extends Tuple>Pulls a composite set of keys from an Avro  GenericRecordinstance. | 
| static class  | SortFns.SingleKeyFn<V extends Tuple,K>Extracts a single indexed key from a  Tupleinstance. | 
| static class  | SortFns.TupleKeyFn<V extends Tuple,K extends Tuple>Extracts a composite key from a  Tupleinstance. | 
| Modifier and Type | Class and Description | 
|---|---|
| static class  | PGroupedTableType.PairIterableMapFn<K,V> | 
| Modifier and Type | Method and Description | 
|---|---|
| static <M extends com.google.protobuf.Message>  | Protos. lineParser(String sep,
          Class<M> msgClass) | 
| Constructor and Description | 
|---|
| DelegatingReadableData(ReadableData<S> delegate,
                      DoFn<S,T> fn) | 
| DoFnIterator(Iterator<S> iter,
            DoFn<S,T> fn) | 
Copyright © 2015 The Apache Software Foundation. All Rights Reserved.