| Package | Description | 
|---|---|
| org.apache.crunch | Client-facing API and core abstractions. | 
| org.apache.crunch.contrib.text | |
| org.apache.crunch.examples | Example applications demonstrating various aspects of Crunch. | 
| org.apache.crunch.impl.dist | |
| org.apache.crunch.impl.dist.collect | |
| org.apache.crunch.impl.mem | In-memory Pipeline implementation for rapid prototyping and testing. | 
| org.apache.crunch.impl.spark | |
| org.apache.crunch.impl.spark.collect | |
| org.apache.crunch.lambda | Alternative Crunch API using Java 8 features to allow construction of pipelines using lambda functions and method
 references. | 
| org.apache.crunch.lib | Joining, sorting, aggregating, and other commonly used functionality. | 
| org.apache.crunch.lib.join | Inner and outer joins on collections. | 
| org.apache.crunch.util | An assorted set of utilities. | 
| Modifier and Type | Method and Description | 
|---|---|
| PTable<K,V> | PTable. bottom(int count)Returns a PTable made up of the pairs in this PTable with the smallest
 value field. | 
| <K> PTable<K,S> | PCollection. by(MapFn<S,K> extractKeyFn,
  PType<K> keyType)Apply the given map function to each element of this instance in order to
 create a  PTable. | 
| <K> PTable<K,S> | PCollection. by(String name,
  MapFn<S,K> extractKeyFn,
  PType<K> keyType)Apply the given map function to each element of this instance in order to
 create a  PTable. | 
| PTable<K,V> | PTable. cache() | 
| PTable<K,V> | PTable. cache(CachingOptions options) | 
| <U> PTable<K,Pair<Collection<V>,Collection<U>>> | PTable. cogroup(PTable<K,U> other)Co-group operation with the given table on common keys. | 
| PTable<K,Collection<V>> | PTable. collectValues()Aggregate all of the values with the same key into a single key-value pair
 in the returned PTable. | 
| PTable<K,V> | PGroupedTable. combineValues(Aggregator<V> aggregator)Combine the values in each group using the given  Aggregator. | 
| PTable<K,V> | PGroupedTable. combineValues(Aggregator<V> combineAggregator,
             Aggregator<V> reduceAggregator)Combine and reduces the values in each group using the given  Aggregatorinstances. | 
| PTable<K,V> | PGroupedTable. combineValues(CombineFn<K,V> combineFn)Combines the values of this grouping using the given  CombineFn. | 
| PTable<K,V> | PGroupedTable. combineValues(CombineFn<K,V> combineFn,
             CombineFn<K,V> reduceFn)Combines and reduces the values of this grouping using the given  CombineFninstances. | 
| PTable<S,Long> | PCollection. count()Returns a  PTableinstance that contains the counts of each unique
 element of this PCollection. | 
| <K,V> PTable<K,V> | Pipeline. create(Iterable<Pair<K,V>> contents,
      PTableType<K,V> ptype)Creates a  PTablecontaining the values found in the givenIterableusing an implementation-specific distribution mechanism. | 
| <K,V> PTable<K,V> | Pipeline. create(Iterable<Pair<K,V>> contents,
      PTableType<K,V> ptype,
      CreateOptions options)Creates a  PTablecontaining the values found in the givenIterableusing an implementation-specific distribution mechanism. | 
| <K,V> PTable<K,V> | Pipeline. emptyPTable(PTableType<K,V> ptype)Creates an empty  PTableof the givenPTable Type. | 
| PTable<K,V> | PTable. filter(FilterFn<Pair<K,V>> filterFn)Apply the given filter function to this instance and return the resulting
  PTable. | 
| PTable<K,V> | PTable. filter(String name,
      FilterFn<Pair<K,V>> filterFn)Apply the given filter function to this instance and return the resulting
  PTable. | 
| <U> PTable<K,Pair<V,U>> | PTable. join(PTable<K,U> other)Perform an inner join on this table and the one passed in as an argument on
 their common keys. | 
| <K2> PTable<K2,V> | PTable. mapKeys(MapFn<K,K2> mapFn,
       PType<K2> ptype)Returns a  PTablethat has the same values as this instance, but
 uses the given function to map the keys. | 
| <K2> PTable<K2,V> | PTable. mapKeys(String name,
       MapFn<K,K2> mapFn,
       PType<K2> ptype)Returns a  PTablethat has the same values as this instance, but
 uses the given function to map the keys. | 
| <U> PTable<K,U> | PGroupedTable. mapValues(MapFn<Iterable<V>,U> mapFn,
         PType<U> ptype)Maps the  Iterable<V>elements of each record to a new type. | 
| <U> PTable<K,U> | PTable. mapValues(MapFn<V,U> mapFn,
         PType<U> ptype)Returns a  PTablethat has the same keys as this instance, but
 uses the given function to map the values. | 
| <U> PTable<K,U> | PGroupedTable. mapValues(String name,
         MapFn<Iterable<V>,U> mapFn,
         PType<U> ptype)Maps the  Iterable<V>elements of each record to a new type. | 
| <U> PTable<K,U> | PTable. mapValues(String name,
         MapFn<V,U> mapFn,
         PType<U> ptype)Returns a  PTablethat has the same keys as this instance, but
 uses the given function to map the values. | 
| <K,V> PTable<K,V> | PCollection. parallelDo(DoFn<S,Pair<K,V>> doFn,
          PTableType<K,V> type)Similar to the other  parallelDoinstance, but returns aPTableinstance instead of aPCollection. | 
| <K,V> PTable<K,V> | PCollection. parallelDo(String name,
          DoFn<S,Pair<K,V>> doFn,
          PTableType<K,V> type)Similar to the other  parallelDoinstance, but returns aPTableinstance instead of aPCollection. | 
| <K,V> PTable<K,V> | PCollection. parallelDo(String name,
          DoFn<S,Pair<K,V>> doFn,
          PTableType<K,V> type,
          ParallelDoOptions options)Similar to the other  parallelDoinstance, but returns aPTableinstance instead of aPCollection. | 
| <K,V> PTable<K,V> | Pipeline. read(TableSource<K,V> tableSource)A version of the read method for  TableSourceinstances that map toPTables. | 
| <K,V> PTable<K,V> | Pipeline. read(TableSource<K,V> tableSource,
    String named)A version of the read method for  TableSourceinstances that map toPTables. | 
| PTable<K,V> | PTable. top(int count)Returns a PTable made up of the pairs in this PTable with the largest value
 field. | 
| PTable<K,V> | PGroupedTable. ungroup()Convert this grouping back into a multimap. | 
| PTable<K,V> | PTable. union(PTable<K,V>... others)Returns a  PTableinstance that acts as the union of thisPTableand the inputPTables. | 
| PTable<K,V> | PTable. union(PTable<K,V> other)Returns a  PTableinstance that acts as the union of thisPTableand the otherPTables. | 
| <K,V> PTable<K,V> | Pipeline. unionTables(List<PTable<K,V>> tables) | 
| PTable<K,V> | PTable. write(Target target)Writes this  PTableto the givenTarget. | 
| PTable<K,V> | PTable. write(Target target,
     Target.WriteMode writeMode)Writes this  PTableto the givenTarget, using the
 givenTarget.WriteModeto handle existing targets. | 
| Modifier and Type | Method and Description | 
|---|---|
| <U> PTable<K,Pair<Collection<V>,Collection<U>>> | PTable. cogroup(PTable<K,U> other)Co-group operation with the given table on common keys. | 
| <U> PTable<K,Pair<V,U>> | PTable. join(PTable<K,U> other)Perform an inner join on this table and the one passed in as an argument on
 their common keys. | 
| PTable<K,V> | PTable. union(PTable<K,V>... others)Returns a  PTableinstance that acts as the union of thisPTableand the inputPTables. | 
| PTable<K,V> | PTable. union(PTable<K,V> other)Returns a  PTableinstance that acts as the union of thisPTableand the otherPTables. | 
| Modifier and Type | Method and Description | 
|---|---|
| <K,V> PTable<K,V> | Pipeline. unionTables(List<PTable<K,V>> tables) | 
| Modifier and Type | Method and Description | 
|---|---|
| static <K,V> PTable<K,V> | Parse. parseTable(String groupName,
          PCollection<String> input,
          Extractor<Pair<K,V>> extractor)Parses the lines of the input  PCollection<String>and returns aPTable<K, V>using
 the givenExtractor<Pair<K, V>>. | 
| static <K,V> PTable<K,V> | Parse. parseTable(String groupName,
          PCollection<String> input,
          PTypeFamily ptf,
          Extractor<Pair<K,V>> extractor)Parses the lines of the input  PCollection<String>and returns aPTable<K, V>using
 the givenExtractor<Pair<K, V>>that uses the givenPTypeFamily. | 
| Modifier and Type | Method and Description | 
|---|---|
| PTable<String,String> | WordAggregationHBase. extractText(PTable<org.apache.hadoop.hbase.io.ImmutableBytesWritable,org.apache.hadoop.hbase.client.Result> words)Extract information from hbase | 
| Modifier and Type | Method and Description | 
|---|---|
| PCollection<org.apache.hadoop.hbase.client.Put> | WordAggregationHBase. createPut(PTable<String,String> extractedText)Create puts in order to insert them in hbase. | 
| PTable<String,String> | WordAggregationHBase. extractText(PTable<org.apache.hadoop.hbase.io.ImmutableBytesWritable,org.apache.hadoop.hbase.client.Result> words)Extract information from hbase | 
| Modifier and Type | Method and Description | 
|---|---|
| <K,V> PTable<K,V> | DistributedPipeline. create(Iterable<Pair<K,V>> contents,
      PTableType<K,V> ptype) | 
| <K,V> PTable<K,V> | DistributedPipeline. create(Iterable<Pair<K,V>> contents,
      PTableType<K,V> ptype,
      CreateOptions options) | 
| <K,V> PTable<K,V> | DistributedPipeline. emptyPTable(PTableType<K,V> ptype) | 
| <K,V> PTable<K,V> | DistributedPipeline. read(TableSource<K,V> source) | 
| <K,V> PTable<K,V> | DistributedPipeline. read(TableSource<K,V> source,
    String named) | 
| <K,V> PTable<K,V> | DistributedPipeline. unionTables(List<PTable<K,V>> tables) | 
| Modifier and Type | Method and Description | 
|---|---|
| <K,V> PTable<K,V> | DistributedPipeline. unionTables(List<PTable<K,V>> tables) | 
| Modifier and Type | Class and Description | 
|---|---|
| class  | BaseDoTable<K,V> | 
| class  | BaseInputTable<K,V> | 
| class  | BaseUnionTable<K,V> | 
| class  | EmptyPTable<K,V> | 
| class  | PTableBase<K,V> | 
| Modifier and Type | Method and Description | 
|---|---|
| PTable<K,V> | PTableBase. bottom(int count) | 
| <K> PTable<K,S> | PCollectionImpl. by(MapFn<S,K> mapFn,
  PType<K> keyType) | 
| <K> PTable<K,S> | PCollectionImpl. by(String name,
  MapFn<S,K> mapFn,
  PType<K> keyType) | 
| PTable<K,V> | PTableBase. cache() | 
| PTable<K,V> | PTableBase. cache(CachingOptions options) | 
| <U> PTable<K,Pair<Collection<V>,Collection<U>>> | PTableBase. cogroup(PTable<K,U> other) | 
| PTable<K,Collection<V>> | PTableBase. collectValues() | 
| PTable<K,V> | BaseGroupedTable. combineValues(Aggregator<V> agg) | 
| PTable<K,V> | BaseGroupedTable. combineValues(Aggregator<V> combineAgg,
             Aggregator<V> reduceAgg) | 
| PTable<K,V> | BaseGroupedTable. combineValues(CombineFn<K,V> combineFn) | 
| PTable<K,V> | BaseGroupedTable. combineValues(CombineFn<K,V> combineFn,
             CombineFn<K,V> reduceFn) | 
| PTable<S,Long> | PCollectionImpl. count() | 
| <K,V> PTable<K,V> | PCollectionFactory. createUnionTable(List<PTableBase<K,V>> internal) | 
| PTable<K,V> | PTableBase. filter(FilterFn<Pair<K,V>> filterFn) | 
| PTable<K,V> | PTableBase. filter(String name,
      FilterFn<Pair<K,V>> filterFn) | 
| <U> PTable<K,Pair<V,U>> | PTableBase. join(PTable<K,U> other) | 
| <K2> PTable<K2,V> | PTableBase. mapKeys(MapFn<K,K2> mapFn,
       PType<K2> ptype) | 
| <K2> PTable<K2,V> | PTableBase. mapKeys(String name,
       MapFn<K,K2> mapFn,
       PType<K2> ptype) | 
| <U> PTable<K,U> | BaseGroupedTable. mapValues(MapFn<Iterable<V>,U> mapFn,
         PType<U> ptype) | 
| <U> PTable<K,U> | PTableBase. mapValues(MapFn<V,U> mapFn,
         PType<U> ptype) | 
| <U> PTable<K,U> | BaseGroupedTable. mapValues(String name,
         MapFn<Iterable<V>,U> mapFn,
         PType<U> ptype) | 
| <U> PTable<K,U> | PTableBase. mapValues(String name,
         MapFn<V,U> mapFn,
         PType<U> ptype) | 
| <K,V> PTable<K,V> | PCollectionImpl. parallelDo(DoFn<S,Pair<K,V>> fn,
          PTableType<K,V> type) | 
| <K,V> PTable<K,V> | PCollectionImpl. parallelDo(String name,
          DoFn<S,Pair<K,V>> fn,
          PTableType<K,V> type) | 
| <K,V> PTable<K,V> | PCollectionImpl. parallelDo(String name,
          DoFn<S,Pair<K,V>> fn,
          PTableType<K,V> type,
          ParallelDoOptions options) | 
| PTable<K,V> | PTableBase. top(int count) | 
| PTable<K,V> | BaseGroupedTable. ungroup() | 
| PTable<K,V> | PTableBase. union(PTable<K,V>... others) | 
| PTable<K,V> | PTableBase. union(PTable<K,V> other) | 
| PTable<K,V> | PTableBase. write(Target target) | 
| PTable<K,V> | PTableBase. write(Target target,
     Target.WriteMode writeMode) | 
| Modifier and Type | Method and Description | 
|---|---|
| <U> PTable<K,Pair<Collection<V>,Collection<U>>> | PTableBase. cogroup(PTable<K,U> other) | 
| <U> PTable<K,Pair<V,U>> | PTableBase. join(PTable<K,U> other) | 
| PTable<K,V> | PTableBase. union(PTable<K,V>... others) | 
| PTable<K,V> | PTableBase. union(PTable<K,V> other) | 
| Modifier and Type | Method and Description | 
|---|---|
| <K,V> PTable<K,V> | MemPipeline. create(Iterable<Pair<K,V>> contents,
      PTableType<K,V> ptype) | 
| <K,V> PTable<K,V> | MemPipeline. create(Iterable<Pair<K,V>> contents,
      PTableType<K,V> ptype,
      CreateOptions options) | 
| <K,V> PTable<K,V> | MemPipeline. emptyPTable(PTableType<K,V> ptype) | 
| <K,V> PTable<K,V> | MemPipeline. read(TableSource<K,V> source) | 
| <K,V> PTable<K,V> | MemPipeline. read(TableSource<K,V> source,
    String named) | 
| static <S,T> PTable<S,T> | MemPipeline. tableOf(Iterable<Pair<S,T>> pairs) | 
| static <S,T> PTable<S,T> | MemPipeline. tableOf(S s,
       T t,
       Object... more) | 
| static <S,T> PTable<S,T> | MemPipeline. typedTableOf(PTableType<S,T> ptype,
            Iterable<Pair<S,T>> pairs) | 
| static <S,T> PTable<S,T> | MemPipeline. typedTableOf(PTableType<S,T> ptype,
            S s,
            T t,
            Object... more) | 
| <K,V> PTable<K,V> | MemPipeline. unionTables(List<PTable<K,V>> tables) | 
| Modifier and Type | Method and Description | 
|---|---|
| <K,V> PTable<K,V> | MemPipeline. unionTables(List<PTable<K,V>> tables) | 
| Modifier and Type | Method and Description | 
|---|---|
| <K,V> PTable<K,V> | SparkPipeline. create(Iterable<Pair<K,V>> contents,
      PTableType<K,V> ptype,
      CreateOptions options) | 
| <K,V> PTable<K,V> | SparkPipeline. emptyPTable(PTableType<K,V> ptype) | 
| Modifier and Type | Class and Description | 
|---|---|
| class  | CreatedTable<K,V>Represents a Spark-based PTable that was created from a Java  Iterableof
 key-value pairs. | 
| class  | DoTable<K,V> | 
| class  | InputTable<K,V> | 
| class  | UnionTable<K,V> | 
| Modifier and Type | Method and Description | 
|---|---|
| <K,V> PTable<K,V> | SparkCollectFactory. createUnionTable(List<PTableBase<K,V>> internal) | 
| Modifier and Type | Method and Description | 
|---|---|
| PTable<K,V> | LTable. underlying()Get the underlying  PTablefor this LCollection | 
| Modifier and Type | Method and Description | 
|---|---|
| <K,V> LTable<K,V> | LCollectionFactory. wrap(PTable<K,V> collection)Wrap a PTable into an LTable | 
| static <K,V> LTable<K,V> | Lambda. wrap(PTable<K,V> collection) | 
| Modifier and Type | Method and Description | 
|---|---|
| static <K,V> PTable<K,V> | PTables. asPTable(PCollection<Pair<K,V>> pcollect)Convert the given  PCollection<Pair<K, V>>to aPTable<K, V>. | 
| static <K,U,V> PTable<K,TupleN> | Cogroup. cogroup(int numReducers,
       PTable<K,?> first,
       PTable<K,?>... rest)Co-groups an arbitrary number of  PTablearguments with a user-specified degree of parallelism
 (a.k.a, number of reducers.) The largest table should come last in the ordering. | 
| static <K,U,V> PTable<K,Pair<Collection<U>,Collection<V>>> | Cogroup. cogroup(int numReducers,
       PTable<K,U> left,
       PTable<K,V> right)Co-groups the two  PTablearguments with a user-specified degree of parallelism (a.k.a, number of
 reducers.) | 
| static <K,V1,V2,V3> | Cogroup. cogroup(int numReducers,
       PTable<K,V1> first,
       PTable<K,V2> second,
       PTable<K,V3> third)Co-groups the three  PTablearguments with a user-specified degree of parallelism (a.k.a, number of
 reducers.) | 
| static <K,V1,V2,V3,V4> | Cogroup. cogroup(int numReducers,
       PTable<K,V1> first,
       PTable<K,V2> second,
       PTable<K,V3> third,
       PTable<K,V4> fourth)Co-groups the three  PTablearguments with a user-specified degree of parallelism (a.k.a, number of
 reducers.) | 
| static <K> PTable<K,TupleN> | Cogroup. cogroup(PTable<K,?> first,
       PTable<K,?>... rest)Co-groups an arbitrary number of  PTablearguments. | 
| static <K,U,V> PTable<K,Pair<Collection<U>,Collection<V>>> | Cogroup. cogroup(PTable<K,U> left,
       PTable<K,V> right)Co-groups the two  PTablearguments. | 
| static <K,V1,V2,V3> | Cogroup. cogroup(PTable<K,V1> first,
       PTable<K,V2> second,
       PTable<K,V3> third)Co-groups the three  PTablearguments. | 
| static <K,V1,V2,V3,V4> | Cogroup. cogroup(PTable<K,V1> first,
       PTable<K,V2> second,
       PTable<K,V3> third,
       PTable<K,V4> fourth)Co-groups the three  PTablearguments. | 
| static <K,V> PTable<K,Collection<V>> | Aggregate. collectValues(PTable<K,V> collect) | 
| static <S> PTable<S,Long> | Aggregate. count(PCollection<S> collect)Returns a  PTablethat contains the unique elements of this collection mapped to a count
 of their occurrences. | 
| static <S> PTable<S,Long> | Aggregate. count(PCollection<S> collect,
     int numPartitions)Returns a  PTablethat contains the unique elements of this collection mapped to a count
 of their occurrences. | 
| static <K1,K2,U,V> | Cartesian. cross(PTable<K1,U> left,
     PTable<K2,V> right)Performs a full cross join on the specified  PTables (using the same
 strategy as Pig's CROSS operator). | 
| static <K1,K2,U,V> | Cartesian. cross(PTable<K1,U> left,
     PTable<K2,V> right,
     int parallelism)Performs a full cross join on the specified  PTables (using the same
 strategy as Pig's CROSS operator). | 
| static <K,V> PTable<K,V> | Distinct. distinct(PTable<K,V> input)A  PTable<K, V>analogue of thedistinctfunction. | 
| static <K,V> PTable<K,V> | Distinct. distinct(PTable<K,V> input,
        int flushEvery)A  PTable<K, V>analogue of thedistinctfunction. | 
| static <K,V extends Number> | Quantiles. distributed(PTable<K,V> table,
           double p1,
           double... pn)Calculate a set of quantiles for each key in a numerically-valued table. | 
| static <K,U,V> PTable<K,Pair<U,V>> | Join. fullJoin(PTable<K,U> left,
        PTable<K,V> right)Performs a full outer join on the specified  PTables. | 
| static <X> PTable<X,Long> | TopList. globalToplist(PCollection<X> input)Create a list of unique items in the input collection with their count, sorted descending by their frequency. | 
| static <K,V extends Comparable> | Quantiles. inMemory(PTable<K,V> table,
        double p1,
        double... pn)Calculate a set of quantiles for each key in a numerically-valued table. | 
| static <K,U,V> PTable<K,Pair<U,V>> | Join. innerJoin(PTable<K,U> left,
         PTable<K,V> right)Performs an inner join on the specified  PTables. | 
| static <K,U,V> PTable<K,Pair<U,V>> | Join. join(PTable<K,U> left,
    PTable<K,V> right)Performs an inner join on the specified  PTables. | 
| static <K,U,V> PTable<K,Pair<U,V>> | Join. leftJoin(PTable<K,U> left,
        PTable<K,V> right)Performs a left outer join on the specified  PTables. | 
| static <K1,V1,K2 extends org.apache.hadoop.io.Writable,V2 extends org.apache.hadoop.io.Writable> | Mapreduce. map(PTable<K1,V1> input,
   Class<? extends org.apache.hadoop.mapreduce.Mapper<K1,V1,K2,V2>> mapperClass,
   Class<K2> keyClass,
   Class<V2> valueClass) | 
| static <K1,V1,K2 extends org.apache.hadoop.io.Writable,V2 extends org.apache.hadoop.io.Writable> | Mapred. map(PTable<K1,V1> input,
   Class<? extends org.apache.hadoop.mapred.Mapper<K1,V1,K2,V2>> mapperClass,
   Class<K2> keyClass,
   Class<V2> valueClass) | 
| static <K1,K2,V> PTable<K2,V> | PTables. mapKeys(PTable<K1,V> ptable,
       MapFn<K1,K2> mapFn,
       PType<K2> ptype)Maps a  PTable<K1, V>to aPTable<K2, V>using the givenMapFn<K1, K2>on
 the keys of thePTable. | 
| static <K1,K2,V> PTable<K2,V> | PTables. mapKeys(String name,
       PTable<K1,V> ptable,
       MapFn<K1,K2> mapFn,
       PType<K2> ptype)Maps a  PTable<K1, V>to aPTable<K2, V>using the givenMapFn<K1, K2>on
 the keys of thePTable. | 
| static <K,U,V> PTable<K,V> | PTables. mapValues(PGroupedTable<K,U> ptable,
         MapFn<Iterable<U>,V> mapFn,
         PType<V> ptype)An analogue of the  mapValuesfunction forPGroupedTable<K, U>collections. | 
| static <K,U,V> PTable<K,V> | PTables. mapValues(PTable<K,U> ptable,
         MapFn<U,V> mapFn,
         PType<V> ptype)Maps a  PTable<K, U>to aPTable<K, V>using the givenMapFn<U, V>on
 the values of thePTable. | 
| static <K,U,V> PTable<K,V> | PTables. mapValues(String name,
         PGroupedTable<K,U> ptable,
         MapFn<Iterable<U>,V> mapFn,
         PType<V> ptype)An analogue of the  mapValuesfunction forPGroupedTable<K, U>collections. | 
| static <K,U,V> PTable<K,V> | PTables. mapValues(String name,
         PTable<K,U> ptable,
         MapFn<U,V> mapFn,
         PType<V> ptype)Maps a  PTable<K, U>to aPTable<K, V>using the givenMapFn<U, V>on
 the values of thePTable. | 
| static <K,V extends Number> | Average. meanValue(PTable<K,V> table)Calculate the mean average value by key for a table with numeric values. | 
| static <K> PTable<K,Long> | TopList. negateCounts(PTable<K,Long> table)When creating toplists, it is often required to sort by count descending. | 
| static <K1,V1,K2 extends org.apache.hadoop.io.Writable,V2 extends org.apache.hadoop.io.Writable> | Mapreduce. reduce(PGroupedTable<K1,V1> input,
      Class<? extends org.apache.hadoop.mapreduce.Reducer<K1,V1,K2,V2>> reducerClass,
      Class<K2> keyClass,
      Class<V2> valueClass) | 
| static <K1,V1,K2 extends org.apache.hadoop.io.Writable,V2 extends org.apache.hadoop.io.Writable> | Mapred. reduce(PGroupedTable<K1,V1> input,
      Class<? extends org.apache.hadoop.mapred.Reducer<K1,V1,K2,V2>> reducerClass,
      Class<K2> keyClass,
      Class<V2> valueClass) | 
| static <K,U,V> PTable<K,Pair<U,V>> | Join. rightJoin(PTable<K,U> left,
         PTable<K,V> right)Performs a right outer join on the specified  PTables. | 
| static <K,V> PTable<K,V> | Sample. sample(PTable<K,V> input,
      double probability)A  PTable<K, V>analogue of thesamplefunction. | 
| static <K,V> PTable<K,V> | Sample. sample(PTable<K,V> input,
      Long seed,
      double probability)A  PTable<K, V>analogue of thesamplefunction, with the seed argument
 exposed for testing purposes. | 
| static <K,V> PTable<K,V> | Sort. sort(PTable<K,V> table)Sorts the  PTableusing the natural ordering of its keys in ascending order. | 
| static <K,V> PTable<K,V> | Sort. sort(PTable<K,V> table,
    int numReducers,
    Sort.Order key)Sorts the  PTableusing the natural ordering of its keys in the
 order specified with a client-specified number of reducers. | 
| static <K,V> PTable<K,V> | Sort. sort(PTable<K,V> table,
    Sort.Order key)Sorts the  PTableusing the natural ordering of its keys with the givenOrder. | 
| static <K,V1,V2,U,V> | SecondarySort. sortAndApply(PTable<K,Pair<V1,V2>> input,
            DoFn<Pair<K,Iterable<Pair<V1,V2>>>,Pair<U,V>> doFn,
            PTableType<U,V> ptype)Perform a secondary sort on the given  PTableinstance and then apply aDoFnto the resulting sorted data to yield an outputPTable<U, V>. | 
| static <K,V1,V2,U,V> | SecondarySort. sortAndApply(PTable<K,Pair<V1,V2>> input,
            DoFn<Pair<K,Iterable<Pair<V1,V2>>>,Pair<U,V>> doFn,
            PTableType<U,V> ptype,
            int numReducers)Perform a secondary sort on the given  PTableinstance and then apply aDoFnto the resulting sorted data to yield an outputPTable<U, V>, using
 the given number of reducers. | 
| static <K,V> PTable<V,K> | PTables. swapKeyValue(PTable<K,V> table)Swap the key and value part of a table. | 
| static <K,V> PTable<K,V> | Aggregate. top(PTable<K,V> ptable,
   int limit,
   boolean maximize)Selects the top N pairs from the given table, with sorting being performed on the values (i.e. | 
| static <X,Y> PTable<X,Collection<Pair<Long,Y>>> | TopList. topNYbyX(PTable<X,Y> input,
        int n)Create a top-list of elements in the provided PTable, categorised by the key of the input table and using the count
 of the value part of the input table. | 
| Modifier and Type | Method and Description | 
|---|---|
| static <K,U,V> PTable<K,TupleN> | Cogroup. cogroup(int numReducers,
       PTable<K,?> first,
       PTable<K,?>... rest)Co-groups an arbitrary number of  PTablearguments with a user-specified degree of parallelism
 (a.k.a, number of reducers.) The largest table should come last in the ordering. | 
| static <K,U,V> PTable<K,TupleN> | Cogroup. cogroup(int numReducers,
       PTable<K,?> first,
       PTable<K,?>... rest)Co-groups an arbitrary number of  PTablearguments with a user-specified degree of parallelism
 (a.k.a, number of reducers.) The largest table should come last in the ordering. | 
| static <K,U,V> PTable<K,Pair<Collection<U>,Collection<V>>> | Cogroup. cogroup(int numReducers,
       PTable<K,U> left,
       PTable<K,V> right)Co-groups the two  PTablearguments with a user-specified degree of parallelism (a.k.a, number of
 reducers.) | 
| static <K,U,V> PTable<K,Pair<Collection<U>,Collection<V>>> | Cogroup. cogroup(int numReducers,
       PTable<K,U> left,
       PTable<K,V> right)Co-groups the two  PTablearguments with a user-specified degree of parallelism (a.k.a, number of
 reducers.) | 
| static <K,V1,V2,V3> | Cogroup. cogroup(int numReducers,
       PTable<K,V1> first,
       PTable<K,V2> second,
       PTable<K,V3> third)Co-groups the three  PTablearguments with a user-specified degree of parallelism (a.k.a, number of
 reducers.) | 
| static <K,V1,V2,V3> | Cogroup. cogroup(int numReducers,
       PTable<K,V1> first,
       PTable<K,V2> second,
       PTable<K,V3> third)Co-groups the three  PTablearguments with a user-specified degree of parallelism (a.k.a, number of
 reducers.) | 
| static <K,V1,V2,V3> | Cogroup. cogroup(int numReducers,
       PTable<K,V1> first,
       PTable<K,V2> second,
       PTable<K,V3> third)Co-groups the three  PTablearguments with a user-specified degree of parallelism (a.k.a, number of
 reducers.) | 
| static <K,V1,V2,V3,V4> | Cogroup. cogroup(int numReducers,
       PTable<K,V1> first,
       PTable<K,V2> second,
       PTable<K,V3> third,
       PTable<K,V4> fourth)Co-groups the three  PTablearguments with a user-specified degree of parallelism (a.k.a, number of
 reducers.) | 
| static <K,V1,V2,V3,V4> | Cogroup. cogroup(int numReducers,
       PTable<K,V1> first,
       PTable<K,V2> second,
       PTable<K,V3> third,
       PTable<K,V4> fourth)Co-groups the three  PTablearguments with a user-specified degree of parallelism (a.k.a, number of
 reducers.) | 
| static <K,V1,V2,V3,V4> | Cogroup. cogroup(int numReducers,
       PTable<K,V1> first,
       PTable<K,V2> second,
       PTable<K,V3> third,
       PTable<K,V4> fourth)Co-groups the three  PTablearguments with a user-specified degree of parallelism (a.k.a, number of
 reducers.) | 
| static <K,V1,V2,V3,V4> | Cogroup. cogroup(int numReducers,
       PTable<K,V1> first,
       PTable<K,V2> second,
       PTable<K,V3> third,
       PTable<K,V4> fourth)Co-groups the three  PTablearguments with a user-specified degree of parallelism (a.k.a, number of
 reducers.) | 
| static <K> PTable<K,TupleN> | Cogroup. cogroup(PTable<K,?> first,
       PTable<K,?>... rest)Co-groups an arbitrary number of  PTablearguments. | 
| static <K> PTable<K,TupleN> | Cogroup. cogroup(PTable<K,?> first,
       PTable<K,?>... rest)Co-groups an arbitrary number of  PTablearguments. | 
| static <K,U,V> PTable<K,Pair<Collection<U>,Collection<V>>> | Cogroup. cogroup(PTable<K,U> left,
       PTable<K,V> right)Co-groups the two  PTablearguments. | 
| static <K,U,V> PTable<K,Pair<Collection<U>,Collection<V>>> | Cogroup. cogroup(PTable<K,U> left,
       PTable<K,V> right)Co-groups the two  PTablearguments. | 
| static <K,V1,V2,V3> | Cogroup. cogroup(PTable<K,V1> first,
       PTable<K,V2> second,
       PTable<K,V3> third)Co-groups the three  PTablearguments. | 
| static <K,V1,V2,V3> | Cogroup. cogroup(PTable<K,V1> first,
       PTable<K,V2> second,
       PTable<K,V3> third)Co-groups the three  PTablearguments. | 
| static <K,V1,V2,V3> | Cogroup. cogroup(PTable<K,V1> first,
       PTable<K,V2> second,
       PTable<K,V3> third)Co-groups the three  PTablearguments. | 
| static <K,V1,V2,V3,V4> | Cogroup. cogroup(PTable<K,V1> first,
       PTable<K,V2> second,
       PTable<K,V3> third,
       PTable<K,V4> fourth)Co-groups the three  PTablearguments. | 
| static <K,V1,V2,V3,V4> | Cogroup. cogroup(PTable<K,V1> first,
       PTable<K,V2> second,
       PTable<K,V3> third,
       PTable<K,V4> fourth)Co-groups the three  PTablearguments. | 
| static <K,V1,V2,V3,V4> | Cogroup. cogroup(PTable<K,V1> first,
       PTable<K,V2> second,
       PTable<K,V3> third,
       PTable<K,V4> fourth)Co-groups the three  PTablearguments. | 
| static <K,V1,V2,V3,V4> | Cogroup. cogroup(PTable<K,V1> first,
       PTable<K,V2> second,
       PTable<K,V3> third,
       PTable<K,V4> fourth)Co-groups the three  PTablearguments. | 
| static <K,V> PTable<K,Collection<V>> | Aggregate. collectValues(PTable<K,V> collect) | 
| static <K1,K2,U,V> | Cartesian. cross(PTable<K1,U> left,
     PTable<K2,V> right)Performs a full cross join on the specified  PTables (using the same
 strategy as Pig's CROSS operator). | 
| static <K1,K2,U,V> | Cartesian. cross(PTable<K1,U> left,
     PTable<K2,V> right)Performs a full cross join on the specified  PTables (using the same
 strategy as Pig's CROSS operator). | 
| static <K1,K2,U,V> | Cartesian. cross(PTable<K1,U> left,
     PTable<K2,V> right,
     int parallelism)Performs a full cross join on the specified  PTables (using the same
 strategy as Pig's CROSS operator). | 
| static <K1,K2,U,V> | Cartesian. cross(PTable<K1,U> left,
     PTable<K2,V> right,
     int parallelism)Performs a full cross join on the specified  PTables (using the same
 strategy as Pig's CROSS operator). | 
| static <K,V> PTable<K,V> | Distinct. distinct(PTable<K,V> input)A  PTable<K, V>analogue of thedistinctfunction. | 
| static <K,V> PTable<K,V> | Distinct. distinct(PTable<K,V> input,
        int flushEvery)A  PTable<K, V>analogue of thedistinctfunction. | 
| static <K,V extends Number> | Quantiles. distributed(PTable<K,V> table,
           double p1,
           double... pn)Calculate a set of quantiles for each key in a numerically-valued table. | 
| static <K,U,V> PTable<K,Pair<U,V>> | Join. fullJoin(PTable<K,U> left,
        PTable<K,V> right)Performs a full outer join on the specified  PTables. | 
| static <K,U,V> PTable<K,Pair<U,V>> | Join. fullJoin(PTable<K,U> left,
        PTable<K,V> right)Performs a full outer join on the specified  PTables. | 
| static <T,N extends Number> | Sample. groupedWeightedReservoirSample(PTable<Integer,Pair<T,N>> input,
                              int[] sampleSizes)The most general purpose of the weighted reservoir sampling patterns that allows us to choose
 a random sample of elements for each of N input groups. | 
| static <T,N extends Number> | Sample. groupedWeightedReservoirSample(PTable<Integer,Pair<T,N>> input,
                              int[] sampleSizes,
                              Long seed)Same as the other groupedWeightedReservoirSample method, but include a seed for testing
 purposes. | 
| static <K,V extends Comparable> | Quantiles. inMemory(PTable<K,V> table,
        double p1,
        double... pn)Calculate a set of quantiles for each key in a numerically-valued table. | 
| static <K,U,V> PTable<K,Pair<U,V>> | Join. innerJoin(PTable<K,U> left,
         PTable<K,V> right)Performs an inner join on the specified  PTables. | 
| static <K,U,V> PTable<K,Pair<U,V>> | Join. innerJoin(PTable<K,U> left,
         PTable<K,V> right)Performs an inner join on the specified  PTables. | 
| static <K,U,V> PTable<K,Pair<U,V>> | Join. join(PTable<K,U> left,
    PTable<K,V> right)Performs an inner join on the specified  PTables. | 
| static <K,U,V> PTable<K,Pair<U,V>> | Join. join(PTable<K,U> left,
    PTable<K,V> right)Performs an inner join on the specified  PTables. | 
| static <K,V> PCollection<K> | PTables. keys(PTable<K,V> ptable)Extract the keys from the given  PTable<K, V>as aPCollection<K>. | 
| static <K,U,V> PTable<K,Pair<U,V>> | Join. leftJoin(PTable<K,U> left,
        PTable<K,V> right)Performs a left outer join on the specified  PTables. | 
| static <K,U,V> PTable<K,Pair<U,V>> | Join. leftJoin(PTable<K,U> left,
        PTable<K,V> right)Performs a left outer join on the specified  PTables. | 
| static <K1,V1,K2 extends org.apache.hadoop.io.Writable,V2 extends org.apache.hadoop.io.Writable> | Mapreduce. map(PTable<K1,V1> input,
   Class<? extends org.apache.hadoop.mapreduce.Mapper<K1,V1,K2,V2>> mapperClass,
   Class<K2> keyClass,
   Class<V2> valueClass) | 
| static <K1,V1,K2 extends org.apache.hadoop.io.Writable,V2 extends org.apache.hadoop.io.Writable> | Mapred. map(PTable<K1,V1> input,
   Class<? extends org.apache.hadoop.mapred.Mapper<K1,V1,K2,V2>> mapperClass,
   Class<K2> keyClass,
   Class<V2> valueClass) | 
| static <K1,K2,V> PTable<K2,V> | PTables. mapKeys(PTable<K1,V> ptable,
       MapFn<K1,K2> mapFn,
       PType<K2> ptype)Maps a  PTable<K1, V>to aPTable<K2, V>using the givenMapFn<K1, K2>on
 the keys of thePTable. | 
| static <K1,K2,V> PTable<K2,V> | PTables. mapKeys(String name,
       PTable<K1,V> ptable,
       MapFn<K1,K2> mapFn,
       PType<K2> ptype)Maps a  PTable<K1, V>to aPTable<K2, V>using the givenMapFn<K1, K2>on
 the keys of thePTable. | 
| static <K,U,V> PTable<K,V> | PTables. mapValues(PTable<K,U> ptable,
         MapFn<U,V> mapFn,
         PType<V> ptype)Maps a  PTable<K, U>to aPTable<K, V>using the givenMapFn<U, V>on
 the values of thePTable. | 
| static <K,U,V> PTable<K,V> | PTables. mapValues(String name,
         PTable<K,U> ptable,
         MapFn<U,V> mapFn,
         PType<V> ptype)Maps a  PTable<K, U>to aPTable<K, V>using the givenMapFn<U, V>on
 the values of thePTable. | 
| static <K,V extends Number> | Average. meanValue(PTable<K,V> table)Calculate the mean average value by key for a table with numeric values. | 
| static <K> PTable<K,Long> | TopList. negateCounts(PTable<K,Long> table)When creating toplists, it is often required to sort by count descending. | 
| static <K,U,V> PTable<K,Pair<U,V>> | Join. rightJoin(PTable<K,U> left,
         PTable<K,V> right)Performs a right outer join on the specified  PTables. | 
| static <K,U,V> PTable<K,Pair<U,V>> | Join. rightJoin(PTable<K,U> left,
         PTable<K,V> right)Performs a right outer join on the specified  PTables. | 
| static <K,V> PTable<K,V> | Sample. sample(PTable<K,V> input,
      double probability)A  PTable<K, V>analogue of thesamplefunction. | 
| static <K,V> PTable<K,V> | Sample. sample(PTable<K,V> input,
      Long seed,
      double probability)A  PTable<K, V>analogue of thesamplefunction, with the seed argument
 exposed for testing purposes. | 
| static <K,V> PTable<K,V> | Sort. sort(PTable<K,V> table)Sorts the  PTableusing the natural ordering of its keys in ascending order. | 
| static <K,V> PTable<K,V> | Sort. sort(PTable<K,V> table,
    int numReducers,
    Sort.Order key)Sorts the  PTableusing the natural ordering of its keys in the
 order specified with a client-specified number of reducers. | 
| static <K,V> PTable<K,V> | Sort. sort(PTable<K,V> table,
    Sort.Order key)Sorts the  PTableusing the natural ordering of its keys with the givenOrder. | 
| static <K,V1,V2,U,V> | SecondarySort. sortAndApply(PTable<K,Pair<V1,V2>> input,
            DoFn<Pair<K,Iterable<Pair<V1,V2>>>,Pair<U,V>> doFn,
            PTableType<U,V> ptype)Perform a secondary sort on the given  PTableinstance and then apply aDoFnto the resulting sorted data to yield an outputPTable<U, V>. | 
| static <K,V1,V2,U,V> | SecondarySort. sortAndApply(PTable<K,Pair<V1,V2>> input,
            DoFn<Pair<K,Iterable<Pair<V1,V2>>>,Pair<U,V>> doFn,
            PTableType<U,V> ptype,
            int numReducers)Perform a secondary sort on the given  PTableinstance and then apply aDoFnto the resulting sorted data to yield an outputPTable<U, V>, using
 the given number of reducers. | 
| static <K,V1,V2,T> | SecondarySort. sortAndApply(PTable<K,Pair<V1,V2>> input,
            DoFn<Pair<K,Iterable<Pair<V1,V2>>>,T> doFn,
            PType<T> ptype)Perform a secondary sort on the given  PTableinstance and then apply aDoFnto the resulting sorted data to yield an outputPCollection<T>. | 
| static <K,V1,V2,T> | SecondarySort. sortAndApply(PTable<K,Pair<V1,V2>> input,
            DoFn<Pair<K,Iterable<Pair<V1,V2>>>,T> doFn,
            PType<T> ptype,
            int numReducers)Perform a secondary sort on the given  PTableinstance and then apply aDoFnto the resulting sorted data to yield an outputPCollection<T>, using
 the given number of reducers. | 
| static <K,V> PTable<V,K> | PTables. swapKeyValue(PTable<K,V> table)Swap the key and value part of a table. | 
| static <K,V> PTable<K,V> | Aggregate. top(PTable<K,V> ptable,
   int limit,
   boolean maximize)Selects the top N pairs from the given table, with sorting being performed on the values (i.e. | 
| static <X,Y> PTable<X,Collection<Pair<Long,Y>>> | TopList. topNYbyX(PTable<X,Y> input,
        int n)Create a top-list of elements in the provided PTable, categorised by the key of the input table and using the count
 of the value part of the input table. | 
| static <K,V> PCollection<V> | PTables. values(PTable<K,V> ptable)Extract the values from the given  PTable<K, V>as aPCollection<V>. | 
| Modifier and Type | Method and Description | 
|---|---|
| PTable<K,Pair<U,V>> | DefaultJoinStrategy. join(PTable<K,U> left,
    PTable<K,V> right,
    JoinFn<K,U,V> joinFn)Perform a default join on the given  PTableinstances using a user-specifiedJoinFn. | 
| PTable<K,Pair<U,V>> | ShardedJoinStrategy. join(PTable<K,U> left,
    PTable<K,V> right,
    JoinType joinType) | 
| PTable<K,Pair<U,V>> | MapsideJoinStrategy. join(PTable<K,U> left,
    PTable<K,V> right,
    JoinType joinType) | 
| PTable<K,Pair<U,V>> | JoinStrategy. join(PTable<K,U> left,
    PTable<K,V> right,
    JoinType joinType)Join two tables with the given join type. | 
| PTable<K,Pair<U,V>> | DefaultJoinStrategy. join(PTable<K,U> left,
    PTable<K,V> right,
    JoinType joinType) | 
| PTable<K,Pair<U,V>> | BloomFilterJoinStrategy. join(PTable<K,U> left,
    PTable<K,V> right,
    JoinType joinType) | 
| Modifier and Type | Method and Description | 
|---|---|
| PTable<K,Pair<U,V>> | DefaultJoinStrategy. join(PTable<K,U> left,
    PTable<K,V> right,
    JoinFn<K,U,V> joinFn)Perform a default join on the given  PTableinstances using a user-specifiedJoinFn. | 
| PTable<K,Pair<U,V>> | DefaultJoinStrategy. join(PTable<K,U> left,
    PTable<K,V> right,
    JoinFn<K,U,V> joinFn)Perform a default join on the given  PTableinstances using a user-specifiedJoinFn. | 
| PTable<K,Pair<U,V>> | ShardedJoinStrategy. join(PTable<K,U> left,
    PTable<K,V> right,
    JoinType joinType) | 
| PTable<K,Pair<U,V>> | ShardedJoinStrategy. join(PTable<K,U> left,
    PTable<K,V> right,
    JoinType joinType) | 
| PTable<K,Pair<U,V>> | MapsideJoinStrategy. join(PTable<K,U> left,
    PTable<K,V> right,
    JoinType joinType) | 
| PTable<K,Pair<U,V>> | MapsideJoinStrategy. join(PTable<K,U> left,
    PTable<K,V> right,
    JoinType joinType) | 
| PTable<K,Pair<U,V>> | JoinStrategy. join(PTable<K,U> left,
    PTable<K,V> right,
    JoinType joinType)Join two tables with the given join type. | 
| PTable<K,Pair<U,V>> | JoinStrategy. join(PTable<K,U> left,
    PTable<K,V> right,
    JoinType joinType)Join two tables with the given join type. | 
| PTable<K,Pair<U,V>> | DefaultJoinStrategy. join(PTable<K,U> left,
    PTable<K,V> right,
    JoinType joinType) | 
| PTable<K,Pair<U,V>> | DefaultJoinStrategy. join(PTable<K,U> left,
    PTable<K,V> right,
    JoinType joinType) | 
| PTable<K,Pair<U,V>> | BloomFilterJoinStrategy. join(PTable<K,U> left,
    PTable<K,V> right,
    JoinType joinType) | 
| PTable<K,Pair<U,V>> | BloomFilterJoinStrategy. join(PTable<K,U> left,
    PTable<K,V> right,
    JoinType joinType) | 
| static <K,U,V,T> PCollection<T> | OneToManyJoin. oneToManyJoin(PTable<K,U> left,
             PTable<K,V> right,
             DoFn<Pair<U,Iterable<V>>,T> postProcessFn,
             PType<T> ptype)Performs a join on two tables, where the left table only contains a single
 value per key. | 
| static <K,U,V,T> PCollection<T> | OneToManyJoin. oneToManyJoin(PTable<K,U> left,
             PTable<K,V> right,
             DoFn<Pair<U,Iterable<V>>,T> postProcessFn,
             PType<T> ptype)Performs a join on two tables, where the left table only contains a single
 value per key. | 
| static <K,U,V,T> PCollection<T> | OneToManyJoin. oneToManyJoin(PTable<K,U> left,
             PTable<K,V> right,
             DoFn<Pair<U,Iterable<V>>,T> postProcessFn,
             PType<T> ptype,
             int numReducers)Supports a user-specified number of reducers for the one-to-many join. | 
| static <K,U,V,T> PCollection<T> | OneToManyJoin. oneToManyJoin(PTable<K,U> left,
             PTable<K,V> right,
             DoFn<Pair<U,Iterable<V>>,T> postProcessFn,
             PType<T> ptype,
             int numReducers)Supports a user-specified number of reducers for the one-to-many join. | 
| Modifier and Type | Method and Description | 
|---|---|
| <K,V> PTable<K,V> | CrunchTool. read(TableSource<K,V> tableSource) | 
Copyright © 2016 The Apache Software Foundation. All rights reserved.