This project has retired. For details please refer to its Attic page.
Uses of Interface org.apache.crunch.PTable (Apache Crunch 0.10.0 API)

Uses of Interface
org.apache.crunch.PTable

Packages that use PTable
org.apache.crunch Client-facing API and core abstractions. 
org.apache.crunch.contrib.text   
org.apache.crunch.examples Example applications demonstrating various aspects of Crunch. 
org.apache.crunch.impl.dist   
org.apache.crunch.impl.dist.collect   
org.apache.crunch.impl.mem In-memory Pipeline implementation for rapid prototyping and testing. 
org.apache.crunch.impl.spark   
org.apache.crunch.impl.spark.collect   
org.apache.crunch.lib Joining, sorting, aggregating, and other commonly used functionality. 
org.apache.crunch.lib.join Inner and outer joins on collections. 
org.apache.crunch.util An assorted set of utilities. 
 

Uses of PTable in org.apache.crunch
 

Methods in org.apache.crunch that return PTable
 PTable<K,V> PTable.bottom(int count)
          Returns a PTable made up of the pairs in this PTable with the smallest value field.
<K> PTable<K,S>
PCollection.by(MapFn<S,K> extractKeyFn, PType<K> keyType)
          Apply the given map function to each element of this instance in order to create a PTable.
<K> PTable<K,S>
PCollection.by(String name, MapFn<S,K> extractKeyFn, PType<K> keyType)
          Apply the given map function to each element of this instance in order to create a PTable.
 PTable<K,V> PTable.cache()
           
 PTable<K,V> PTable.cache(CachingOptions options)
           
<U> PTable<K,Pair<Collection<V>,Collection<U>>>
PTable.cogroup(PTable<K,U> other)
          Co-group operation with the given table on common keys.
 PTable<K,Collection<V>> PTable.collectValues()
          Aggregate all of the values with the same key into a single key-value pair in the returned PTable.
 PTable<K,V> PGroupedTable.combineValues(Aggregator<V> aggregator)
          Combine the values in each group using the given Aggregator.
 PTable<K,V> PGroupedTable.combineValues(Aggregator<V> combineAggregator, Aggregator<V> reduceAggregator)
          Combines and reduces the values in each group using the given Aggregator instances.
 PTable<K,V> PGroupedTable.combineValues(CombineFn<K,V> combineFn)
          Combines the values of this grouping using the given CombineFn.
 PTable<K,V> PGroupedTable.combineValues(CombineFn<K,V> combineFn, CombineFn<K,V> reduceFn)
          Combines and reduces the values of this grouping using the given CombineFn instances.
 PTable<S,Long> PCollection.count()
          Returns a PTable instance that contains the counts of each unique element of this PCollection.
<K,V> PTable<K,V>
Pipeline.emptyPTable(PTableType<K,V> ptype)
           
 PTable<K,V> PTable.filter(FilterFn<Pair<K,V>> filterFn)
          Apply the given filter function to this instance and return the resulting PTable.
 PTable<K,V> PTable.filter(String name, FilterFn<Pair<K,V>> filterFn)
          Apply the given filter function to this instance and return the resulting PTable.
<U> PTable<K,Pair<V,U>>
PTable.join(PTable<K,U> other)
          Perform an inner join on this table and the one passed in as an argument on their common keys.
<K2> PTable<K2,V>
PTable.mapKeys(MapFn<K,K2> mapFn, PType<K2> ptype)
          Returns a PTable that has the same values as this instance, but uses the given function to map the keys.
<K2> PTable<K2,V>
PTable.mapKeys(String name, MapFn<K,K2> mapFn, PType<K2> ptype)
          Returns a PTable that has the same values as this instance, but uses the given function to map the keys.
<U> PTable<K,U>
PGroupedTable.mapValues(MapFn<Iterable<V>,U> mapFn, PType<U> ptype)
          Maps the Iterable<V> elements of each record to a new type.
<U> PTable<K,U>
PTable.mapValues(MapFn<V,U> mapFn, PType<U> ptype)
          Returns a PTable that has the same keys as this instance, but uses the given function to map the values.
<U> PTable<K,U>
PGroupedTable.mapValues(String name, MapFn<Iterable<V>,U> mapFn, PType<U> ptype)
          Maps the Iterable<V> elements of each record to a new type.
<U> PTable<K,U>
PTable.mapValues(String name, MapFn<V,U> mapFn, PType<U> ptype)
          Returns a PTable that has the same keys as this instance, but uses the given function to map the values.
<K,V> PTable<K,V>
PCollection.parallelDo(DoFn<S,Pair<K,V>> doFn, PTableType<K,V> type)
          Similar to the other parallelDo instance, but returns a PTable instance instead of a PCollection.
<K,V> PTable<K,V>
PCollection.parallelDo(String name, DoFn<S,Pair<K,V>> doFn, PTableType<K,V> type)
          Similar to the other parallelDo instance, but returns a PTable instance instead of a PCollection.
<K,V> PTable<K,V>
PCollection.parallelDo(String name, DoFn<S,Pair<K,V>> doFn, PTableType<K,V> type, ParallelDoOptions options)
          Similar to the other parallelDo instance, but returns a PTable instance instead of a PCollection.
<K,V> PTable<K,V>
Pipeline.read(TableSource<K,V> tableSource)
          A version of the read method for TableSource instances that map to PTables.
 PTable<K,V> PTable.top(int count)
          Returns a PTable made up of the pairs in this PTable with the largest value field.
 PTable<K,V> PGroupedTable.ungroup()
          Convert this grouping back into a multimap.
 PTable<K,V> PTable.union(PTable<K,V>... others)
          Returns a PTable instance that acts as the union of this PTable and the input PTables.
 PTable<K,V> PTable.union(PTable<K,V> other)
          Returns a PTable instance that acts as the union of this PTable and the other PTable.
 PTable<K,V> PTable.write(Target target)
          Writes this PTable to the given Target.
 PTable<K,V> PTable.write(Target target, Target.WriteMode writeMode)
          Writes this PTable to the given Target, using the given Target.WriteMode to handle existing targets.
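
The descriptions of PCollection.count() and PTable.collectValues() above can be illustrated with a small plain-Java model. This is only a sketch of the documented semantics, not Crunch code: the class name PTableSemanticsSketch, the pair helper, and the use of java.util lists and maps in place of PCollection and PTable are assumptions made for illustration.

```java
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Plain-Java model of two PTable-producing operations (hypothetical, not Crunch code). */
final class PTableSemanticsSketch {

    /** Hypothetical helper that builds one key-value pair. */
    static <K, V> Map.Entry<K, V> pair(K key, V value) {
        return new SimpleImmutableEntry<K, V>(key, value);
    }

    /** Models count(): each distinct element mapped to its number of occurrences. */
    static <S> Map<S, Long> count(List<S> elements) {
        Map<S, Long> counts = new LinkedHashMap<>();
        for (S element : elements) {
            counts.merge(element, 1L, Long::sum);
        }
        return counts;
    }

    /** Models collectValues(): all values that share a key end up in one collection. */
    static <K, V> Map<K, List<V>> collectValues(List<Map.Entry<K, V>> pairs) {
        Map<K, List<V>> grouped = new LinkedHashMap<>();
        for (Map.Entry<K, V> p : pairs) {
            grouped.computeIfAbsent(p.getKey(), k -> new ArrayList<V>()).add(p.getValue());
        }
        return grouped;
    }

    public static void main(String[] args) {
        System.out.println(count(Arrays.asList("a", "b", "a")));   // {a=2, b=1}
        List<Map.Entry<String, Integer>> pairs =
                Arrays.asList(pair("x", 1), pair("y", 2), pair("x", 3));
        System.out.println(collectValues(pairs));                  // {x=[1, 3], y=[2]}
    }
}
```

The model treats a table as a list of pairs in which a key may repeat, which is why collectValues has something to gather.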
 

Methods in org.apache.crunch with parameters of type PTable
<U> PTable<K,Pair<Collection<V>,Collection<U>>>
PTable.cogroup(PTable<K,U> other)
          Co-group operation with the given table on common keys.
<U> PTable<K,Pair<V,U>>
PTable.join(PTable<K,U> other)
          Perform an inner join on this table and the one passed in as an argument on their common keys.
 PTable<K,V> PTable.union(PTable<K,V>... others)
          Returns a PTable instance that acts as the union of this PTable and the input PTables.
 PTable<K,V> PTable.union(PTable<K,V> other)
          Returns a PTable instance that acts as the union of this PTable and the other PTable.
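
The return types of cogroup and join above differ in shape: cogroup yields a pair of collections per key, while join yields one pair of values per match. The following plain-Java sketch models that difference. It is not Crunch code; the class CogroupJoinSketch, its helpers, and the assumption that cogroup keeps keys present in only one input are illustrative.

```java
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Plain-Java model of cogroup and inner join on common keys (hypothetical, not Crunch code). */
final class CogroupJoinSketch {

    /** Hypothetical helper that builds one key-value pair. */
    static <A, B> Map.Entry<A, B> pair(A first, B second) {
        return new SimpleImmutableEntry<A, B>(first, second);
    }

    private static <V, U> Map.Entry<List<V>, List<U>> emptyGroups() {
        return new SimpleImmutableEntry<List<V>, List<U>>(new ArrayList<V>(), new ArrayList<U>());
    }

    /** Models cogroup: each key maps to (values from the left input, values from the right input). */
    static <K, V, U> Map<K, Map.Entry<List<V>, List<U>>> cogroup(
            List<Map.Entry<K, V>> left, List<Map.Entry<K, U>> right) {
        Map<K, Map.Entry<List<V>, List<U>>> out = new LinkedHashMap<>();
        for (Map.Entry<K, V> e : left) {
            out.computeIfAbsent(e.getKey(), k -> CogroupJoinSketch.<V, U>emptyGroups())
               .getKey().add(e.getValue());
        }
        for (Map.Entry<K, U> e : right) {
            out.computeIfAbsent(e.getKey(), k -> CogroupJoinSketch.<V, U>emptyGroups())
               .getValue().add(e.getValue());
        }
        return out;
    }

    /** Models an inner join: one output row per (left value, right value) combination on a shared key. */
    static <K, V, U> List<Map.Entry<K, Map.Entry<V, U>>> innerJoin(
            List<Map.Entry<K, V>> left, List<Map.Entry<K, U>> right) {
        List<Map.Entry<K, Map.Entry<V, U>>> out = new ArrayList<>();
        for (Map.Entry<K, Map.Entry<List<V>, List<U>>> group : cogroup(left, right).entrySet()) {
            for (V v : group.getValue().getKey()) {
                for (U u : group.getValue().getValue()) {
                    out.add(new SimpleImmutableEntry<K, Map.Entry<V, U>>(
                            group.getKey(), new SimpleImmutableEntry<V, U>(v, u)));
                }
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<Map.Entry<String, Integer>> left = Arrays.asList(pair("a", 1), pair("b", 2));
        List<Map.Entry<String, String>> right = Arrays.asList(pair("a", "x"), pair("c", "y"));
        System.out.println(cogroup(left, right).keySet());  // keys from both inputs
        System.out.println(innerJoin(left, right).size());  // only key "a" matches
    }
}
```

Building the join on top of the cogroup model shows why a key missing from one side contributes no rows to an inner join: one of its two collections is empty.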
 

Uses of PTable in org.apache.crunch.contrib.text
 

Methods in org.apache.crunch.contrib.text that return PTable
static
<K,V> PTable<K,V>
Parse.parseTable(String groupName, PCollection<String> input, Extractor<Pair<K,V>> extractor)
          Parses the lines of the input PCollection<String> and returns a PTable<K, V> using the given Extractor<Pair<K, V>>.
static
<K,V> PTable<K,V>
Parse.parseTable(String groupName, PCollection<String> input, PTypeFamily ptf, Extractor<Pair<K,V>> extractor)
          Parses the lines of the input PCollection<String> and returns a PTable<K, V> using the given Extractor<Pair<K, V>> that uses the given PTypeFamily.
 

Uses of PTable in org.apache.crunch.examples
 

Methods in org.apache.crunch.examples that return PTable
 PTable<String,String> WordAggregationHBase.extractText(PTable<org.apache.hadoop.hbase.io.ImmutableBytesWritable,org.apache.hadoop.hbase.client.Result> words)
          Extract information from HBase.
 

Methods in org.apache.crunch.examples with parameters of type PTable
 PCollection<org.apache.hadoop.hbase.client.Put> WordAggregationHBase.createPut(PTable<String,String> extractedText)
          Create Puts in order to insert them into HBase.
 PTable<String,String> WordAggregationHBase.extractText(PTable<org.apache.hadoop.hbase.io.ImmutableBytesWritable,org.apache.hadoop.hbase.client.Result> words)
          Extract information from HBase.
 

Uses of PTable in org.apache.crunch.impl.dist
 

Methods in org.apache.crunch.impl.dist that return PTable
<K,V> PTable<K,V>
DistributedPipeline.emptyPTable(PTableType<K,V> ptype)
           
<K,V> PTable<K,V>
DistributedPipeline.read(TableSource<K,V> source)
           
 

Uses of PTable in org.apache.crunch.impl.dist.collect
 

Classes in org.apache.crunch.impl.dist.collect that implement PTable
 class BaseDoTable<K,V>
           
 class BaseInputTable<K,V>
           
 class BaseUnionTable<K,V>
           
 class EmptyPTable<K,V>
           
 class PTableBase<K,V>
           
 

Methods in org.apache.crunch.impl.dist.collect that return PTable
 PTable<K,V> PTableBase.bottom(int count)
           
<K> PTable<K,S>
PCollectionImpl.by(MapFn<S,K> mapFn, PType<K> keyType)
           
<K> PTable<K,S>
PCollectionImpl.by(String name, MapFn<S,K> mapFn, PType<K> keyType)
           
 PTable<K,V> PTableBase.cache()
           
 PTable<K,V> PTableBase.cache(CachingOptions options)
           
<U> PTable<K,Pair<Collection<V>,Collection<U>>>
PTableBase.cogroup(PTable<K,U> other)
           
 PTable<K,Collection<V>> PTableBase.collectValues()
           
 PTable<K,V> BaseGroupedTable.combineValues(Aggregator<V> agg)
           
 PTable<K,V> BaseGroupedTable.combineValues(Aggregator<V> combineAgg, Aggregator<V> reduceAgg)
           
 PTable<K,V> BaseGroupedTable.combineValues(CombineFn<K,V> combineFn)
           
 PTable<K,V> BaseGroupedTable.combineValues(CombineFn<K,V> combineFn, CombineFn<K,V> reduceFn)
           
 PTable<S,Long> PCollectionImpl.count()
           
<K,V> PTable<K,V>
PCollectionFactory.createUnionTable(List<PTableBase<K,V>> internal)
           
 PTable<K,V> PTableBase.filter(FilterFn<Pair<K,V>> filterFn)
           
 PTable<K,V> PTableBase.filter(String name, FilterFn<Pair<K,V>> filterFn)
           
<U> PTable<K,Pair<V,U>>
PTableBase.join(PTable<K,U> other)
           
<K2> PTable<K2,V>
PTableBase.mapKeys(MapFn<K,K2> mapFn, PType<K2> ptype)
           
<K2> PTable<K2,V>
PTableBase.mapKeys(String name, MapFn<K,K2> mapFn, PType<K2> ptype)
           
<U> PTable<K,U>
BaseGroupedTable.mapValues(MapFn<Iterable<V>,U> mapFn, PType<U> ptype)
           
<U> PTable<K,U>
PTableBase.mapValues(MapFn<V,U> mapFn, PType<U> ptype)
           
<U> PTable<K,U>
BaseGroupedTable.mapValues(String name, MapFn<Iterable<V>,U> mapFn, PType<U> ptype)
           
<U> PTable<K,U>
PTableBase.mapValues(String name, MapFn<V,U> mapFn, PType<U> ptype)
           
<K,V> PTable<K,V>
PCollectionImpl.parallelDo(DoFn<S,Pair<K,V>> fn, PTableType<K,V> type)
           
<K,V> PTable<K,V>
PCollectionImpl.parallelDo(String name, DoFn<S,Pair<K,V>> fn, PTableType<K,V> type)
           
<K,V> PTable<K,V>
PCollectionImpl.parallelDo(String name, DoFn<S,Pair<K,V>> fn, PTableType<K,V> type, ParallelDoOptions options)
           
 PTable<K,V> PTableBase.top(int count)
           
 PTable<K,V> BaseGroupedTable.ungroup()
           
 PTable<K,V> PTableBase.union(PTable<K,V>... others)
           
 PTable<K,V> PTableBase.union(PTable<K,V> other)
           
 PTable<K,V> PTableBase.write(Target target)
           
 PTable<K,V> PTableBase.write(Target target, Target.WriteMode writeMode)
           
 

Methods in org.apache.crunch.impl.dist.collect with parameters of type PTable
<U> PTable<K,Pair<Collection<V>,Collection<U>>>
PTableBase.cogroup(PTable<K,U> other)
           
<U> PTable<K,Pair<V,U>>
PTableBase.join(PTable<K,U> other)
           
 PTable<K,V> PTableBase.union(PTable<K,V>... others)
           
 PTable<K,V> PTableBase.union(PTable<K,V> other)
           
 

Uses of PTable in org.apache.crunch.impl.mem
 

Methods in org.apache.crunch.impl.mem that return PTable
<K,V> PTable<K,V>
MemPipeline.emptyPTable(PTableType<K,V> ptype)
           
<K,V> PTable<K,V>
MemPipeline.read(TableSource<K,V> source)
           
static
<S,T> PTable<S,T>
MemPipeline.tableOf(Iterable<Pair<S,T>> pairs)
           
static
<S,T> PTable<S,T>
MemPipeline.tableOf(S s, T t, Object... more)
           
static
<S,T> PTable<S,T>
MemPipeline.typedTableOf(PTableType<S,T> ptype, Iterable<Pair<S,T>> pairs)
           
static
<S,T> PTable<S,T>
MemPipeline.typedTableOf(PTableType<S,T> ptype, S s, T t, Object... more)
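
The tableOf and typedTableOf overloads that take (S s, T t, Object... more) carry no description on this page. A common reading of such a signature is that the trailing arguments alternate between keys and values; the plain-Java sketch below models that reading only. The alternating layout, the class TableOfSketch, and the even-length check are assumptions, not documented behavior of MemPipeline.

```java
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Plain-Java model of a (key, value, more...) table factory (hypothetical, not Crunch code). */
final class TableOfSketch {

    /**
     * Builds a list of pairs from a first key and value plus further arguments,
     * which are ASSUMED to alternate key, value, key, value, ...
     */
    @SuppressWarnings("unchecked")
    static <S, T> List<Map.Entry<S, T>> tableOf(S s, T t, Object... more) {
        if (more.length % 2 != 0) {
            // Assumption of this sketch: an unpaired trailing key is rejected.
            throw new IllegalArgumentException("Expected an even number of extra arguments");
        }
        List<Map.Entry<S, T>> pairs = new ArrayList<>();
        pairs.add(new SimpleImmutableEntry<S, T>(s, t));
        for (int i = 0; i < more.length; i += 2) {
            // Unchecked casts: the varargs array erases the element types.
            pairs.add(new SimpleImmutableEntry<S, T>((S) more[i], (T) more[i + 1]));
        }
        return pairs;
    }

    public static void main(String[] args) {
        System.out.println(tableOf("a", 1, "b", 2));  // [a=1, b=2]
    }
}
```

The Object... parameter trades compile-time type checking for a compact call site, which suits small test fixtures better than production data.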
           
 

Uses of PTable in org.apache.crunch.impl.spark
 

Methods in org.apache.crunch.impl.spark that return PTable
<K,V> PTable<K,V>
SparkPipeline.emptyPTable(PTableType<K,V> ptype)
           
 

Uses of PTable in org.apache.crunch.impl.spark.collect
 

Classes in org.apache.crunch.impl.spark.collect that implement PTable
 class DoTable<K,V>
           
 class InputTable<K,V>
           
 class UnionTable<K,V>
           
 

Methods in org.apache.crunch.impl.spark.collect that return PTable
<K,V> PTable<K,V>
SparkCollectFactory.createUnionTable(List<PTableBase<K,V>> internal)
           
 

Uses of PTable in org.apache.crunch.lib
 

Methods in org.apache.crunch.lib that return PTable
static
<K,V> PTable<K,V>
PTables.asPTable(PCollection<Pair<K,V>> pcollect)
          Convert the given PCollection<Pair<K, V>> to a PTable<K, V>.
static
<K,U,V> PTable<K,TupleN>
Cogroup.cogroup(int numReducers, PTable<K,?> first, PTable<K,?>... rest)
          Co-groups an arbitrary number of PTable arguments with a user-specified degree of parallelism (i.e., the number of reducers). The largest table should come last in the ordering.
static
<K,U,V> PTable<K,Pair<Collection<U>,Collection<V>>>
Cogroup.cogroup(int numReducers, PTable<K,U> left, PTable<K,V> right)
          Co-groups the two PTable arguments with a user-specified degree of parallelism (i.e., the number of reducers).
static
<K,V1,V2,V3>
PTable<K,Tuple3.Collect<V1,V2,V3>>
Cogroup.cogroup(int numReducers, PTable<K,V1> first, PTable<K,V2> second, PTable<K,V3> third)
          Co-groups the three PTable arguments with a user-specified degree of parallelism (i.e., the number of reducers).
static
<K,V1,V2,V3,V4>
PTable<K,Tuple4.Collect<V1,V2,V3,V4>>
Cogroup.cogroup(int numReducers, PTable<K,V1> first, PTable<K,V2> second, PTable<K,V3> third, PTable<K,V4> fourth)
          Co-groups the four PTable arguments with a user-specified degree of parallelism (i.e., the number of reducers).
static
<K> PTable<K,TupleN>
Cogroup.cogroup(PTable<K,?> first, PTable<K,?>... rest)
          Co-groups an arbitrary number of PTable arguments.
static
<K,U,V> PTable<K,Pair<Collection<U>,Collection<V>>>
Cogroup.cogroup(PTable<K,U> left, PTable<K,V> right)
          Co-groups the two PTable arguments.
static
<K,V1,V2,V3>
PTable<K,Tuple3.Collect<V1,V2,V3>>
Cogroup.cogroup(PTable<K,V1> first, PTable<K,V2> second, PTable<K,V3> third)
          Co-groups the three PTable arguments.
static
<K,V1,V2,V3,V4>
PTable<K,Tuple4.Collect<V1,V2,V3,V4>>
Cogroup.cogroup(PTable<K,V1> first, PTable<K,V2> second, PTable<K,V3> third, PTable<K,V4> fourth)
          Co-groups the four PTable arguments.
static
<K,V> PTable<K,Collection<V>>
Aggregate.collectValues(PTable<K,V> collect)
           
static
<S> PTable<S,Long>
Aggregate.count(PCollection<S> collect)
          Returns a PTable that contains the unique elements of this collection mapped to a count of their occurrences.
static
<S> PTable<S,Long>
Aggregate.count(PCollection<S> collect, int numPartitions)
          Returns a PTable that contains the unique elements of this collection mapped to a count of their occurrences.
static
<K1,K2,U,V>
PTable<Pair<K1,K2>,Pair<U,V>>
Cartesian.cross(PTable<K1,U> left, PTable<K2,V> right)
          Performs a full cross join on the specified PTables (using the same strategy as Pig's CROSS operator).
static
<K1,K2,U,V>
PTable<Pair<K1,K2>,Pair<U,V>>
Cartesian.cross(PTable<K1,U> left, PTable<K2,V> right, int parallelism)
          Performs a full cross join on the specified PTables (using the same strategy as Pig's CROSS operator).
static
<K,V> PTable<K,V>
Distinct.distinct(PTable<K,V> input)
          A PTable<K, V> analogue of the distinct function.
static
<K,V> PTable<K,V>
Distinct.distinct(PTable<K,V> input, int flushEvery)
          A PTable<K, V> analogue of the distinct function.
static
<K,U,V> PTable<K,Pair<U,V>>
Join.fullJoin(PTable<K,U> left, PTable<K,V> right)
          Performs a full outer join on the specified PTables.
static
<K,U,V> PTable<K,Pair<U,V>>
Join.innerJoin(PTable<K,U> left, PTable<K,V> right)
          Performs an inner join on the specified PTables.
static
<K,U,V> PTable<K,Pair<U,V>>
Join.join(PTable<K,U> left, PTable<K,V> right)
          Performs an inner join on the specified PTables.
static
<K,U,V> PTable<K,Pair<U,V>>
Join.leftJoin(PTable<K,U> left, PTable<K,V> right)
          Performs a left outer join on the specified PTables.
static
<K1,V1,K2 extends org.apache.hadoop.io.Writable,V2 extends org.apache.hadoop.io.Writable>
PTable<K2,V2>
Mapreduce.map(PTable<K1,V1> input, Class<? extends org.apache.hadoop.mapreduce.Mapper<K1,V1,K2,V2>> mapperClass, Class<K2> keyClass, Class<V2> valueClass)
           
static
<K1,V1,K2 extends org.apache.hadoop.io.Writable,V2 extends org.apache.hadoop.io.Writable>
PTable<K2,V2>
Mapred.map(PTable<K1,V1> input, Class<? extends org.apache.hadoop.mapred.Mapper<K1,V1,K2,V2>> mapperClass, Class<K2> keyClass, Class<V2> valueClass)
           
static
<K1,K2,V> PTable<K2,V>
PTables.mapKeys(PTable<K1,V> ptable, MapFn<K1,K2> mapFn, PType<K2> ptype)
          Maps a PTable<K1, V> to a PTable<K2, V> using the given MapFn<K1, K2> on the keys of the PTable.
static
<K1,K2,V> PTable<K2,V>
PTables.mapKeys(String name, PTable<K1,V> ptable, MapFn<K1,K2> mapFn, PType<K2> ptype)
          Maps a PTable<K1, V> to a PTable<K2, V> using the given MapFn<K1, K2> on the keys of the PTable.
static
<K,U,V> PTable<K,V>
PTables.mapValues(PGroupedTable<K,U> ptable, MapFn<Iterable<U>,V> mapFn, PType<V> ptype)
          An analogue of the mapValues function for PGroupedTable<K, U> collections.
static
<K,U,V> PTable<K,V>
PTables.mapValues(PTable<K,U> ptable, MapFn<U,V> mapFn, PType<V> ptype)
          Maps a PTable<K, U> to a PTable<K, V> using the given MapFn<U, V> on the values of the PTable.
static
<K,U,V> PTable<K,V>
PTables.mapValues(String name, PGroupedTable<K,U> ptable, MapFn<Iterable<U>,V> mapFn, PType<V> ptype)
          An analogue of the mapValues function for PGroupedTable<K, U> collections.
static
<K,U,V> PTable<K,V>
PTables.mapValues(String name, PTable<K,U> ptable, MapFn<U,V> mapFn, PType<V> ptype)
          Maps a PTable<K, U> to a PTable<K, V> using the given MapFn<U, V> on the values of the PTable.
static
<K1,V1,K2 extends org.apache.hadoop.io.Writable,V2 extends org.apache.hadoop.io.Writable>
PTable<K2,V2>
Mapreduce.reduce(PGroupedTable<K1,V1> input, Class<? extends org.apache.hadoop.mapreduce.Reducer<K1,V1,K2,V2>> reducerClass, Class<K2> keyClass, Class<V2> valueClass)
           
static
<K1,V1,K2 extends org.apache.hadoop.io.Writable,V2 extends org.apache.hadoop.io.Writable>
PTable<K2,V2>
Mapred.reduce(PGroupedTable<K1,V1> input, Class<? extends org.apache.hadoop.mapred.Reducer<K1,V1,K2,V2>> reducerClass, Class<K2> keyClass, Class<V2> valueClass)
           
static
<K,U,V> PTable<K,Pair<U,V>>
Join.rightJoin(PTable<K,U> left, PTable<K,V> right)
          Performs a right outer join on the specified PTables.
static
<K,V> PTable<K,V>
Sample.sample(PTable<K,V> input, double probability)
          A PTable<K, V> analogue of the sample function.
static
<K,V> PTable<K,V>
Sample.sample(PTable<K,V> input, Long seed, double probability)
          A PTable<K, V> analogue of the sample function, with the seed argument exposed for testing purposes.
static
<K,V> PTable<K,V>
Sort.sort(PTable<K,V> table)
          Sorts the PTable using the natural ordering of its keys in ascending order.
static
<K,V> PTable<K,V>
Sort.sort(PTable<K,V> table, int numReducers, Sort.Order key)
          Sorts the PTable using the natural ordering of its keys in the order specified with a client-specified number of reducers.
static
<K,V> PTable<K,V>
Sort.sort(PTable<K,V> table, Sort.Order key)
          Sorts the PTable using the natural ordering of its keys with the given Order.
static
<K,V1,V2,U,V>
PTable<U,V>
SecondarySort.sortAndApply(PTable<K,Pair<V1,V2>> input, DoFn<Pair<K,Iterable<Pair<V1,V2>>>,Pair<U,V>> doFn, PTableType<U,V> ptype)
          Perform a secondary sort on the given PTable instance and then apply a DoFn to the resulting sorted data to yield an output PTable<U, V>.
static
<K,V1,V2,U,V>
PTable<U,V>
SecondarySort.sortAndApply(PTable<K,Pair<V1,V2>> input, DoFn<Pair<K,Iterable<Pair<V1,V2>>>,Pair<U,V>> doFn, PTableType<U,V> ptype, int numReducers)
          Perform a secondary sort on the given PTable instance and then apply a DoFn to the resulting sorted data to yield an output PTable<U, V>, using the given number of reducers.
static
<K,V> PTable<K,V>
Aggregate.top(PTable<K,V> ptable, int limit, boolean maximize)
          Selects the top N pairs from the given table, with sorting being performed on the values (i.e.
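
Aggregate.top(ptable, limit, maximize), like PTable.top and PTable.bottom, selects pairs by their value field rather than by key. A minimal plain-Java sketch of that selection follows. It is not Crunch code: the class TopByValueSketch, the requirement that values be Comparable, and the tie-breaking by input order are assumptions.

```java
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Map;

/** Plain-Java model of selecting the top or bottom N pairs by value (hypothetical, not Crunch code). */
final class TopByValueSketch {

    /** Hypothetical helper that builds one key-value pair. */
    static <K, V> Map.Entry<K, V> pair(K key, V value) {
        return new SimpleImmutableEntry<K, V>(key, value);
    }

    /**
     * Returns up to {@code limit} pairs ordered by value:
     * largest first when {@code maximize} is true, smallest first otherwise.
     */
    static <K, V extends Comparable<V>> List<Map.Entry<K, V>> top(
            List<Map.Entry<K, V>> pairs, int limit, boolean maximize) {
        Comparator<Map.Entry<K, V>> byValue = Map.Entry.comparingByValue();
        List<Map.Entry<K, V>> sorted = new ArrayList<>(pairs);
        // List.sort is stable, so pairs with equal values keep their input order.
        sorted.sort(maximize ? byValue.reversed() : byValue);
        int n = Math.min(Math.max(limit, 0), sorted.size());
        return new ArrayList<>(sorted.subList(0, n));
    }

    public static void main(String[] args) {
        List<Map.Entry<String, Integer>> pairs =
                Arrays.asList(pair("a", 3), pair("b", 1), pair("c", 2));
        System.out.println(top(pairs, 2, true));   // [a=3, c=2]
        System.out.println(top(pairs, 1, false));  // [b=1]
    }
}
```

Sorting the whole input is the simplest way to express the idea; a distributed implementation would more plausibly keep a bounded heap per partition, which this sketch does not attempt.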
 

Methods in org.apache.crunch.lib with parameters of type PTable
static
<K,U,V> PTable<K,TupleN>
Cogroup.cogroup(int numReducers, PTable<K,?> first, PTable<K,?>... rest)
          Co-groups an arbitrary number of PTable arguments with a user-specified degree of parallelism (i.e., the number of reducers). The largest table should come last in the ordering.
static
<K,U,V> PTable<K,Pair<Collection<U>,Collection<V>>>
Cogroup.cogroup(int numReducers, PTable<K,U> left, PTable<K,V> right)
          Co-groups the two PTable arguments with a user-specified degree of parallelism (i.e., the number of reducers).
static
<K,V1,V2,V3>
PTable<K,Tuple3.Collect<V1,V2,V3>>
Cogroup.cogroup(int numReducers, PTable<K,V1> first, PTable<K,V2> second, PTable<K,V3> third)
          Co-groups the three PTable arguments with a user-specified degree of parallelism (i.e., the number of reducers).
static
<K,V1,V2,V3,V4>
PTable<K,Tuple4.Collect<V1,V2,V3,V4>>
Cogroup.cogroup(int numReducers, PTable<K,V1> first, PTable<K,V2> second, PTable<K,V3> third, PTable<K,V4> fourth)
          Co-groups the four PTable arguments with a user-specified degree of parallelism (i.e., the number of reducers).
static
<K> PTable<K,TupleN>
Cogroup.cogroup(PTable<K,?> first, PTable<K,?>... rest)
          Co-groups an arbitrary number of PTable arguments.
static
<K,U,V> PTable<K,Pair<Collection<U>,Collection<V>>>
Cogroup.cogroup(PTable<K,U> left, PTable<K,V> right)
          Co-groups the two PTable arguments.
static
<K,V1,V2,V3>
PTable<K,Tuple3.Collect<V1,V2,V3>>
Cogroup.cogroup(PTable<K,V1> first, PTable<K,V2> second, PTable<K,V3> third)
          Co-groups the three PTable arguments.
static
<K,V1,V2,V3,V4>
PTable<K,Tuple4.Collect<V1,V2,V3,V4>>
Cogroup.cogroup(PTable<K,V1> first, PTable<K,V2> second, PTable<K,V3> third, PTable<K,V4> fourth)
          Co-groups the four PTable arguments.
static
<K,V> PTable<K,Collection<V>>
Aggregate.collectValues(PTable<K,V> collect)
           
static
<K1,K2,U,V>
PTable<Pair<K1,K2>,Pair<U,V>>
Cartesian.cross(PTable<K1,U> left, PTable<K2,V> right)
          Performs a full cross join on the specified PTables (using the same strategy as Pig's CROSS operator).
static
<K1,K2,U,V>
PTable<Pair<K1,K2>,Pair<U,V>>
Cartesian.cross(PTable<K1,U> left, PTable<K2,V> right, int parallelism)
          Performs a full cross join on the specified PTables (using the same strategy as Pig's CROSS operator).
static
<K,V> PTable<K,V>
Distinct.distinct(PTable<K,V> input)
          A PTable<K, V> analogue of the distinct function.
static
<K,V> PTable<K,V>
Distinct.distinct(PTable<K,V> input, int flushEvery)
          A PTable<K, V> analogue of the distinct function.
static
<K,U,V> PTable<K,Pair<U,V>>
Join.fullJoin(PTable<K,U> left, PTable<K,V> right)
          Performs a full outer join on the specified PTables.
static
<T,N extends Number>
PCollection<Pair<Integer,T>>
Sample.groupedWeightedReservoirSample(PTable<Integer,Pair<T,N>> input, int[] sampleSizes)
          The most general-purpose of the weighted reservoir sampling patterns, which allows us to choose a random sample of elements for each of N input groups.
static
<T,N extends Number>
PCollection<Pair<Integer,T>>
Sample.groupedWeightedReservoirSample(PTable<Integer,Pair<T,N>> input, int[] sampleSizes, Long seed)
          Same as the other groupedWeightedReservoirSample method, but includes a seed for testing purposes.
static
<K,U,V> PTable<K,Pair<U,V>>
Join.innerJoin(PTable<K,U> left, PTable<K,V> right)
          Performs an inner join on the specified PTables.
static
<K,U,V> PTable<K,Pair<U,V>>
Join.join(PTable<K,U> left, PTable<K,V> right)
          Performs an inner join on the specified PTables.
static
<K,V> PCollection<K>
PTables.keys(PTable<K,V> ptable)
          Extract the keys from the given PTable<K, V> as a PCollection<K>.
static
<K,U,V> PTable<K,Pair<U,V>>
Join.leftJoin(PTable<K,U> left, PTable<K,V> right)
          Performs a left outer join on the specified PTables.
static
<K1,V1,K2 extends org.apache.hadoop.io.Writable,V2 extends org.apache.hadoop.io.Writable>
PTable<K2,V2>
Mapreduce.map(PTable<K1,V1> input, Class<? extends org.apache.hadoop.mapreduce.Mapper<K1,V1,K2,V2>> mapperClass, Class<K2> keyClass, Class<V2> valueClass)
           
static
<K1,V1,K2 extends org.apache.hadoop.io.Writable,V2 extends org.apache.hadoop.io.Writable>
PTable<K2,V2>
Mapred.map(PTable<K1,V1> input, Class<? extends org.apache.hadoop.mapred.Mapper<K1,V1,K2,V2>> mapperClass, Class<K2> keyClass, Class<V2> valueClass)
           
static
<K1,K2,V> PTable<K2,V>
PTables.mapKeys(PTable<K1,V> ptable, MapFn<K1,K2> mapFn, PType<K2> ptype)
          Maps a PTable<K1, V> to a PTable<K2, V> using the given MapFn<K1, K2> on the keys of the PTable.
static
<K1,K2,V> PTable<K2,V>
PTables.mapKeys(String name, PTable<K1,V> ptable, MapFn<K1,K2> mapFn, PType<K2> ptype)
          Maps a PTable<K1, V> to a PTable<K2, V> using the given MapFn<K1, K2> on the keys of the PTable.
static
<K,U,V> PTable<K,V>
PTables.mapValues(PTable<K,U> ptable, MapFn<U,V> mapFn, PType<V> ptype)
          Maps a PTable<K, U> to a PTable<K, V> using the given MapFn<U, V> on the values of the PTable.
static
<K,U,V> PTable<K,V>
PTables.mapValues(String name, PTable<K,U> ptable, MapFn<U,V> mapFn, PType<V> ptype)
          Maps a PTable<K, U> to a PTable<K, V> using the given MapFn<U, V> on the values of the PTable.
static
<K,U,V> PTable<K,Pair<U,V>>
Join.rightJoin(PTable<K,U> left, PTable<K,V> right)
          Performs a right outer join on the specified PTables.
static
<K,V> PTable<K,V>
Sample.sample(PTable<K,V> input, double probability)
          A PTable<K, V> analogue of the sample function.
static
<K,V> PTable<K,V>
Sample.sample(PTable<K,V> input, Long seed, double probability)
          A PTable<K, V> analogue of the sample function, with the seed argument exposed for testing purposes.
static
<K,V> PTable<K,V>
Sort.sort(PTable<K,V> table)
          Sorts the PTable using the natural ordering of its keys in ascending order.
static
<K,V> PTable<K,V>
Sort.sort(PTable<K,V> table, int numReducers, Sort.Order key)
          Sorts the PTable using the natural ordering of its keys in the order specified with a client-specified number of reducers.
static
<K,V> PTable<K,V>
Sort.sort(PTable<K,V> table, Sort.Order key)
          Sorts the PTable using the natural ordering of its keys with the given Order.
static
<K,V1,V2,U,V>
PTable<U,V>
SecondarySort.sortAndApply(PTable<K,Pair<V1,V2>> input, DoFn<Pair<K,Iterable<Pair<V1,V2>>>,Pair<U,V>> doFn, PTableType<U,V> ptype)
          Perform a secondary sort on the given PTable instance and then apply a DoFn to the resulting sorted data to yield an output PTable<U, V>.
static
<K,V1,V2,U,V>
PTable<U,V>
SecondarySort.sortAndApply(PTable<K,Pair<V1,V2>> input, DoFn<Pair<K,Iterable<Pair<V1,V2>>>,Pair<U,V>> doFn, PTableType<U,V> ptype, int numReducers)
          Perform a secondary sort on the given PTable instance and then apply a DoFn to the resulting sorted data to yield an output PTable<U, V>, using the given number of reducers.
static
<K,V1,V2,T>
PCollection<T>
SecondarySort.sortAndApply(PTable<K,Pair<V1,V2>> input, DoFn<Pair<K,Iterable<Pair<V1,V2>>>,T> doFn, PType<T> ptype)
          Perform a secondary sort on the given PTable instance and then apply a DoFn to the resulting sorted data to yield an output PCollection<T>.
static
<K,V1,V2,T>
PCollection<T>
SecondarySort.sortAndApply(PTable<K,Pair<V1,V2>> input, DoFn<Pair<K,Iterable<Pair<V1,V2>>>,T> doFn, PType<T> ptype, int numReducers)
          Perform a secondary sort on the given PTable instance and then apply a DoFn to the resulting sorted data to yield an output PCollection<T>, using the given number of reducers.
static
<K,V> PTable<K,V>
Aggregate.top(PTable<K,V> ptable, int limit, boolean maximize)
          Selects the top N pairs from the given table, with sorting performed on the values.
static
<K,V> PCollection<V>
PTables.values(PTable<K,V> ptable)
          Extract the values from the given PTable<K, V> as a PCollection<V>.
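
As an illustration of the Join.rightJoin entry listed above, here is a minimal plain-Java sketch of what a right outer join produces. It does not use Crunch: the class name RightJoinSketch and the representation of a table as a list of Map.Entry pairs are hypothetical stand-ins for PTable and Pair, and the null left value for unmatched right rows is an assumption about the usual right-outer-join convention rather than something stated on this page.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Plain-Java model of a right outer join; NOT the Crunch implementation.
final class RightJoinSketch {

    // Each "table" is a list of (key, value) entries, so duplicate keys are allowed.
    static <K, U, V> List<Map.Entry<K, Map.Entry<U, V>>> rightJoin(
            List<Map.Entry<K, U>> left, List<Map.Entry<K, V>> right) {
        List<Map.Entry<K, Map.Entry<U, V>>> out = new ArrayList<>();
        for (Map.Entry<K, V> r : right) {
            boolean matched = false;
            for (Map.Entry<K, U> l : left) {
                if (l.getKey().equals(r.getKey())) {
                    // Matching keys: emit (key, (leftValue, rightValue)).
                    out.add(new SimpleEntry<K, Map.Entry<U, V>>(
                            r.getKey(), new SimpleEntry<U, V>(l.getValue(), r.getValue())));
                    matched = true;
                }
            }
            if (!matched) {
                // Assumption: an unmatched right row is kept with a null left value.
                out.add(new SimpleEntry<K, Map.Entry<U, V>>(
                        r.getKey(), new SimpleEntry<U, V>(null, r.getValue())));
            }
        }
        return out;
    }
}
```

Every row of the right table appears at least once in the result, while left rows without a matching right key are dropped. The nested loop is quadratic and only meant to show the semantics; a distributed join groups both inputs by key instead.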
 

Uses of PTable in org.apache.crunch.lib.join
 

Methods in org.apache.crunch.lib.join that return PTable
 PTable<K,Pair<U,V>> DefaultJoinStrategy.join(PTable<K,U> left, PTable<K,V> right, JoinFn<K,U,V> joinFn)
          Perform a default join on the given PTable instances using a user-specified JoinFn.
 PTable<K,Pair<U,V>> ShardedJoinStrategy.join(PTable<K,U> left, PTable<K,V> right, JoinType joinType)
           
 PTable<K,Pair<U,V>> MapsideJoinStrategy.join(PTable<K,U> left, PTable<K,V> right, JoinType joinType)
           
 PTable<K,Pair<U,V>> JoinStrategy.join(PTable<K,U> left, PTable<K,V> right, JoinType joinType)
          Join two tables with the given join type.
 PTable<K,Pair<U,V>> DefaultJoinStrategy.join(PTable<K,U> left, PTable<K,V> right, JoinType joinType)
           
 PTable<K,Pair<U,V>> BloomFilterJoinStrategy.join(PTable<K,U> left, PTable<K,V> right, JoinType joinType)
           
 

Methods in org.apache.crunch.lib.join with parameters of type PTable
 PTable<K,Pair<U,V>> DefaultJoinStrategy.join(PTable<K,U> left, PTable<K,V> right, JoinFn<K,U,V> joinFn)
          Perform a default join on the given PTable instances using a user-specified JoinFn.
 PTable<K,Pair<U,V>> ShardedJoinStrategy.join(PTable<K,U> left, PTable<K,V> right, JoinType joinType)
           
 PTable<K,Pair<U,V>> MapsideJoinStrategy.join(PTable<K,U> left, PTable<K,V> right, JoinType joinType)
           
 PTable<K,Pair<U,V>> JoinStrategy.join(PTable<K,U> left, PTable<K,V> right, JoinType joinType)
          Join two tables with the given join type.
 PTable<K,Pair<U,V>> DefaultJoinStrategy.join(PTable<K,U> left, PTable<K,V> right, JoinType joinType)
           
 PTable<K,Pair<U,V>> BloomFilterJoinStrategy.join(PTable<K,U> left, PTable<K,V> right, JoinType joinType)
           
static
<K,U,V,T> PCollection<T>
OneToManyJoin.oneToManyJoin(PTable<K,U> left, PTable<K,V> right, DoFn<Pair<U,Iterable<V>>,T> postProcessFn, PType<T> ptype)
          Performs a join on two tables, where the left table only contains a single value per key.
static
<K,U,V,T> PCollection<T>
OneToManyJoin.oneToManyJoin(PTable<K,U> left, PTable<K,V> right, DoFn<Pair<U,Iterable<V>>,T> postProcessFn, PType<T> ptype, int numReducers)
          Supports a user-specified number of reducers for the one-to-many join.
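
The one-to-many join described above (a left table with a single value per key, a right table with any number of values per key, and a post-processing function applied to each joined group) can be sketched in plain Java as follows. This is a model of the idea, not the Crunch code: the class name OneToManyJoinSketch, the use of a Map for the left table, and the BiFunction standing in for the DoFn are all hypothetical. The sketch also assumes that keys missing from either side produce no output, which this page does not specify.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BiFunction;

// Plain-Java model of a one-to-many join; NOT the Crunch implementation.
final class OneToManyJoinSketch {

    static <K, U, V, T> List<T> oneToManyJoin(
            Map<K, U> left,                       // exactly one value per key
            List<Map.Entry<K, V>> right,          // zero or more values per key
            BiFunction<U, List<V>, T> postProcess) {
        // Group the right-hand values by key, preserving first-seen key order.
        Map<K, List<V>> grouped = new LinkedHashMap<>();
        for (Map.Entry<K, V> r : right) {
            grouped.computeIfAbsent(r.getKey(), k -> new ArrayList<>()).add(r.getValue());
        }
        List<T> out = new ArrayList<>();
        for (Map.Entry<K, U> l : left.entrySet()) {
            List<V> values = grouped.get(l.getKey());
            // Assumption: only keys present on both sides produce output.
            if (values != null) {
                out.add(postProcess.apply(l.getValue(), values));
            }
        }
        return out;
    }
}
```

Because the left side holds one value per key, the post-processing function sees that value once, together with all of the right-hand values for the same key, rather than once per joined pair.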
 

Uses of PTable in org.apache.crunch.util
 

Methods in org.apache.crunch.util that return PTable
<K,V> PTable<K,V>
CrunchTool.read(TableSource<K,V> tableSource)
           
 



Copyright © 2014 The Apache Software Foundation. All Rights Reserved.