This project has been retired. For details, please refer to its Attic page.
PTableBase (Apache Crunch 0.10.0 API)

org.apache.crunch.impl.dist.collect
Class PTableBase<K,V>

java.lang.Object
  extended by org.apache.crunch.impl.dist.collect.PCollectionImpl<Pair<K,V>>
      extended by org.apache.crunch.impl.dist.collect.PTableBase<K,V>
All Implemented Interfaces:
PCollection<Pair<K,V>>, PTable<K,V>
Direct Known Subclasses:
BaseDoTable, BaseInputTable, BaseUnionTable, EmptyPTable

public abstract class PTableBase<K,V>
extends PCollectionImpl<Pair<K,V>>
implements PTable<K,V>


Nested Class Summary
 
Nested classes/interfaces inherited from class org.apache.crunch.impl.dist.collect.PCollectionImpl
PCollectionImpl.Visitor
 
Constructor Summary
PTableBase(String name, DistributedPipeline pipeline)
           
PTableBase(String name, DistributedPipeline pipeline, ParallelDoOptions options)
           
 
Method Summary
 PObject<Map<K,V>> asMap()
          Returns a PObject encapsulating a Map made up of the keys and values in this PTable.
 PTable<K,V> bottom(int count)
          Returns a PTable made up of the pairs in this PTable with the smallest value field.
 PTable<K,V> cache()
          Marks this data as cached using the default CachingOptions.
 PTable<K,V> cache(CachingOptions options)
          Marks this data as cached using the given CachingOptions.
<U> PTable<K,Pair<Collection<V>,Collection<U>>>
cogroup(PTable<K,U> other)
          Co-group operation with the given table on common keys.
 PTable<K,Collection<V>> collectValues()
          Aggregate all of the values with the same key into a single key-value pair in the returned PTable.
 PTable<K,V> filter(FilterFn<Pair<K,V>> filterFn)
          Apply the given filter function to this instance and return the resulting PCollection.
 PTable<K,V> filter(String name, FilterFn<Pair<K,V>> filterFn)
          Apply the given filter function to this instance and return the resulting PCollection.
 PType<K> getKeyType()
          Returns the PType of the key.
 PType<V> getValueType()
          Returns the PType of the value.
 BaseGroupedTable<K,V> groupByKey()
          Performs a grouping operation on the keys of this table.
 BaseGroupedTable<K,V> groupByKey(GroupingOptions groupingOptions)
          Performs a grouping operation on the keys of this table, using the additional GroupingOptions to control how the grouping is executed.
 BaseGroupedTable<K,V> groupByKey(int numReduceTasks)
          Performs a grouping operation on the keys of this table, using the given number of partitions.
<U> PTable<K,Pair<V,U>>
join(PTable<K,U> other)
          Perform an inner join on this table and the one passed in as an argument on their common keys.
 PCollection<K> keys()
          Returns a PCollection made up of the keys in this PTable.
<K2> PTable<K2,V>
mapKeys(MapFn<K,K2> mapFn, PType<K2> ptype)
          Returns a PTable that has the same values as this instance, but uses the given function to map the keys.
<K2> PTable<K2,V>
mapKeys(String name, MapFn<K,K2> mapFn, PType<K2> ptype)
          Returns a PTable that has the same values as this instance, but uses the given function to map the keys.
<U> PTable<K,U>
mapValues(MapFn<V,U> mapFn, PType<U> ptype)
          Returns a PTable that has the same keys as this instance, but uses the given function to map the values.
<U> PTable<K,U>
mapValues(String name, MapFn<V,U> mapFn, PType<U> ptype)
          Returns a PTable that has the same keys as this instance, but uses the given function to map the values.
 Map<K,V> materializeToMap()
          Returns a Map made up of the keys and values in this PTable.
 PTable<K,V> top(int count)
          Returns a PTable made up of the pairs in this PTable with the largest value field.
 PTable<K,V> union(PTable<K,V>... others)
          Returns a PTable instance that acts as the union of this PTable and the input PTables.
 PTable<K,V> union(PTable<K,V> other)
          Returns a PTable instance that acts as the union of this PTable and the other PTable.
 PCollection<V> values()
          Returns a PCollection made up of the values in this PTable.
 PTable<K,V> write(Target target)
          Write the contents of this PCollection to the given Target, using the storage format specified by the target.
 PTable<K,V> write(Target target, Target.WriteMode writeMode)
          Write the contents of this PCollection to the given Target, using the given Target.WriteMode to handle existing targets.
 
Methods inherited from class org.apache.crunch.impl.dist.collect.PCollectionImpl
accept, aggregate, asCollection, asReadable, by, by, count, first, getDepth, getLastModifiedAt, getMaterializedAt, getName, getOnlyParent, getParallelDoOptions, getParents, getPipeline, getSize, getTargetDependencies, getTypeFamily, isBreakpoint, length, materialize, materializeAt, max, min, parallelDo, parallelDo, parallelDo, parallelDo, parallelDo, parallelDo, setBreakpoint, toString, union, union
 
Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, wait, wait, wait
 
Methods inherited from interface org.apache.crunch.PTable
getPTableType
 
Methods inherited from interface org.apache.crunch.PCollection
aggregate, asCollection, asReadable, by, by, count, first, getName, getPipeline, getPType, getSize, getTypeFamily, length, materialize, max, min, parallelDo, parallelDo, parallelDo, parallelDo, parallelDo, parallelDo, union, union
 

Constructor Detail

PTableBase

public PTableBase(String name,
                  DistributedPipeline pipeline)

PTableBase

public PTableBase(String name,
                  DistributedPipeline pipeline,
                  ParallelDoOptions options)

Method Detail

getKeyType

public PType<K> getKeyType()
Description copied from interface: PTable
Returns the PType of the key.

Specified by:
getKeyType in interface PTable<K,V>

getValueType

public PType<V> getValueType()
Description copied from interface: PTable
Returns the PType of the value.

Specified by:
getValueType in interface PTable<K,V>

groupByKey

public BaseGroupedTable<K,V> groupByKey()
Description copied from interface: PTable
Performs a grouping operation on the keys of this table.

Specified by:
groupByKey in interface PTable<K,V>
Returns:
a PGroupedTable instance that represents the grouping

groupByKey

public BaseGroupedTable<K,V> groupByKey(int numReduceTasks)
Description copied from interface: PTable
Performs a grouping operation on the keys of this table, using the given number of partitions.

Specified by:
groupByKey in interface PTable<K,V>
Parameters:
numReduceTasks - The number of partitions for the data.
Returns:
a PGroupedTable instance that represents this grouping

groupByKey

public BaseGroupedTable<K,V> groupByKey(GroupingOptions groupingOptions)
Description copied from interface: PTable
Performs a grouping operation on the keys of this table, using the additional GroupingOptions to control how the grouping is executed.

Specified by:
groupByKey in interface PTable<K,V>
Parameters:
groupingOptions - The grouping options to use
Returns:
a PGroupedTable instance that represents the grouping
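
To make the role of numReduceTasks more concrete, the following plain-Java sketch models grouping over an in-memory list of pairs; it does not call Crunch. It assumes keys are assigned to partitions by hash code modulo the partition count, which is the rule Hadoop's default HashPartitioner applies; a partitioner supplied through GroupingOptions may behave differently. The names GroupByKeySketch, partitionFor, and group are hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class GroupByKeySketch {
    // Assumption: hash-modulo assignment, as in Hadoop's default HashPartitioner.
    static int partitionFor(Object key, int numPartitions) {
        return (key.hashCode() & Integer.MAX_VALUE) % numPartitions;
    }

    // Groups values by key inside each partition; a key never spans two partitions.
    static <K, V> List<Map<K, List<V>>> group(List<Map.Entry<K, V>> pairs, int numPartitions) {
        List<Map<K, List<V>>> partitions = new ArrayList<>();
        for (int i = 0; i < numPartitions; i++) {
            partitions.add(new LinkedHashMap<>());
        }
        for (Map.Entry<K, V> pair : pairs) {
            Map<K, List<V>> partition = partitions.get(partitionFor(pair.getKey(), numPartitions));
            partition.computeIfAbsent(pair.getKey(), k -> new ArrayList<>()).add(pair.getValue());
        }
        return partitions;
    }

    public static void main(String[] args) {
        List<Map.Entry<String, Integer>> pairs = List.of(
                Map.entry("a", 1), Map.entry("b", 2), Map.entry("a", 3));
        System.out.println(group(pairs, 2));
    }
}
```

In this model, every value for a given key ends up in the same partition, which is the property a reducer relies on.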

union

public PTable<K,V> union(PTable<K,V> other)
Description copied from interface: PTable
Returns a PTable instance that acts as the union of this PTable and the other PTable.

Specified by:
union in interface PTable<K,V>

union

public PTable<K,V> union(PTable<K,V>... others)
Description copied from interface: PTable
Returns a PTable instance that acts as the union of this PTable and the input PTables.

Specified by:
union in interface PTable<K,V>
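
As a rough illustration of the union semantics, the sketch below models a table as an in-memory list of pairs and concatenates the inputs. It assumes union keeps duplicate pairs rather than de-duplicating them; check this against the Crunch version in use. UnionSketch is a hypothetical name and the code does not call Crunch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

final class UnionSketch {
    // Assumption: union is a concatenation, so repeated pairs are preserved.
    @SafeVarargs
    static <K, V> List<Map.Entry<K, V>> union(List<Map.Entry<K, V>> first,
                                              List<Map.Entry<K, V>>... others) {
        List<Map.Entry<K, V>> result = new ArrayList<>(first);
        for (List<Map.Entry<K, V>> other : others) {
            result.addAll(other);
        }
        return result;
    }

    public static void main(String[] args) {
        List<Map.Entry<String, Integer>> left = List.of(Map.entry("a", 1));
        List<Map.Entry<String, Integer>> right = List.of(Map.entry("a", 1), Map.entry("b", 2));
        System.out.println(union(left, right));
    }
}
```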

write

public PTable<K,V> write(Target target)
Description copied from interface: PCollection
Write the contents of this PCollection to the given Target, using the storage format specified by the target.

Specified by:
write in interface PCollection<Pair<K,V>>
Specified by:
write in interface PTable<K,V>
Overrides:
write in class PCollectionImpl<Pair<K,V>>
Parameters:
target - The target to write to

write

public PTable<K,V> write(Target target,
                         Target.WriteMode writeMode)
Description copied from interface: PCollection
Write the contents of this PCollection to the given Target, using the given Target.WriteMode to handle existing targets.

Specified by:
write in interface PCollection<Pair<K,V>>
Specified by:
write in interface PTable<K,V>
Overrides:
write in class PCollectionImpl<Pair<K,V>>
Parameters:
target - The target
writeMode - The rule for handling existing outputs at the target location

cache

public PTable<K,V> cache()
Description copied from interface: PCollection
Marks this data as cached using the default CachingOptions. Cached PCollections will only be processed once, and then their contents will be saved so that downstream code can process them many times.

Specified by:
cache in interface PCollection<Pair<K,V>>
Specified by:
cache in interface PTable<K,V>
Overrides:
cache in class PCollectionImpl<Pair<K,V>>
Returns:
this PCollection instance

cache

public PTable<K,V> cache(CachingOptions options)
Description copied from interface: PCollection
Marks this data as cached using the given CachingOptions. Cached PCollections will only be processed once, and then their contents will be saved so that downstream code can process them many times.

Specified by:
cache in interface PCollection<Pair<K,V>>
Specified by:
cache in interface PTable<K,V>
Overrides:
cache in class PCollectionImpl<Pair<K,V>>
Parameters:
options - the options that control the cache settings for the data
Returns:
this PCollection instance
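
The "processed once, then reused" behavior described for both cache methods can be loosely pictured as memoization. The sketch below is a single-JVM analogy only, not how Crunch stores cached data; CacheSketch is a hypothetical class and the expensive computation is simulated with a counter.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

final class CacheSketch<T> implements Supplier<T> {
    private final Supplier<T> compute;
    private T value;
    private boolean computed;

    CacheSketch(Supplier<T> compute) {
        this.compute = compute;
    }

    // Runs the wrapped computation on the first call and replays the saved result afterwards.
    @Override
    public synchronized T get() {
        if (!computed) {
            value = compute.get();
            computed = true;
        }
        return value;
    }

    public static void main(String[] args) {
        AtomicInteger runs = new AtomicInteger();
        CacheSketch<List<Integer>> cached = new CacheSketch<>(() -> {
            runs.incrementAndGet();
            return List.of(1, 2, 3);
        });
        cached.get();
        cached.get();
        System.out.println("computations: " + runs.get());
    }
}
```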

filter

public PTable<K,V> filter(FilterFn<Pair<K,V>> filterFn)
Description copied from interface: PCollection
Apply the given filter function to this instance and return the resulting PCollection.

Specified by:
filter in interface PCollection<Pair<K,V>>
Specified by:
filter in interface PTable<K,V>
Overrides:
filter in class PCollectionImpl<Pair<K,V>>

filter

public PTable<K,V> filter(String name,
                          FilterFn<Pair<K,V>> filterFn)
Description copied from interface: PCollection
Apply the given filter function to this instance and return the resulting PCollection.

Specified by:
filter in interface PCollection<Pair<K,V>>
Specified by:
filter in interface PTable<K,V>
Overrides:
filter in class PCollectionImpl<Pair<K,V>>
Parameters:
name - An identifier for this processing step
filterFn - The FilterFn to apply
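
For readers who want a feel for what a filter over pairs does, here is a minimal plain-Java model. It stands in for the FilterFn with a java.util.function.Predicate and works on an in-memory list, so it is a sketch of the semantics rather than Crunch code; FilterSketch is a hypothetical name.

```java
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;
import java.util.stream.Collectors;

final class FilterSketch {
    // Keeps only the pairs the predicate accepts; keys and values are left unchanged.
    static <K, V> List<Map.Entry<K, V>> filter(List<Map.Entry<K, V>> pairs,
                                               Predicate<Map.Entry<K, V>> accept) {
        return pairs.stream().filter(accept).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Map.Entry<String, Integer>> pairs = List.of(
                Map.entry("a", 1), Map.entry("b", 2), Map.entry("c", 3));
        System.out.println(filter(pairs, p -> p.getValue() >= 2));
    }
}
```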

mapValues

public <U> PTable<K,U> mapValues(MapFn<V,U> mapFn,
                                 PType<U> ptype)
Description copied from interface: PTable
Returns a PTable that has the same keys as this instance, but uses the given function to map the values.

Specified by:
mapValues in interface PTable<K,V>

mapValues

public <U> PTable<K,U> mapValues(String name,
                                 MapFn<V,U> mapFn,
                                 PType<U> ptype)
Description copied from interface: PTable
Returns a PTable that has the same keys as this instance, but uses the given function to map the values.

Specified by:
mapValues in interface PTable<K,V>
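
The effect of mapValues can be sketched in plain Java as follows. The model uses an in-memory list of pairs and a java.util.function.Function in place of the MapFn, and it ignores the PType argument, which has no equivalent here; MapValuesSketch is a hypothetical name.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

final class MapValuesSketch {
    // Applies fn to each value while carrying the key through untouched.
    static <K, V, U> List<Map.Entry<K, U>> mapValues(List<Map.Entry<K, V>> pairs,
                                                     Function<V, U> fn) {
        List<Map.Entry<K, U>> out = new ArrayList<>();
        for (Map.Entry<K, V> pair : pairs) {
            out.add(Map.entry(pair.getKey(), fn.apply(pair.getValue())));
        }
        return out;
    }

    public static void main(String[] args) {
        List<Map.Entry<String, Integer>> pairs = List.of(Map.entry("a", 1), Map.entry("b", 2));
        List<Map.Entry<String, String>> labelled = mapValues(pairs, v -> "n" + v);
        System.out.println(labelled);
    }
}
```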

mapKeys

public <K2> PTable<K2,V> mapKeys(MapFn<K,K2> mapFn,
                                 PType<K2> ptype)
Description copied from interface: PTable
Returns a PTable that has the same values as this instance, but uses the given function to map the keys.

Specified by:
mapKeys in interface PTable<K,V>

mapKeys

public <K2> PTable<K2,V> mapKeys(String name,
                                 MapFn<K,K2> mapFn,
                                 PType<K2> ptype)
Description copied from interface: PTable
Returns a PTable that has the same values as this instance, but uses the given function to map the keys.

Specified by:
mapKeys in interface PTable<K,V>
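
One consequence of mapKeys worth illustrating is that distinct keys may map to the same new key. Because a PTable is a multi-map, the plain-Java model below assumes both pairs survive rather than one replacing the other. MapKeysSketch is a hypothetical name, and the PType argument is omitted from the model.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

final class MapKeysSketch {
    // Applies fn to each key; colliding keys simply yield several pairs with the same key.
    static <K, K2, V> List<Map.Entry<K2, V>> mapKeys(List<Map.Entry<K, V>> pairs,
                                                     Function<K, K2> fn) {
        List<Map.Entry<K2, V>> out = new ArrayList<>();
        for (Map.Entry<K, V> pair : pairs) {
            out.add(Map.entry(fn.apply(pair.getKey()), pair.getValue()));
        }
        return out;
    }

    public static void main(String[] args) {
        List<Map.Entry<String, Integer>> pairs = List.of(Map.entry("Apple", 1), Map.entry("apple", 2));
        List<Map.Entry<String, Integer>> lowered = mapKeys(pairs, k -> k.toLowerCase());
        System.out.println(lowered);
    }
}
```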

top

public PTable<K,V> top(int count)
Description copied from interface: PTable
Returns a PTable made up of the pairs in this PTable with the largest value field.

Specified by:
top in interface PTable<K,V>
Parameters:
count - The number of pairs to return

bottom

public PTable<K,V> bottom(int count)
Description copied from interface: PTable
Returns a PTable made up of the pairs in this PTable with the smallest value field.

Specified by:
bottom in interface PTable<K,V>
Parameters:
count - The number of pairs to return
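
The top and bottom operations select pairs by their value field. A minimal plain-Java sketch of that selection, assuming values are naturally comparable and leaving the treatment of ties unspecified, might look like this; TopBottomSketch is a hypothetical name and the code works on an in-memory list instead of a PTable.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

final class TopBottomSketch {
    // The count pairs with the largest values, largest first.
    static <K, V extends Comparable<V>> List<Map.Entry<K, V>> top(
            List<Map.Entry<K, V>> pairs, int count) {
        return pairs.stream()
                .sorted(Map.Entry.<K, V>comparingByValue().reversed())
                .limit(count)
                .collect(Collectors.toList());
    }

    // The count pairs with the smallest values, smallest first.
    static <K, V extends Comparable<V>> List<Map.Entry<K, V>> bottom(
            List<Map.Entry<K, V>> pairs, int count) {
        return pairs.stream()
                .sorted(Map.Entry.<K, V>comparingByValue())
                .limit(count)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Map.Entry<String, Integer>> pairs = List.of(
                Map.entry("a", 3), Map.entry("b", 1), Map.entry("c", 2));
        System.out.println(top(pairs, 2));
        System.out.println(bottom(pairs, 1));
    }
}
```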

collectValues

public PTable<K,Collection<V>> collectValues()
Description copied from interface: PTable
Aggregate all of the values with the same key into a single key-value pair in the returned PTable.

Specified by:
collectValues in interface PTable<K,V>
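
To show what "a single key-value pair" per key means, the sketch below folds an in-memory list of pairs into one collection of values per key. It is a plain-Java model with a hypothetical name (CollectValuesSketch); the order of values within a key is an artifact of the model and is not something the description above promises.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class CollectValuesSketch {
    // Produces exactly one entry per distinct key, holding every value seen for that key.
    static <K, V> Map<K, List<V>> collectValues(List<Map.Entry<K, V>> pairs) {
        Map<K, List<V>> out = new LinkedHashMap<>();
        for (Map.Entry<K, V> pair : pairs) {
            out.computeIfAbsent(pair.getKey(), k -> new ArrayList<>()).add(pair.getValue());
        }
        return out;
    }

    public static void main(String[] args) {
        List<Map.Entry<String, Integer>> pairs = List.of(
                Map.entry("a", 1), Map.entry("b", 2), Map.entry("a", 3));
        System.out.println(collectValues(pairs));
    }
}
```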

join

public <U> PTable<K,Pair<V,U>> join(PTable<K,U> other)
Description copied from interface: PTable
Perform an inner join on this table and the one passed in as an argument on their common keys.

Specified by:
join in interface PTable<K,V>
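
The inner-join semantics can be illustrated with a small in-memory model: only keys present in both tables produce output, and a key with several values on either side yields every combination. This is a sketch of the semantics in plain Java, not the distributed implementation; JoinSketch is a hypothetical name and Map.Entry stands in for Pair.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class JoinSketch {
    static <K, V, U> List<Map.Entry<K, Map.Entry<V, U>>> join(
            List<Map.Entry<K, V>> left, List<Map.Entry<K, U>> right) {
        // Index the right-hand table by key.
        Map<K, List<U>> rightByKey = new HashMap<>();
        for (Map.Entry<K, U> r : right) {
            rightByKey.computeIfAbsent(r.getKey(), k -> new ArrayList<>()).add(r.getValue());
        }
        // Emit one output pair per matching (left value, right value) combination.
        List<Map.Entry<K, Map.Entry<V, U>>> out = new ArrayList<>();
        for (Map.Entry<K, V> l : left) {
            List<U> matches = rightByKey.get(l.getKey());
            if (matches == null) {
                continue; // inner join: unmatched left keys are dropped
            }
            for (U u : matches) {
                out.add(Map.entry(l.getKey(), Map.entry(l.getValue(), u)));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<Map.Entry<String, Integer>> left = List.of(
                Map.entry("a", 1), Map.entry("b", 2), Map.entry("a", 3));
        List<Map.Entry<String, String>> right = List.of(Map.entry("a", "x"), Map.entry("c", "y"));
        System.out.println(join(left, right));
    }
}
```

In the example, "b" and "c" each appear in only one table, so neither contributes to the result.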

cogroup

public <U> PTable<K,Pair<Collection<V>,Collection<U>>> cogroup(PTable<K,U> other)
Description copied from interface: PTable
Co-group operation with the given table on common keys.

Specified by:
cogroup in interface PTable<K,V>
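
The return type above pairs a collection of this table's values with a collection of the other table's values for each key. The plain-Java sketch below models that shape. It assumes that a key found in only one table is still emitted, with an empty collection on the missing side; that assumption is not stated in the description above and should be verified against the Crunch version in use. CogroupSketch and emptySides are hypothetical names, and Map.Entry stands in for Pair.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class CogroupSketch {
    static <K, V, U> Map<K, Map.Entry<List<V>, List<U>>> cogroup(
            List<Map.Entry<K, V>> left, List<Map.Entry<K, U>> right) {
        Map<K, Map.Entry<List<V>, List<U>>> out = new LinkedHashMap<>();
        for (Map.Entry<K, V> l : left) {
            emptySides(out, l.getKey()).getKey().add(l.getValue());
        }
        for (Map.Entry<K, U> r : right) {
            emptySides(out, r.getKey()).getValue().add(r.getValue());
        }
        return out;
    }

    // Returns the (left values, right values) holder for key, creating empty lists on first use.
    private static <K, V, U> Map.Entry<List<V>, List<U>> emptySides(
            Map<K, Map.Entry<List<V>, List<U>>> out, K key) {
        return out.computeIfAbsent(key,
                k -> Map.<List<V>, List<U>>entry(new ArrayList<>(), new ArrayList<>()));
    }

    public static void main(String[] args) {
        List<Map.Entry<String, Integer>> left = List.of(Map.entry("a", 1), Map.entry("b", 2));
        List<Map.Entry<String, String>> right = List.of(Map.entry("a", "x"), Map.entry("c", "y"));
        System.out.println(cogroup(left, right));
    }
}
```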

keys

public PCollection<K> keys()
Description copied from interface: PTable
Returns a PCollection made up of the keys in this PTable.

Specified by:
keys in interface PTable<K,V>

values

public PCollection<V> values()
Description copied from interface: PTable
Returns a PCollection made up of the values in this PTable.

Specified by:
values in interface PTable<K,V>

materializeToMap

public Map<K,V> materializeToMap()
Returns a Map made up of the keys and values in this PTable.

Specified by:
materializeToMap in interface PTable<K,V>

asMap

public PObject<Map<K,V>> asMap()
Returns a PObject encapsulating a Map made up of the keys and values in this PTable.

Note: The contents of the returned map may not be exactly the same as this PTable, because a PTable is a multi-map (i.e., it can contain multiple values for a single key).

Specified by:
asMap in interface PTable<K,V>
Returns:
The PObject encapsulating a Map made up of the keys and values in this PTable.
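
The note about multi-maps can be made concrete with a small plain-Java model: collapsing pairs into a Map keeps only one value per key, so information is lost whenever a key repeats. In this sketch the later pair overwrites the earlier one, which is purely a property of the model; which value Crunch retains is not specified above. AsMapSketch is a hypothetical name.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class AsMapSketch {
    // Collapses a multi-map (list of pairs) into a Map with a single value per key.
    static <K, V> Map<K, V> toMap(List<Map.Entry<K, V>> pairs) {
        Map<K, V> out = new LinkedHashMap<>();
        for (Map.Entry<K, V> pair : pairs) {
            out.put(pair.getKey(), pair.getValue()); // model-only choice: last value wins
        }
        return out;
    }

    public static void main(String[] args) {
        List<Map.Entry<String, Integer>> pairs = List.of(
                Map.entry("a", 1), Map.entry("b", 2), Map.entry("a", 3));
        Map<String, Integer> collapsed = toMap(pairs);
        System.out.println(pairs.size() + " pairs became " + collapsed.size() + " map entries");
    }
}
```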


Copyright © 2014 The Apache Software Foundation. All Rights Reserved.