This project has retired. For details please refer to its Attic page.
BaseGroupedTable (Apache Crunch 0.11.0 API)

org.apache.crunch.impl.dist.collect
Class BaseGroupedTable<K,V>

java.lang.Object
  extended by org.apache.crunch.impl.dist.collect.PCollectionImpl<Pair<K,Iterable<V>>>
      extended by org.apache.crunch.impl.dist.collect.BaseGroupedTable<K,V>
All Implemented Interfaces:
PCollection<Pair<K,Iterable<V>>>, PGroupedTable<K,V>
Direct Known Subclasses:
PGroupedTableImpl

public class BaseGroupedTable<K,V>
extends PCollectionImpl<Pair<K,Iterable<V>>>
implements PGroupedTable<K,V>


Nested Class Summary
 
Nested classes/interfaces inherited from class org.apache.crunch.impl.dist.collect.PCollectionImpl
PCollectionImpl.Visitor
 
Method Summary
 PTable<K,V> combineValues(Aggregator<V> agg)
          Combine the values in each group using the given Aggregator.
 PTable<K,V> combineValues(Aggregator<V> combineAgg, Aggregator<V> reduceAgg)
          Combines and reduces the values in each group using the given Aggregator instances.
 PTable<K,V> combineValues(CombineFn<K,V> combineFn)
          Combines the values of this grouping using the given CombineFn.
 PTable<K,V> combineValues(CombineFn<K,V> combineFn, CombineFn<K,V> reduceFn)
          Combines and reduces the values of this grouping using the given CombineFn instances.
 PGroupedTableType<K,V> getGroupedTableType()
          Return the PGroupedTableType containing serialization information for this PGroupedTable.
 long getLastModifiedAt()
          The time of the most recent modification to one of the input sources to the collection.
 List<PCollectionImpl<?>> getParents()
           
 PType<Pair<K,Iterable<V>>> getPType()
          Returns the PType of this PCollection.
 Set<Target> getTargetDependencies()
           
 <U> PTable<K,U> mapValues(MapFn<Iterable<V>,U> mapFn, PType<U> ptype)
          Maps the Iterable<V> elements of each record to a new type.
 <U> PTable<K,U> mapValues(String name, MapFn<Iterable<V>,U> mapFn, PType<U> ptype)
          Maps the Iterable<V> elements of each record to a new type.
 PTable<K,V> ungroup()
          Convert this grouping back into a multimap.
 
Methods inherited from class org.apache.crunch.impl.dist.collect.PCollectionImpl
accept, aggregate, asCollection, asReadable, by, by, cache, cache, count, filter, filter, first, getDepth, getMaterializedAt, getName, getOnlyParent, getParallelDoOptions, getPipeline, getSize, getTypeFamily, isBreakpoint, length, materialize, materializeAt, max, min, parallelDo, parallelDo, parallelDo, parallelDo, parallelDo, parallelDo, sequentialDo, setBreakpoint, toString, union, union, write, write
 
Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, wait, wait, wait
 
Methods inherited from interface org.apache.crunch.PCollection
aggregate, asCollection, asReadable, by, by, cache, cache, count, filter, filter, first, getName, getPipeline, getSize, getTypeFamily, length, materialize, max, min, parallelDo, parallelDo, parallelDo, parallelDo, parallelDo, parallelDo, sequentialDo, union, union, write, write
 

Method Detail

getPType

public PType<Pair<K,Iterable<V>>> getPType()
Description copied from interface: PCollection
Returns the PType of this PCollection.

Specified by:
getPType in interface PCollection<Pair<K,Iterable<V>>>

combineValues

public PTable<K,V> combineValues(CombineFn<K,V> combineFn,
                                 CombineFn<K,V> reduceFn)
Description copied from interface: PGroupedTable
Combines and reduces the values of this grouping using the given CombineFn instances.

Specified by:
combineValues in interface PGroupedTable<K,V>
Parameters:
combineFn - The combiner function during the combine phase
reduceFn - The combiner function during the reduce phase
Returns:
A PTable where each key has a single value

combineValues

public PTable<K,V> combineValues(CombineFn<K,V> combineFn)
Description copied from interface: PGroupedTable
Combines the values of this grouping using the given CombineFn.

Specified by:
combineValues in interface PGroupedTable<K,V>
Parameters:
combineFn - The combiner function
Returns:
A PTable where each key has a single value

combineValues

public PTable<K,V> combineValues(Aggregator<V> agg)
Description copied from interface: PGroupedTable
Combine the values in each group using the given Aggregator.

Specified by:
combineValues in interface PGroupedTable<K,V>
Parameters:
agg - The function to use
Returns:
A PTable where each group key maps to an aggregated value. Group keys may be repeated if an aggregator returns more than one value.
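
The note about repeated group keys can be illustrated with a plain-Java model. This is only a sketch of the semantics, not Crunch code: SimpleAggregator, TopTwo, and the Map-based grouping are hypothetical stand-ins, and the reset/update/results shape is assumed here for illustration.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Plain-Java model of combining grouped values; it does not use the Crunch classes.
class CombineValuesSketch {

    // Hypothetical stand-in for an aggregator: consumes values, then reports results.
    interface SimpleAggregator<V> {
        void reset();
        void update(V value);
        Iterable<V> results();
    }

    // Keeps the two largest integers seen, so it can report more than one result.
    static class TopTwo implements SimpleAggregator<Integer> {
        private final List<Integer> seen = new ArrayList<>();

        public void reset() { seen.clear(); }

        public void update(Integer value) { seen.add(value); }

        public Iterable<Integer> results() {
            List<Integer> sorted = new ArrayList<>(seen);
            sorted.sort(Collections.reverseOrder());
            return sorted.subList(0, Math.min(2, sorted.size()));
        }
    }

    // Emits one (key, value) pair per aggregator result, so a key may appear more than once.
    static <K, V> List<Map.Entry<K, V>> combineValues(
            Map<K, List<V>> groups, SimpleAggregator<V> agg) {
        List<Map.Entry<K, V>> out = new ArrayList<>();
        for (Map.Entry<K, List<V>> group : groups.entrySet()) {
            agg.reset();
            for (V v : group.getValue()) {
                agg.update(v);
            }
            for (V result : agg.results()) {
                out.add(new AbstractMap.SimpleImmutableEntry<>(group.getKey(), result));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, List<Integer>> groups = new LinkedHashMap<>();
        groups.put("a", Arrays.asList(3, 9, 5));
        groups.put("b", Arrays.asList(7));
        // Key "a" appears twice in the output because TopTwo reports two results for it.
        System.out.println(combineValues(groups, new TopTwo()));
    }
}
```

With an aggregator that reports exactly one result per group (a sum, for instance), the same loop would produce one pair per key.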

combineValues

public PTable<K,V> combineValues(Aggregator<V> combineAgg,
                                 Aggregator<V> reduceAgg)
Description copied from interface: PGroupedTable
Combines and reduces the values in each group using the given Aggregator instances.

Specified by:
combineValues in interface PGroupedTable<K,V>
Parameters:
combineAgg - The aggregator to use during the combine phase
reduceAgg - The aggregator to use during the reduce phase
Returns:
A PTable where each group key maps to an aggregated value. Group keys may be repeated if an aggregator returns more than one value.
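
The split between a combine phase and a reduce phase can be modeled in plain Java as follows. This is a hedged sketch rather than Crunch code: the partition lists stand in for values of one key seen by separate tasks, and fold and combineThenReduce are hypothetical helper names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.BinaryOperator;

// Plain-Java model of a two-phase aggregation for the values of a single key.
class TwoPhaseCombineSketch {

    // Folds a non-empty list of values into one value with the given operator.
    static <V> V fold(List<V> values, BinaryOperator<V> op) {
        V acc = values.get(0);
        for (V v : values.subList(1, values.size())) {
            acc = op.apply(acc, v);
        }
        return acc;
    }

    // The combine operator runs once per partition; the reduce operator merges the partial results.
    static <V> V combineThenReduce(List<List<V>> partitions,
                                   BinaryOperator<V> combineOp,
                                   BinaryOperator<V> reduceOp) {
        List<V> partials = new ArrayList<>();
        for (List<V> partition : partitions) {
            partials.add(fold(partition, combineOp));
        }
        return fold(partials, reduceOp);
    }

    public static void main(String[] args) {
        // Values for one key, split across two hypothetical tasks.
        List<List<Integer>> partitions = Arrays.asList(
                Arrays.asList(1, 2, 3), Arrays.asList(10, 20));
        // Summing in both phases gives the same total as a single pass over all values.
        System.out.println(combineThenReduce(partitions, Integer::sum, Integer::sum));
    }
}
```

Using the same operator in both phases matches the single-argument overload in spirit; passing different operators changes how the partial results are merged.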

ungroup

public PTable<K,V> ungroup()
Description copied from interface: PGroupedTable
Convert this grouping back into a multimap.

Specified by:
ungroup in interface PGroupedTable<K,V>
Returns:
An ungrouped version of the data in this PGroupedTable.
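
As a rough picture of what "back into a multimap" means, the following plain-Java sketch flattens grouped values into one pair per value. It models the semantics only; the Map-based grouping and the UngroupSketch class are assumptions made for illustration and are not part of Crunch.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Plain-Java model of ungrouping: (key, [v1, v2, ...]) becomes (key, v1), (key, v2), ...
class UngroupSketch {

    static <K, V> List<Map.Entry<K, V>> ungroup(Map<K, List<V>> groups) {
        List<Map.Entry<K, V>> out = new ArrayList<>();
        for (Map.Entry<K, List<V>> group : groups.entrySet()) {
            for (V v : group.getValue()) {
                out.add(new AbstractMap.SimpleImmutableEntry<>(group.getKey(), v));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, List<Integer>> groups = new LinkedHashMap<>();
        groups.put("a", Arrays.asList(1, 2));
        groups.put("b", Arrays.asList(3));
        // One output pair per grouped value; keys repeat as needed.
        System.out.println(ungroup(groups));
    }
}
```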

mapValues

public <U> PTable<K,U> mapValues(MapFn<Iterable<V>,U> mapFn,
                                 PType<U> ptype)
Description copied from interface: PGroupedTable
Maps the Iterable<V> elements of each record to a new type. Just like any parallelDo operation on a PGroupedTable, this may only be called once.

Specified by:
mapValues in interface PGroupedTable<K,V>
Parameters:
mapFn - The mapping function
ptype - The serialization information for the returned data
Returns:
A new PTable instance

mapValues

public <U> PTable<K,U> mapValues(String name,
                                 MapFn<Iterable<V>,U> mapFn,
                                 PType<U> ptype)
Description copied from interface: PGroupedTable
Maps the Iterable<V> elements of each record to a new type. Just like any parallelDo operation on a PGroupedTable, this may only be called once.

Specified by:
mapValues in interface PGroupedTable<K,V>
Parameters:
name - A name for this operation
mapFn - The mapping function
ptype - The serialization information for the returned data
Returns:
A new PTable instance
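
The effect of mapping each group's Iterable<V> to a single value of a new type can be sketched in plain Java. This is a model of the semantics under stated assumptions, not the Crunch implementation: a java.util.function.Function stands in for the mapping function, no serialization type is modeled, and MapValuesSketch is a hypothetical name.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Plain-Java model of mapping each group's values to one result per key.
class MapValuesSketch {

    static <K, V, U> Map<K, U> mapValues(Map<K, List<V>> groups,
                                         Function<Iterable<V>, U> mapFn) {
        Map<K, U> out = new LinkedHashMap<>();
        for (Map.Entry<K, List<V>> group : groups.entrySet()) {
            // The function sees all values of the group at once and returns a single result.
            out.put(group.getKey(), mapFn.apply(group.getValue()));
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, List<String>> groups = new LinkedHashMap<>();
        groups.put("a", Arrays.asList("x", "y"));
        groups.put("b", Arrays.asList("z"));
        // Join each group's strings into one string per key.
        Map<String, String> joined = mapValues(groups, values -> String.join("+", values));
        System.out.println(joined);
    }
}
```

Unlike this in-memory model, where a List can be traversed repeatedly, the description above states that the operation may only be called once on a PGroupedTable.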

getGroupedTableType

public PGroupedTableType<K,V> getGroupedTableType()
Description copied from interface: PGroupedTable
Return the PGroupedTableType containing serialization information for this PGroupedTable.

Specified by:
getGroupedTableType in interface PGroupedTable<K,V>

getTargetDependencies

public Set<Target> getTargetDependencies()
Overrides:
getTargetDependencies in class PCollectionImpl<Pair<K,Iterable<V>>>

getParents

public List<PCollectionImpl<?>> getParents()
Specified by:
getParents in class PCollectionImpl<Pair<K,Iterable<V>>>

getLastModifiedAt

public long getLastModifiedAt()
Description copied from class: PCollectionImpl
The time of the most recent modification to one of the input sources to the collection. If the time cannot be determined, -1 should be returned.

Specified by:
getLastModifiedAt in class PCollectionImpl<Pair<K,Iterable<V>>>
Returns:
The time of the most recent modification to one of the input sources to the collection, or -1 if it cannot be determined.


Copyright © 2014 The Apache Software Foundation. All Rights Reserved.