This project has retired. For details please refer to its Attic page.
PTable (Apache Crunch 0.4.0-incubating API)

org.apache.crunch
Interface PTable<K,V>

All Superinterfaces:
PCollection<Pair<K,V>>

public interface PTable<K,V>
extends PCollection<Pair<K,V>>

A sub-interface of PCollection that represents an immutable, distributed multi-map of keys and values.


Method Summary
 PObject<Map<K,V>> asMap()
          Returns a PObject encapsulating a Map made up of the keys and values in this PTable.
 PTable<K,V> bottom(int count)
          Returns a PTable made up of the count pairs in this PTable with the smallest values.
 <U> PTable<K,Pair<Collection<V>,Collection<U>>> cogroup(PTable<K,U> other)
          Performs a co-group operation with the given table on common keys.
 PTable<K,Collection<V>> collectValues()
          Aggregates all of the values with the same key into a single key-value pair in the returned PTable.
 PType<K> getKeyType()
          Returns the PType of the key.
 PTableType<K,V> getPTableType()
          Returns the PTableType of this PTable.
 PType<V> getValueType()
          Returns the PType of the value.
 PGroupedTable<K,V> groupByKey()
          Performs a grouping operation on the keys of this table.
 PGroupedTable<K,V> groupByKey(GroupingOptions options)
          Performs a grouping operation on the keys of this table, using the additional GroupingOptions to control how the grouping is executed.
 PGroupedTable<K,V> groupByKey(int numPartitions)
          Performs a grouping operation on the keys of this table, using the given number of partitions.
 <U> PTable<K,Pair<V,U>> join(PTable<K,U> other)
          Performs an inner join of this table with the table passed in as an argument, on their common keys.
 PCollection<K> keys()
          Returns a PCollection made up of the keys in this PTable.
 Map<K,V> materializeToMap()
          Returns a Map made up of the keys and values in this PTable.
 PTable<K,V> top(int count)
          Returns a PTable made up of the count pairs in this PTable with the largest values.
 PTable<K,V> union(PTable<K,V>... others)
          Returns a PTable instance that acts as the union of this PTable and the input PTables.
 PCollection<V> values()
          Returns a PCollection made up of the values in this PTable.
 PTable<K,V> write(Target target)
          Writes this PTable to the given Target.
 
Methods inherited from interface org.apache.crunch.PCollection
asCollection, by, by, count, filter, filter, getName, getPipeline, getPType, getSize, getTypeFamily, length, materialize, max, min, parallelDo, parallelDo, parallelDo, parallelDo, sample, sample, sort, union
 

Method Detail

union

PTable<K,V> union(PTable<K,V>... others)
Returns a PTable instance that acts as the union of this PTable and the input PTables.
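
To illustrate what a union of multi-maps means, here is a minimal plain-Java sketch. It is not Crunch code: a table is modelled as a `List` of `Map.Entry` pairs, the class name `UnionSketch` is hypothetical, and the sketch assumes that the union keeps duplicate keys and duplicate pairs (this page does not say either way).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Plain-Java model of a multi-map union; not Crunch code. */
final class UnionSketch {

    /** Concatenates the pairs of all tables. Duplicates are retained (an assumption). */
    @SafeVarargs
    static <K, V> List<Map.Entry<K, V>> union(List<Map.Entry<K, V>> first,
                                              List<Map.Entry<K, V>>... others) {
        List<Map.Entry<K, V>> result = new ArrayList<>(first);
        for (List<Map.Entry<K, V>> other : others) {
            result.addAll(other);
        }
        return result;
    }

    public static void main(String[] args) {
        List<Map.Entry<String, Integer>> a = List.of(Map.entry("x", 1));
        List<Map.Entry<String, Integer>> b = List.of(Map.entry("x", 2), Map.entry("y", 3));
        // The key "x" appears twice in the result, once from each input.
        System.out.println(union(a, b));
    }
}
```

The varargs parameter mirrors the `PTable<K,V>... others` signature, so any number of additional tables can be passed.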


groupByKey

PGroupedTable<K,V> groupByKey()
Performs a grouping operation on the keys of this table.

Returns:
a PGroupedTable instance that represents the grouping

groupByKey

PGroupedTable<K,V> groupByKey(int numPartitions)
Performs a grouping operation on the keys of this table, using the given number of partitions.

Parameters:
numPartitions - The number of partitions for the data.
Returns:
a PGroupedTable instance that represents this grouping
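
This page does not describe how keys are assigned to the requested partitions. As a rough illustration only, the sketch below assumes hash partitioning in the style of Hadoop's default `HashPartitioner`; the class `PartitionSketch` and its methods are hypothetical and are not part of Crunch.

```java
import java.util.ArrayList;
import java.util.List;

/** Plain-Java illustration of hash partitioning; not Crunch or Hadoop code. */
final class PartitionSketch {

    /** Maps a key to a partition index in the range [0, numPartitions). */
    static int partitionFor(Object key, int numPartitions) {
        // Masking with Integer.MAX_VALUE clears the sign bit, so the index is never negative.
        return (key.hashCode() & Integer.MAX_VALUE) % numPartitions;
    }

    /** Distributes keys into numPartitions buckets; equal keys always share a bucket. */
    static <K> List<List<K>> partition(List<K> keys, int numPartitions) {
        List<List<K>> buckets = new ArrayList<>();
        for (int i = 0; i < numPartitions; i++) {
            buckets.add(new ArrayList<>());
        }
        for (K key : keys) {
            buckets.get(partitionFor(key, numPartitions)).add(key);
        }
        return buckets;
    }
}
```

Under this assumption, every occurrence of a key lands in the same partition, which is what allows all values for that key to be grouped together.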

groupByKey

PGroupedTable<K,V> groupByKey(GroupingOptions options)
Performs a grouping operation on the keys of this table, using the additional GroupingOptions to control how the grouping is executed.

Parameters:
options - The grouping options to use
Returns:
a PGroupedTable instance that represents the grouping

write

PTable<K,V> write(Target target)
Writes this PTable to the given Target.

Specified by:
write in interface PCollection<Pair<K,V>>
Parameters:
target - The target to write to

getPTableType

PTableType<K,V> getPTableType()
Returns the PTableType of this PTable.


getKeyType

PType<K> getKeyType()
Returns the PType of the key.


getValueType

PType<V> getValueType()
Returns the PType of the value.


collectValues

PTable<K,Collection<V>> collectValues()
Aggregates all of the values with the same key into a single key-value pair in the returned PTable.
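
The effect of collecting values per key can be sketched in plain Java. This is a model of the semantics only, not Crunch code: the table is a `List` of `Map.Entry` pairs, the class name `CollectValuesSketch` is hypothetical, and the order of values within each collection is an artifact of this sketch.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Plain-Java model of collecting all values per key; not Crunch code. */
final class CollectValuesSketch {

    /** Returns one entry per distinct key, holding every value seen for that key. */
    static <K, V> Map<K, Collection<V>> collectValues(List<Map.Entry<K, V>> pairs) {
        Map<K, Collection<V>> grouped = new LinkedHashMap<>();
        for (Map.Entry<K, V> pair : pairs) {
            // Create the value collection on first sight of a key, then append to it.
            grouped.computeIfAbsent(pair.getKey(), k -> new ArrayList<>())
                   .add(pair.getValue());
        }
        return grouped;
    }
}
```

For the pairs `("a", 1)`, `("b", 2)`, `("a", 3)`, this model yields two entries: `"a"` with the values 1 and 3, and `"b"` with the value 2.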


top

PTable<K,V> top(int count)
Returns a PTable made up of the count pairs in this PTable with the largest values.

Parameters:
count - The number of pairs to return

bottom

PTable<K,V> bottom(int count)
Returns a PTable made up of the count pairs in this PTable with the smallest values.

Parameters:
count - The number of pairs to return
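
As an illustration of `top` and `bottom`, the following plain-Java sketch selects pairs by value from an in-memory list. It is not Crunch code: the class `TopBottomSketch` is hypothetical, values are assumed to be `Comparable`, and the handling of ties (stable sort order here) is an assumption, since this page does not define it.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;

/** Plain-Java model of selecting pairs by value; not Crunch code. */
final class TopBottomSketch {

    /** Returns up to count pairs with the largest values, largest first. */
    static <K, V extends Comparable<? super V>> List<Map.Entry<K, V>> top(
            List<Map.Entry<K, V>> pairs, int count) {
        return firstN(pairs, Map.Entry.<K, V>comparingByValue().reversed(), count);
    }

    /** Returns up to count pairs with the smallest values, smallest first. */
    static <K, V extends Comparable<? super V>> List<Map.Entry<K, V>> bottom(
            List<Map.Entry<K, V>> pairs, int count) {
        return firstN(pairs, Map.Entry.<K, V>comparingByValue(), count);
    }

    /** Sorts a copy of the pairs and keeps the first count of them. */
    private static <K, V> List<Map.Entry<K, V>> firstN(
            List<Map.Entry<K, V>> pairs, Comparator<Map.Entry<K, V>> order, int count) {
        List<Map.Entry<K, V>> sorted = new ArrayList<>(pairs);
        sorted.sort(order);
        return new ArrayList<>(sorted.subList(0, Math.min(count, sorted.size())));
    }
}
```

Sorting the whole input is only reasonable for a small in-memory model; a distributed implementation would be expected to avoid a full sort.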

join

<U> PTable<K,Pair<V,U>> join(PTable<K,U> other)
Performs an inner join of this table with the table passed in as an argument, on their common keys.
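
Inner-join semantics on a multi-map can be modelled in plain Java as follows. This is a sketch, not Crunch code: tables are `List`s of `Map.Entry` pairs, `Map.Entry` stands in for Crunch's `Pair`, and the class name `InnerJoinSketch` is hypothetical. The nested loop is chosen for clarity, not efficiency.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Plain-Java model of an inner join on keys; not Crunch code. */
final class InnerJoinSketch {

    /**
     * Emits one (key, (leftValue, rightValue)) entry for every combination of
     * left and right pairs that share a key. Keys found in only one table are dropped.
     */
    static <K, V, U> List<Map.Entry<K, Map.Entry<V, U>>> join(
            List<Map.Entry<K, V>> left, List<Map.Entry<K, U>> right) {
        List<Map.Entry<K, Map.Entry<V, U>>> result = new ArrayList<>();
        for (Map.Entry<K, V> l : left) {
            for (Map.Entry<K, U> r : right) {
                if (l.getKey().equals(r.getKey())) {
                    result.add(Map.entry(l.getKey(), Map.entry(l.getValue(), r.getValue())));
                }
            }
        }
        return result;
    }
}
```

Because both inputs are multi-maps, a key that occurs m times on the left and n times on the right contributes m × n entries to the result in this model.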


cogroup

<U> PTable<K,Pair<Collection<V>,Collection<U>>> cogroup(PTable<K,U> other)
Performs a co-group operation with the given table on common keys.
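
The return type pairs a collection of this table's values with a collection of the other table's values for each key. A plain-Java model of that shape is sketched below. It is not Crunch code: the class `CogroupSketch` is hypothetical, `Map.Entry` stands in for `Pair`, and the sketch assumes that a key present in only one table is kept with an empty collection on the other side, which this page does not spell out.

```java
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.ArrayList;
import java.util.Collection;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Plain-Java model of a co-group; not Crunch code. */
final class CogroupSketch {

    /** Groups the values of both tables by key: key -> (left values, right values). */
    static <K, V, U> Map<K, Map.Entry<Collection<V>, Collection<U>>> cogroup(
            List<Map.Entry<K, V>> left, List<Map.Entry<K, U>> right) {
        Map<K, Map.Entry<Collection<V>, Collection<U>>> result = new LinkedHashMap<>();
        for (Map.Entry<K, V> l : left) {
            result.computeIfAbsent(l.getKey(), k -> CogroupSketch.<V, U>emptyGroups())
                  .getKey().add(l.getValue());
        }
        for (Map.Entry<K, U> r : right) {
            result.computeIfAbsent(r.getKey(), k -> CogroupSketch.<V, U>emptyGroups())
                  .getValue().add(r.getValue());
        }
        return result;
    }

    /** Creates an empty (left values, right values) holder for a newly seen key. */
    private static <V, U> Map.Entry<Collection<V>, Collection<U>> emptyGroups() {
        Collection<V> lefts = new ArrayList<>();
        Collection<U> rights = new ArrayList<>();
        return new SimpleImmutableEntry<>(lefts, rights);
    }
}
```

Unlike the join model, each key appears exactly once in the result, with all of its values from each side gathered into the two collections.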


keys

PCollection<K> keys()
Returns a PCollection made up of the keys in this PTable.


values

PCollection<V> values()
Returns a PCollection made up of the values in this PTable.


materializeToMap

Map<K,V> materializeToMap()
Returns a Map made up of the keys and values in this PTable.

Note: A PTable is a multi-map (i.e. it can contain multiple values for a single key), whereas a Map holds at most one value per key, so the returned map may not contain every pair in this PTable.
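
The multi-map caveat can be seen in a small plain-Java model. This is not Crunch code: the class `ToMapSketch` is hypothetical, and the choice to keep the last value seen for a duplicated key is an assumption of the sketch; this page does not say which value survives.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Plain-Java model of collapsing a multi-map into a Map; not Crunch code. */
final class ToMapSketch {

    /** Copies the pairs into a Map; a later value for the same key replaces the earlier one. */
    static <K, V> Map<K, V> toMap(List<Map.Entry<K, V>> pairs) {
        Map<K, V> map = new LinkedHashMap<>();
        for (Map.Entry<K, V> pair : pairs) {
            // Assumption of this sketch: last write wins for duplicate keys.
            map.put(pair.getKey(), pair.getValue());
        }
        return map;
    }
}
```

Three input pairs with keys `"a"`, `"a"`, and `"b"` produce a map with only two entries, so one of the `"a"` values is lost. When every value matters, `collectValues()` keeps them all.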


asMap

PObject<Map<K,V>> asMap()
Returns a PObject encapsulating a Map made up of the keys and values in this PTable.

Note: The contents of the returned map may not be exactly the same as this PTable, as a PTable is a multi-map (i.e. can contain multiple values for a single key).

Returns:
The PObject encapsulating a Map made up of the keys and values in this PTable.


Copyright © 2012 The Apache Software Foundation. All Rights Reserved.