This project has retired. For details please refer to its Attic page.
MemTable (Apache Crunch 0.3.0-incubating API)

org.apache.crunch.impl.mem.collect
Class MemTable<K,V>

java.lang.Object
  extended by org.apache.crunch.impl.mem.collect.MemCollection<Pair<K,V>>
      extended by org.apache.crunch.impl.mem.collect.MemTable<K,V>
All Implemented Interfaces:
PCollection<Pair<K,V>>, PTable<K,V>

public class MemTable<K,V>
extends MemCollection<Pair<K,V>>
implements PTable<K,V>


Constructor Summary
MemTable(Iterable<Pair<K,V>> collect)
MemTable(Iterable<Pair<K,V>> collect, PTableType<K,V> ptype, String name)
 
Method Summary
 PTable<K,V> bottom(int count)
          Returns a PTable made up of the pairs in this PTable with the smallest value field.
<U> PTable<K,Pair<Collection<V>,Collection<U>>> cogroup(PTable<K,U> other)
          Performs a co-group operation with the given table on their common keys.
 PTable<K,Collection<V>> collectValues()
          Aggregates all of the values with the same key into a single key-value pair in the returned PTable.
 PType<K> getKeyType()
          Returns the PType of the key.
 PTableType<K,V> getPTableType()
          Returns the PTableType of this PTable.
 PType<V> getValueType()
          Returns the PType of the value.
 PGroupedTable<K,V> groupByKey()
          Performs a grouping operation on the keys of this table.
 PGroupedTable<K,V> groupByKey(GroupingOptions options)
          Performs a grouping operation on the keys of this table, using the additional GroupingOptions to control how the grouping is executed.
 PGroupedTable<K,V> groupByKey(int numPartitions)
          Performs a grouping operation on the keys of this table, using the given number of partitions.
<U> PTable<K,Pair<V,U>> join(PTable<K,U> other)
          Performs an inner join of this table and the table passed in as an argument on their common keys.
 PCollection<K> keys()
          Returns a PCollection made up of the keys in this PTable.
 Map<K,V> materializeToMap()
          Returns a Map made up of the keys and values in this PTable.
 PTable<K,V> top(int count)
          Returns a PTable made up of the pairs in this PTable with the largest value field.
 PTable<K,V> union(PTable<K,V>... others)
          Returns a PTable instance that acts as the union of this PTable and the input PTables.
 PCollection<V> values()
          Returns a PCollection made up of the values in this PTable.
 PTable<K,V> write(Target target)
          Writes the contents of this PCollection to the given Target, using the storage format specified by the target.
 
Methods inherited from class org.apache.crunch.impl.mem.collect.MemCollection
by, by, count, filter, filter, getCollection, getName, getPipeline, getPType, getSize, getTypeFamily, materialize, max, min, parallelDo, parallelDo, parallelDo, parallelDo, sample, sample, sort, toString, union
 
Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, wait, wait, wait
 
Methods inherited from interface org.apache.crunch.PCollection
by, by, count, filter, filter, getName, getPipeline, getPType, getSize, getTypeFamily, materialize, max, min, parallelDo, parallelDo, parallelDo, parallelDo, sample, sample, sort, union
 

Constructor Detail

MemTable

public MemTable(Iterable<Pair<K,V>> collect)

MemTable

public MemTable(Iterable<Pair<K,V>> collect,
                PTableType<K,V> ptype,
                String name)

Method Detail

union

public PTable<K,V> union(PTable<K,V>... others)
Description copied from interface: PTable
Returns a PTable instance that acts as the union of this PTable and the input PTables.

Specified by:
union in interface PTable<K,V>

groupByKey

public PGroupedTable<K,V> groupByKey()
Description copied from interface: PTable
Performs a grouping operation on the keys of this table.

Specified by:
groupByKey in interface PTable<K,V>
Returns:
a PGroupedTable instance that represents the grouping

groupByKey

public PGroupedTable<K,V> groupByKey(int numPartitions)
Description copied from interface: PTable
Performs a grouping operation on the keys of this table, using the given number of partitions.

Specified by:
groupByKey in interface PTable<K,V>
Parameters:
numPartitions - The number of partitions for the data.
Returns:
a PGroupedTable instance that represents the grouping

groupByKey

public PGroupedTable<K,V> groupByKey(GroupingOptions options)
Description copied from interface: PTable
Performs a grouping operation on the keys of this table, using the additional GroupingOptions to control how the grouping is executed.

Specified by:
groupByKey in interface PTable<K,V>
Parameters:
options - The grouping options to use
Returns:
a PGroupedTable instance that represents the grouping

write

public PTable<K,V> write(Target target)
Description copied from interface: PCollection
Writes the contents of this PCollection to the given Target, using the storage format specified by the target.

Specified by:
write in interface PCollection<Pair<K,V>>
Specified by:
write in interface PTable<K,V>
Overrides:
write in class MemCollection<Pair<K,V>>
Parameters:
target - The target to write to

getPTableType

public PTableType<K,V> getPTableType()
Description copied from interface: PTable
Returns the PTableType of this PTable.

Specified by:
getPTableType in interface PTable<K,V>

getKeyType

public PType<K> getKeyType()
Description copied from interface: PTable
Returns the PType of the key.

Specified by:
getKeyType in interface PTable<K,V>

getValueType

public PType<V> getValueType()
Description copied from interface: PTable
Returns the PType of the value.

Specified by:
getValueType in interface PTable<K,V>

top

public PTable<K,V> top(int count)
Description copied from interface: PTable
Returns a PTable made up of the pairs in this PTable with the largest value field.

Specified by:
top in interface PTable<K,V>
Parameters:
count - The number of pairs to return

bottom

public PTable<K,V> bottom(int count)
Description copied from interface: PTable
Returns a PTable made up of the pairs in this PTable with the smallest value field.

Specified by:
bottom in interface PTable<K,V>
Parameters:
count - The number of pairs to return
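
The selection performed by top and bottom can be illustrated with a plain-Java sketch. It uses only java.util types and does not call Crunch; the class name TopBottomSketch, the use of Map.Entry in place of Pair, and the ordering of the result (best match first) are assumptions made for illustration, not statements about MemTable's implementation.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;

// Plain-Java illustration only; TopBottomSketch is a hypothetical name, not a Crunch class.
class TopBottomSketch {
    // Returns up to `count` pairs with the largest values, largest first.
    static <K, V extends Comparable<V>> List<Map.Entry<K, V>> top(
            List<Map.Entry<K, V>> pairs, int count) {
        return select(pairs, count, Map.Entry.<K, V>comparingByValue().reversed());
    }

    // Returns up to `count` pairs with the smallest values, smallest first.
    static <K, V extends Comparable<V>> List<Map.Entry<K, V>> bottom(
            List<Map.Entry<K, V>> pairs, int count) {
        return select(pairs, count, Map.Entry.<K, V>comparingByValue());
    }

    // Sorts a copy of the input by the given order and keeps the first `count` entries.
    private static <K, V> List<Map.Entry<K, V>> select(
            List<Map.Entry<K, V>> pairs, int count, Comparator<Map.Entry<K, V>> order) {
        List<Map.Entry<K, V>> sorted = new ArrayList<>(pairs);
        sorted.sort(order);
        int n = Math.max(0, Math.min(count, sorted.size()));
        return new ArrayList<>(sorted.subList(0, n));
    }

    public static void main(String[] args) {
        List<Map.Entry<String, Integer>> pairs =
                List.of(Map.entry("a", 3), Map.entry("b", 1), Map.entry("c", 2));
        System.out.println(top(pairs, 2));
        System.out.println(bottom(pairs, 2));
    }
}
```

Sorting the whole input is the simplest way to show the semantics; a bounded priority queue would avoid the full sort for large inputs.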

collectValues

public PTable<K,Collection<V>> collectValues()
Description copied from interface: PTable
Aggregates all of the values with the same key into a single key-value pair in the returned PTable.

Specified by:
collectValues in interface PTable<K,V>
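
As a rough illustration of what collecting values by key means, the following plain-Java sketch folds a list of key-value pairs into one entry per key. It does not use Crunch; CollectValuesSketch is a hypothetical name, Map.Entry stands in for Pair, and keeping keys in first-seen order is an assumption of the sketch only.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Plain-Java illustration only; CollectValuesSketch is a hypothetical name, not a Crunch class.
class CollectValuesSketch {
    // Gathers every value that shares a key into a single collection for that key.
    static <K, V> Map<K, Collection<V>> collectValues(List<Map.Entry<K, V>> pairs) {
        Map<K, Collection<V>> collected = new LinkedHashMap<>();
        for (Map.Entry<K, V> pair : pairs) {
            collected.computeIfAbsent(pair.getKey(), k -> new ArrayList<>())
                     .add(pair.getValue());
        }
        return collected;
    }

    public static void main(String[] args) {
        List<Map.Entry<String, Integer>> pairs =
                List.of(Map.entry("a", 1), Map.entry("b", 2), Map.entry("a", 3));
        System.out.println(collectValues(pairs));
    }
}
```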

join

public <U> PTable<K,Pair<V,U>> join(PTable<K,U> other)
Description copied from interface: PTable
Performs an inner join of this table and the table passed in as an argument on their common keys.

Specified by:
join in interface PTable<K,V>
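
The inner-join semantics can be sketched in plain Java as a hash join: index one side by key, then emit one output pair for every matching combination of values. This is an illustration under assumptions, not Crunch's implementation; JoinSketch is a hypothetical name and Map.Entry stands in for Pair.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Plain-Java illustration only; JoinSketch is a hypothetical name, not a Crunch class.
class JoinSketch {
    // Emits (key, (leftValue, rightValue)) for every pairing of values that share a key.
    static <K, V, U> List<Map.Entry<K, Map.Entry<V, U>>> innerJoin(
            List<Map.Entry<K, V>> left, List<Map.Entry<K, U>> right) {
        // Index the right side by key so each left pair can find its matches.
        Map<K, List<U>> rightByKey = new HashMap<>();
        for (Map.Entry<K, U> r : right) {
            rightByKey.computeIfAbsent(r.getKey(), k -> new ArrayList<>()).add(r.getValue());
        }
        List<Map.Entry<K, Map.Entry<V, U>>> joined = new ArrayList<>();
        for (Map.Entry<K, V> l : left) {
            // Keys missing from the right side yield no output (inner join).
            List<U> matches = rightByKey.getOrDefault(l.getKey(), Collections.emptyList());
            for (U u : matches) {
                joined.add(Map.entry(l.getKey(), Map.entry(l.getValue(), u)));
            }
        }
        return joined;
    }

    public static void main(String[] args) {
        List<Map.Entry<String, Integer>> left =
                List.of(Map.entry("a", 1), Map.entry("b", 2), Map.entry("a", 3));
        List<Map.Entry<String, String>> right =
                List.of(Map.entry("a", "x"), Map.entry("c", "y"));
        for (Map.Entry<String, Map.Entry<Integer, String>> row : innerJoin(left, right)) {
            System.out.println(row.getKey() + " -> (" + row.getValue().getKey()
                    + ", " + row.getValue().getValue() + ")");
        }
    }
}
```

In this sketch, keys "b" and "c" appear on only one side and therefore produce no rows.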

cogroup

public <U> PTable<K,Pair<Collection<V>,Collection<U>>> cogroup(PTable<K,U> other)
Description copied from interface: PTable
Performs a co-group operation with the given table on their common keys.

Specified by:
cogroup in interface PTable<K,V>
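
The return type suggests that, for each key, the values from this table and the values from the other table are gathered into two separate collections. A plain-Java sketch of that shape follows. It does not use Crunch; CogroupSketch is a hypothetical name, Map.Entry stands in for Pair, and the treatment of keys that occur in only one table (kept here, with an empty collection on the other side) is an assumption of the sketch rather than something this page states.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Plain-Java illustration only; CogroupSketch is a hypothetical name, not a Crunch class.
class CogroupSketch {
    // Creates an empty (leftValues, rightValues) holder for a newly seen key.
    static <V, U> Map.Entry<List<V>, List<U>> emptyGroups() {
        return new SimpleEntry<List<V>, List<U>>(new ArrayList<V>(), new ArrayList<U>());
    }

    // Groups values from both inputs by key: key -> (values from left, values from right).
    // Assumption: a key present on only one side keeps an empty list for the other side.
    static <K, V, U> Map<K, Map.Entry<List<V>, List<U>>> cogroup(
            List<Map.Entry<K, V>> left, List<Map.Entry<K, U>> right) {
        Map<K, Map.Entry<List<V>, List<U>>> out = new LinkedHashMap<>();
        for (Map.Entry<K, V> l : left) {
            out.computeIfAbsent(l.getKey(), k -> CogroupSketch.<V, U>emptyGroups())
               .getKey().add(l.getValue());
        }
        for (Map.Entry<K, U> r : right) {
            out.computeIfAbsent(r.getKey(), k -> CogroupSketch.<V, U>emptyGroups())
               .getValue().add(r.getValue());
        }
        return out;
    }

    public static void main(String[] args) {
        List<Map.Entry<String, Integer>> left =
                List.of(Map.entry("a", 1), Map.entry("a", 2), Map.entry("b", 3));
        List<Map.Entry<String, String>> right =
                List.of(Map.entry("a", "x"), Map.entry("c", "y"));
        cogroup(left, right).forEach((key, groups) ->
                System.out.println(key + " -> " + groups.getKey() + " / " + groups.getValue()));
    }
}
```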

keys

public PCollection<K> keys()
Description copied from interface: PTable
Returns a PCollection made up of the keys in this PTable.

Specified by:
keys in interface PTable<K,V>

values

public PCollection<V> values()
Description copied from interface: PTable
Returns a PCollection made up of the values in this PTable.

Specified by:
values in interface PTable<K,V>

materializeToMap

public Map<K,V> materializeToMap()
Description copied from interface: PTable
Returns a Map made up of the keys and values in this PTable.

Note: The contents of the returned map may differ from the contents of this PTable, because a PTable is a multi-map (i.e., it can contain multiple values for a single key) while a Map holds at most one value per key.

Specified by:
materializeToMap in interface PTable<K,V>
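
The effect described in the note can be seen in a small plain-Java sketch: when several pairs share a key, only one value per key can survive in a Map. The sketch does not use Crunch; ToMapSketch is a hypothetical name, Map.Entry stands in for Pair, and keeping the last value seen is an assumption of the sketch, since this page does not say which value the real method retains.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Plain-Java illustration only; ToMapSketch is a hypothetical name, not a Crunch class.
class ToMapSketch {
    // Copies pairs into a Map; a later pair with the same key replaces the earlier one.
    // Assumption: "last value wins" is chosen here only to make the data loss visible.
    static <K, V> Map<K, V> toMap(List<Map.Entry<K, V>> pairs) {
        Map<K, V> map = new LinkedHashMap<>();
        for (Map.Entry<K, V> pair : pairs) {
            map.put(pair.getKey(), pair.getValue());
        }
        return map;
    }

    public static void main(String[] args) {
        List<Map.Entry<String, Integer>> pairs =
                List.of(Map.entry("a", 1), Map.entry("a", 2), Map.entry("b", 3));
        Map<String, Integer> map = toMap(pairs);
        // Three input pairs collapse to two map entries because "a" occurs twice.
        System.out.println(pairs.size() + " pairs, " + map.size() + " map entries");
    }
}
```

When every value must be preserved, collecting the values per key first avoids the loss.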


Copyright © 2012 The Apache Software Foundation. All Rights Reserved.