public abstract class PTableBase<K,V> extends PCollectionImpl<Pair<K,V>> implements PTable<K,V>
PCollectionImpl.Visitor| Constructor and Description |
|---|
PTableBase(String name,
DistributedPipeline pipeline) |
PTableBase(String name,
DistributedPipeline pipeline,
ParallelDoOptions options) |
| Modifier and Type | Method and Description |
|---|---|
PObject<Map<K,V>> |
asMap()
|
PTable<K,V> |
bottom(int count)
Returns a PTable made up of the pairs in this PTable with the smallest
value field.
|
PTable<K,V> |
cache()
Marks this data as cached using the default
CachingOptions. |
PTable<K,V> |
cache(CachingOptions options)
Marks this data as cached using the given
CachingOptions. |
<U> PTable<K,Pair<Collection<V>,Collection<U>>> |
cogroup(PTable<K,U> other)
Co-group operation with the given table on common keys.
|
PTable<K,Collection<V>> |
collectValues()
Aggregate all of the values with the same key into a single key-value pair
in the returned PTable.
|
PTable<K,V> |
filter(FilterFn<Pair<K,V>> filterFn)
Apply the given filter function to this instance and return the resulting
PCollection. |
PTable<K,V> |
filter(String name,
FilterFn<Pair<K,V>> filterFn)
Apply the given filter function to this instance and return the resulting
PCollection. |
PType<K> |
getKeyType()
Returns the
PType of the key. |
PType<V> |
getValueType()
Returns the
PType of the value. |
BaseGroupedTable<K,V> |
groupByKey()
Performs a grouping operation on the keys of this table.
|
BaseGroupedTable<K,V> |
groupByKey(GroupingOptions groupingOptions)
Performs a grouping operation on the keys of this table, using the
additional
GroupingOptions to control how the grouping is executed. |
BaseGroupedTable<K,V> |
groupByKey(int numReduceTasks)
Performs a grouping operation on the keys of this table, using the given
number of partitions.
|
<U> PTable<K,Pair<V,U>> |
join(PTable<K,U> other)
Perform an inner join on this table and the one passed in as an argument on
their common keys.
|
PCollection<K> |
keys()
Returns a
PCollection made up of the keys in this PTable. |
<K2> PTable<K2,V> |
mapKeys(MapFn<K,K2> mapFn,
PType<K2> ptype)
Returns a
PTable that has the same values as this instance, but
uses the given function to map the keys. |
<K2> PTable<K2,V> |
mapKeys(String name,
MapFn<K,K2> mapFn,
PType<K2> ptype)
Returns a
PTable that has the same values as this instance, but
uses the given function to map the keys. |
<U> PTable<K,U> |
mapValues(MapFn<V,U> mapFn,
PType<U> ptype)
Returns a
PTable that has the same keys as this instance, but
uses the given function to map the values. |
<U> PTable<K,U> |
mapValues(String name,
MapFn<V,U> mapFn,
PType<U> ptype)
Returns a
PTable that has the same keys as this instance, but
uses the given function to map the values. |
Map<K,V> |
materializeToMap()
Returns a Map
|
PTable<K,V> |
top(int count)
Returns a PTable made up of the pairs in this PTable with the largest value
field.
|
PTable<K,V> |
union(PTable<K,V>... others)
Returns a
PTable instance that acts as the union of this
PTable and the input PTables. |
PTable<K,V> |
union(PTable<K,V> other)
Returns a
PTable instance that acts as the union of this
PTable and the other PTables. |
PCollection<V> |
values()
Returns a
PCollection made up of the values in this PTable. |
PTable<K,V> |
write(Target target)
Write the contents of this
PCollection to the given Target,
using the storage format specified by the target. |
PTable<K,V> |
write(Target target,
Target.WriteMode writeMode)
Write the contents of this
PCollection to the given Target,
using the given Target.WriteMode to handle existing
targets. |
accept, aggregate, asCollection, asReadable, by, by, count, first, getDepth, getLastModifiedAt, getMaterializedAt, getName, getOnlyParent, getParallelDoOptions, getParents, getPipeline, getSize, getTargetDependencies, getTypeFamily, isBreakpoint, length, materialize, materializeAt, max, min, parallelDo, parallelDo, parallelDo, parallelDo, parallelDo, parallelDo, sequentialDo, setBreakpoint, toString, union, unionequals, getClass, hashCode, notify, notifyAll, wait, wait, waitgetPTableTypeaggregate, asCollection, asReadable, by, by, count, first, getName, getPipeline, getPType, getSize, getTypeFamily, length, materialize, max, min, parallelDo, parallelDo, parallelDo, parallelDo, parallelDo, parallelDo, sequentialDo, union, unionpublic PTableBase(String name, DistributedPipeline pipeline)
public PTableBase(String name, DistributedPipeline pipeline, ParallelDoOptions options)
public PType<K> getKeyType()
PTablePType of the key.getKeyType in interface PTable<K,V>public PType<V> getValueType()
PTablePType of the value.getValueType in interface PTable<K,V>public BaseGroupedTable<K,V> groupByKey()
PTablegroupByKey in interface PTable<K,V>PGroupedTable instance that represents the groupingpublic BaseGroupedTable<K,V> groupByKey(int numReduceTasks)
PTablegroupByKey in interface PTable<K,V>numReduceTasks - The number of partitions for the data.PGroupedTable instance that represents this groupingpublic BaseGroupedTable<K,V> groupByKey(GroupingOptions groupingOptions)
PTableGroupingOptions to control how the grouping is executed.groupByKey in interface PTable<K,V>groupingOptions - The grouping options to usePGroupedTable instance that represents the groupingpublic PTable<K,V> union(PTable<K,V> other)
PTablePTable instance that acts as the union of this
PTable and the other PTables.public PTable<K,V> union(PTable<K,V>... others)
PTablePTable instance that acts as the union of this
PTable and the input PTables.public PTable<K,V> write(Target target)
PCollectionPCollection to the given Target,
using the storage format specified by the target.public PTable<K,V> write(Target target, Target.WriteMode writeMode)
PCollectionPCollection to the given Target,
using the given Target.WriteMode to handle existing
targets.public PTable<K,V> cache()
PCollectionCachingOptions. Cached PCollections will only
be processed once, and then their contents will be saved so that downstream code can process them many times.public PTable<K,V> cache(CachingOptions options)
PCollectionCachingOptions. Cached PCollections will only
be processed once and then their contents will be saved so that downstream code can process them many times.public PTable<K,V> filter(FilterFn<Pair<K,V>> filterFn)
PCollectionPCollection.public PTable<K,V> filter(String name, FilterFn<Pair<K,V>> filterFn)
PCollectionPCollection.public <U> PTable<K,U> mapValues(MapFn<V,U> mapFn, PType<U> ptype)
PTablePTable that has the same keys as this instance, but
uses the given function to map the values.public <U> PTable<K,U> mapValues(String name, MapFn<V,U> mapFn, PType<U> ptype)
PTablePTable that has the same keys as this instance, but
uses the given function to map the values.public <K2> PTable<K2,V> mapKeys(MapFn<K,K2> mapFn, PType<K2> ptype)
PTablePTable that has the same values as this instance, but
uses the given function to map the keys.public <K2> PTable<K2,V> mapKeys(String name, MapFn<K,K2> mapFn, PType<K2> ptype)
PTablePTable that has the same values as this instance, but
uses the given function to map the keys.public PTable<K,V> top(int count)
PTablepublic PTable<K,V> bottom(int count)
PTablepublic PTable<K,Collection<V>> collectValues()
PTablecollectValues in interface PTable<K,V>public <U> PTable<K,Pair<V,U>> join(PTable<K,U> other)
PTablepublic <U> PTable<K,Pair<Collection<V>,Collection<U>>> cogroup(PTable<K,U> other)
PTablepublic PCollection<K> keys()
PTablePCollection made up of the keys in this PTable.public PCollection<V> values()
PTablePCollection made up of the values in this PTable.public Map<K,V> materializeToMap()
materializeToMap in interface PTable<K,V>Copyright © 2015 The Apache Software Foundation. All Rights Reserved.