public abstract class PTableBase<K,V> extends PCollectionImpl<Pair<K,V>> implements PTable<K,V>
PCollectionImpl.Visitor
Constructor and Description |
---|
PTableBase(String name,
DistributedPipeline pipeline) |
PTableBase(String name,
DistributedPipeline pipeline,
ParallelDoOptions options) |
Modifier and Type | Method and Description |
---|---|
PObject<Map<K,V>> |
asMap()
|
PTable<K,V> |
bottom(int count)
Returns a PTable made up of the pairs in this PTable with the smallest
value field.
|
PTable<K,V> |
cache()
Marks this data as cached using the default
CachingOptions . |
PTable<K,V> |
cache(CachingOptions options)
Marks this data as cached using the given
CachingOptions . |
<U> PTable<K,Pair<Collection<V>,Collection<U>>> |
cogroup(PTable<K,U> other)
Co-group operation with the given table on common keys.
|
PTable<K,Collection<V>> |
collectValues()
Aggregate all of the values with the same key into a single key-value pair
in the returned PTable.
|
PTable<K,V> |
filter(FilterFn<Pair<K,V>> filterFn)
Apply the given filter function to this instance and return the resulting
PCollection . |
PTable<K,V> |
filter(String name,
FilterFn<Pair<K,V>> filterFn)
Apply the given filter function to this instance and return the resulting
PCollection . |
PType<K> |
getKeyType()
Returns the
PType of the key. |
PType<V> |
getValueType()
Returns the
PType of the value. |
BaseGroupedTable<K,V> |
groupByKey()
Performs a grouping operation on the keys of this table.
|
BaseGroupedTable<K,V> |
groupByKey(GroupingOptions groupingOptions)
Performs a grouping operation on the keys of this table, using the
additional
GroupingOptions to control how the grouping is executed. |
BaseGroupedTable<K,V> |
groupByKey(int numReduceTasks)
Performs a grouping operation on the keys of this table, using the given
number of partitions.
|
<U> PTable<K,Pair<V,U>> |
join(PTable<K,U> other)
Perform an inner join on this table and the one passed in as an argument on
their common keys.
|
PCollection<K> |
keys()
Returns a
PCollection made up of the keys in this PTable. |
<K2> PTable<K2,V> |
mapKeys(MapFn<K,K2> mapFn,
PType<K2> ptype)
Returns a
PTable that has the same values as this instance, but
uses the given function to map the keys. |
<K2> PTable<K2,V> |
mapKeys(String name,
MapFn<K,K2> mapFn,
PType<K2> ptype)
Returns a
PTable that has the same values as this instance, but
uses the given function to map the keys. |
<U> PTable<K,U> |
mapValues(MapFn<V,U> mapFn,
PType<U> ptype)
Returns a
PTable that has the same keys as this instance, but
uses the given function to map the values. |
<U> PTable<K,U> |
mapValues(String name,
MapFn<V,U> mapFn,
PType<U> ptype)
Returns a
PTable that has the same keys as this instance, but
uses the given function to map the values. |
Map<K,V> |
materializeToMap()
Returns a Map
|
PTable<K,V> |
top(int count)
Returns a PTable made up of the pairs in this PTable with the largest value
field.
|
PTable<K,V> |
union(PTable<K,V>... others)
Returns a
PTable instance that acts as the union of this
PTable and the input PTable s. |
PTable<K,V> |
union(PTable<K,V> other)
Returns a
PTable instance that acts as the union of this
PTable and the other PTable s. |
PCollection<V> |
values()
Returns a
PCollection made up of the values in this PTable. |
PTable<K,V> |
write(Target target)
Write the contents of this
PCollection to the given Target ,
using the storage format specified by the target. |
PTable<K,V> |
write(Target target,
Target.WriteMode writeMode)
Write the contents of this
PCollection to the given Target ,
using the given Target.WriteMode to handle existing
targets. |
accept, aggregate, asCollection, asReadable, by, by, count, first, getDepth, getLastModifiedAt, getMaterializedAt, getName, getOnlyParent, getParallelDoOptions, getParents, getPipeline, getSize, getTargetDependencies, getTypeFamily, isBreakpoint, length, materialize, materializeAt, max, min, parallelDo, parallelDo, parallelDo, parallelDo, parallelDo, parallelDo, sequentialDo, setBreakpoint, toString, union, union
equals, getClass, hashCode, notify, notifyAll, wait, wait, wait
getPTableType
aggregate, asCollection, asReadable, by, by, count, first, getName, getPipeline, getPType, getSize, getTypeFamily, length, materialize, max, min, parallelDo, parallelDo, parallelDo, parallelDo, parallelDo, parallelDo, sequentialDo, union, union
public PTableBase(String name, DistributedPipeline pipeline)
public PTableBase(String name, DistributedPipeline pipeline, ParallelDoOptions options)
public PType<K> getKeyType()
PTable
PType
of the key.getKeyType
in interface PTable<K,V>
public PType<V> getValueType()
PTable
PType
of the value.getValueType
in interface PTable<K,V>
public BaseGroupedTable<K,V> groupByKey()
PTable
groupByKey
in interface PTable<K,V>
PGroupedTable
instance that represents the groupingpublic BaseGroupedTable<K,V> groupByKey(int numReduceTasks)
PTable
groupByKey
in interface PTable<K,V>
numReduceTasks
- The number of partitions for the data.PGroupedTable
instance that represents this groupingpublic BaseGroupedTable<K,V> groupByKey(GroupingOptions groupingOptions)
PTable
GroupingOptions
to control how the grouping is executed.groupByKey
in interface PTable<K,V>
groupingOptions
- The grouping options to usePGroupedTable
instance that represents the groupingpublic PTable<K,V> union(PTable<K,V> other)
PTable
PTable
instance that acts as the union of this
PTable
and the other PTable
s.public PTable<K,V> union(PTable<K,V>... others)
PTable
PTable
instance that acts as the union of this
PTable
and the input PTable
s.public PTable<K,V> write(Target target)
PCollection
PCollection
to the given Target
,
using the storage format specified by the target.public PTable<K,V> write(Target target, Target.WriteMode writeMode)
PCollection
PCollection
to the given Target
,
using the given Target.WriteMode
to handle existing
targets.public PTable<K,V> cache()
PCollection
CachingOptions
. Cached PCollection
s will only
be processed once, and then their contents will be saved so that downstream code can process them many times.public PTable<K,V> cache(CachingOptions options)
PCollection
CachingOptions
. Cached PCollection
s will only
be processed once and then their contents will be saved so that downstream code can process them many times.public PTable<K,V> filter(FilterFn<Pair<K,V>> filterFn)
PCollection
PCollection
.public PTable<K,V> filter(String name, FilterFn<Pair<K,V>> filterFn)
PCollection
PCollection
.public <U> PTable<K,U> mapValues(MapFn<V,U> mapFn, PType<U> ptype)
PTable
PTable
that has the same keys as this instance, but
uses the given function to map the values.public <U> PTable<K,U> mapValues(String name, MapFn<V,U> mapFn, PType<U> ptype)
PTable
PTable
that has the same keys as this instance, but
uses the given function to map the values.public <K2> PTable<K2,V> mapKeys(MapFn<K,K2> mapFn, PType<K2> ptype)
PTable
PTable
that has the same values as this instance, but
uses the given function to map the keys.public <K2> PTable<K2,V> mapKeys(String name, MapFn<K,K2> mapFn, PType<K2> ptype)
PTable
PTable
that has the same values as this instance, but
uses the given function to map the keys.public PTable<K,V> top(int count)
PTable
public PTable<K,V> bottom(int count)
PTable
public PTable<K,Collection<V>> collectValues()
PTable
collectValues
in interface PTable<K,V>
public <U> PTable<K,Pair<V,U>> join(PTable<K,U> other)
PTable
public <U> PTable<K,Pair<Collection<V>,Collection<U>>> cogroup(PTable<K,U> other)
PTable
public PCollection<K> keys()
PTable
PCollection
made up of the keys in this PTable.public PCollection<V> values()
PTable
PCollection
made up of the values in this PTable.public Map<K,V> materializeToMap()
materializeToMap
in interface PTable<K,V>
Copyright © 2016 The Apache Software Foundation. All rights reserved.