java.lang.Object
  org.apache.crunch.impl.dist.collect.PCollectionImpl<Pair<K,V>>
    org.apache.crunch.impl.dist.collect.PTableBase<K,V>

public abstract class PTableBase<K,V> extends PCollectionImpl<Pair<K,V>> implements PTable<K,V>
Nested Class Summary |
---|
Nested classes/interfaces inherited from class org.apache.crunch.impl.dist.collect.PCollectionImpl |
---|
PCollectionImpl.Visitor |
Field Summary |
---|
Fields inherited from class org.apache.crunch.impl.dist.collect.PCollectionImpl |
---|
doOptions, materializedAt, pipeline |
Constructor Summary |
---|
PTableBase(String name, DistributedPipeline pipeline) |
PTableBase(String name, DistributedPipeline pipeline, ParallelDoOptions options) |
Method Summary | | |
---|---|---|
PObject<Map<K,V>> | asMap() | Returns a PObject encapsulating a Map made up of the keys and values in this PTable. |
PTable<K,V> | bottom(int count) | Returns a PTable made up of the pairs in this PTable with the smallest value field. |
PTable<K,V> | cache() | Marks this data as cached using the default CachingOptions. |
PTable<K,V> | cache(CachingOptions options) | Marks this data as cached using the given CachingOptions. |
<U> PTable<K,Pair<Collection<V>,Collection<U>>> | cogroup(PTable<K,U> other) | Co-group operation with the given table on common keys. |
PTable<K,Collection<V>> | collectValues() | Aggregate all of the values with the same key into a single key-value pair in the returned PTable. |
PTable<K,V> | filter(FilterFn<Pair<K,V>> filterFn) | Apply the given filter function to this instance and return the resulting PCollection. |
PTable<K,V> | filter(String name, FilterFn<Pair<K,V>> filterFn) | Apply the given filter function to this instance and return the resulting PCollection. |
PType<K> | getKeyType() | Returns the PType of the key. |
PType<V> | getValueType() | Returns the PType of the value. |
BaseGroupedTable<K,V> | groupByKey() | Performs a grouping operation on the keys of this table. |
BaseGroupedTable<K,V> | groupByKey(GroupingOptions groupingOptions) | Performs a grouping operation on the keys of this table, using the additional GroupingOptions to control how the grouping is executed. |
BaseGroupedTable<K,V> | groupByKey(int numReduceTasks) | Performs a grouping operation on the keys of this table, using the given number of partitions. |
<U> PTable<K,Pair<V,U>> | join(PTable<K,U> other) | Perform an inner join on this table and the one passed in as an argument on their common keys. |
PCollection<K> | keys() | Returns a PCollection made up of the keys in this PTable. |
<K2> PTable<K2,V> | mapKeys(MapFn<K,K2> mapFn, PType<K2> ptype) | Returns a PTable that has the same values as this instance, but uses the given function to map the keys. |
<K2> PTable<K2,V> | mapKeys(String name, MapFn<K,K2> mapFn, PType<K2> ptype) | Returns a PTable that has the same values as this instance, but uses the given function to map the keys. |
<U> PTable<K,U> | mapValues(MapFn<V,U> mapFn, PType<U> ptype) | Returns a PTable that has the same keys as this instance, but uses the given function to map the values. |
<U> PTable<K,U> | mapValues(String name, MapFn<V,U> mapFn, PType<U> ptype) | Returns a PTable that has the same keys as this instance, but uses the given function to map the values. |
Map<K,V> | materializeToMap() | Returns a Map<K,V> made up of the keys and values in this PTable. |
PTable<K,V> | top(int count) | Returns a PTable made up of the pairs in this PTable with the largest value field. |
PTable<K,V> | union(PTable<K,V>... others) | Returns a PTable instance that acts as the union of this PTable and the input PTables. |
PTable<K,V> | union(PTable<K,V> other) | Returns a PTable instance that acts as the union of this PTable and the other PTable. |
PCollection<V> | values() | Returns a PCollection made up of the values in this PTable. |
PTable<K,V> | write(Target target) | Write the contents of this PCollection to the given Target, using the storage format specified by the target. |
PTable<K,V> | write(Target target, Target.WriteMode writeMode) | Write the contents of this PCollection to the given Target, using the given Target.WriteMode to handle existing targets. |
Methods inherited from class org.apache.crunch.impl.dist.collect.PCollectionImpl |
---|
accept, acceptInternal, asCollection, asReadable, by, by, count, getChainingCollection, getDepth, getLastModifiedAt, getMaterializedAt, getName, getOnlyParent, getParallelDoOptions, getParents, getPipeline, getReadableDataInternal, getSize, getSizeInternal, getTargetDependencies, getTypeFamily, isBreakpoint, length, materialize, materializeAt, materializedData, max, min, parallelDo, parallelDo, parallelDo, parallelDo, parallelDo, parallelDo, setBreakpoint, toString, union, union |
Methods inherited from class java.lang.Object |
---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait |
Methods inherited from interface org.apache.crunch.PTable |
---|
getPTableType |
Methods inherited from interface org.apache.crunch.PCollection |
---|
asCollection, asReadable, by, by, count, getName, getPipeline, getPType, getSize, getTypeFamily, length, materialize, max, min, parallelDo, parallelDo, parallelDo, parallelDo, parallelDo, parallelDo, union, union |
Constructor Detail |
---|
public PTableBase(String name, DistributedPipeline pipeline)
public PTableBase(String name, DistributedPipeline pipeline, ParallelDoOptions options)
Method Detail |
---|
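public PType<K> getKeyType()
Description copied from interface: PTable
Returns the PType of the key.
Specified by: getKeyType in interface PTable<K,V>

public PType<V> getValueType()
Description copied from interface: PTable
Returns the PType of the value.
Specified by: getValueType in interface PTable<K,V>
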
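public BaseGroupedTable<K,V> groupByKey()
Description copied from interface: PTable
Performs a grouping operation on the keys of this table.
Specified by: groupByKey in interface PTable<K,V>
Returns: a PGroupedTable instance that represents the grouping

public BaseGroupedTable<K,V> groupByKey(int numReduceTasks)
Description copied from interface: PTable
Performs a grouping operation on the keys of this table, using the given number of partitions.
Specified by: groupByKey in interface PTable<K,V>
Parameters:
numReduceTasks - The number of partitions for the data.
Returns: a PGroupedTable instance that represents this grouping

public BaseGroupedTable<K,V> groupByKey(GroupingOptions groupingOptions)
Description copied from interface: PTable
Performs a grouping operation on the keys of this table, using the additional GroupingOptions to control how the grouping is executed.
Specified by: groupByKey in interface PTable<K,V>
Parameters:
groupingOptions - The grouping options to use
Returns: a PGroupedTable instance that represents the grouping

public PTable<K,V> union(PTable<K,V> other)
Description copied from interface: PTable
Returns a PTable instance that acts as the union of this PTable and the other PTable.
Specified by: union in interface PTable<K,V>
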
public PTable<K,V> union(PTable<K,V>... others)
Description copied from interface: PTable
Returns a PTable instance that acts as the union of this PTable and the input PTables.
Specified by: union in interface PTable<K,V>

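To make the union semantics concrete, here is a minimal plain-Java sketch that models a table as a list of key-value pairs. It does not use the Crunch API: `UnionSketch` and its `union` helper are hypothetical names, and the behavior shown (pairs are concatenated and duplicates are kept) is an assumption about how the union behaves, not something this page states.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Plain-Java model of union semantics; this is not Crunch code. */
final class UnionSketch {

    /** Returns every pair of {@code first} followed by every pair of each table in {@code others}. */
    @SafeVarargs
    static <K, V> List<Map.Entry<K, V>> union(List<Map.Entry<K, V>> first,
                                              List<Map.Entry<K, V>>... others) {
        List<Map.Entry<K, V>> out = new ArrayList<>(first);
        for (List<Map.Entry<K, V>> other : others) {
            out.addAll(other); // assumption: duplicates are kept, as in a bag union
        }
        return out;
    }

    public static void main(String[] args) {
        List<Map.Entry<String, Integer>> a = List.of(Map.entry("x", 1));
        List<Map.Entry<String, Integer>> b = List.of(Map.entry("x", 1), Map.entry("y", 2));
        System.out.println(union(a, b)); // prints [x=1, x=1, y=2]
    }
}
```

The varargs overload mirrors `union(PTable<K,V>... others)`; passing a single list corresponds to `union(PTable<K,V> other)`.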
public PTable<K,V> write(Target target)
Description copied from interface: PCollection
Write the contents of this PCollection to the given Target, using the storage format specified by the target.
Specified by: write in interface PCollection<Pair<K,V>>
Specified by: write in interface PTable<K,V>
Overrides: write in class PCollectionImpl<Pair<K,V>>
Parameters:
target - The target to write to

public PTable<K,V> write(Target target, Target.WriteMode writeMode)
Description copied from interface: PCollection
Write the contents of this PCollection to the given Target, using the given Target.WriteMode to handle existing targets.
Specified by: write in interface PCollection<Pair<K,V>>
Specified by: write in interface PTable<K,V>
Overrides: write in class PCollectionImpl<Pair<K,V>>
Parameters:
target - The target
writeMode - The rule for handling existing outputs at the target location

public PTable<K,V> cache()
Description copied from interface: PCollection
Marks this data as cached using the default CachingOptions. Cached PCollections will only be processed once, and then their contents will be saved so that downstream code can process them many times.
Specified by: cache in interface PCollection<Pair<K,V>>
Specified by: cache in interface PTable<K,V>
Overrides: cache in class PCollectionImpl<Pair<K,V>>
Returns: this PCollection instance

public PTable<K,V> cache(CachingOptions options)
Description copied from interface: PCollection
Marks this data as cached using the given CachingOptions. Cached PCollections will only be processed once, and then their contents will be saved so that downstream code can process them many times.
Specified by: cache in interface PCollection<Pair<K,V>>
Specified by: cache in interface PTable<K,V>
Overrides: cache in class PCollectionImpl<Pair<K,V>>
Parameters:
options - the options that control the cache settings for the data
Returns: this PCollection instance

public PTable<K,V> filter(FilterFn<Pair<K,V>> filterFn)
Description copied from interface: PCollection
Apply the given filter function to this instance and return the resulting PCollection.
Specified by: filter in interface PCollection<Pair<K,V>>
Specified by: filter in interface PTable<K,V>
Overrides: filter in class PCollectionImpl<Pair<K,V>>

public PTable<K,V> filter(String name, FilterFn<Pair<K,V>> filterFn)
Description copied from interface: PCollection
Apply the given filter function to this instance and return the resulting PCollection.
Specified by: filter in interface PCollection<Pair<K,V>>
Specified by: filter in interface PTable<K,V>
Overrides: filter in class PCollectionImpl<Pair<K,V>>
Parameters:
name - An identifier for this processing step
filterFn - The FilterFn to apply

public <U> PTable<K,U> mapValues(MapFn<V,U> mapFn, PType<U> ptype)
Description copied from interface: PTable
Returns a PTable that has the same keys as this instance, but uses the given function to map the values.
Specified by: mapValues in interface PTable<K,V>

public <U> PTable<K,U> mapValues(String name, MapFn<V,U> mapFn, PType<U> ptype)
Description copied from interface: PTable
Returns a PTable that has the same keys as this instance, but uses the given function to map the values.
Specified by: mapValues in interface PTable<K,V>

public <K2> PTable<K2,V> mapKeys(MapFn<K,K2> mapFn, PType<K2> ptype)
Description copied from interface: PTable
Returns a PTable that has the same values as this instance, but uses the given function to map the keys.
Specified by: mapKeys in interface PTable<K,V>

public <K2> PTable<K2,V> mapKeys(String name, MapFn<K,K2> mapFn, PType<K2> ptype)
Description copied from interface: PTable
Returns a PTable that has the same values as this instance, but uses the given function to map the keys.
Specified by: mapKeys in interface PTable<K,V>

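The effect of mapKeys and mapValues on the pairs of a table can be illustrated with a small plain-Java model. This is a sketch only: it represents a table as a list of `Map.Entry` pairs and uses `java.util.function.Function` in place of Crunch's `MapFn`; the class name `MapKeysValuesSketch` is hypothetical and the `PType` argument has no counterpart here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

/** Plain-Java model of mapKeys / mapValues; this is not Crunch code. */
final class MapKeysValuesSketch {

    /** Keeps every value and replaces each key with fn(key). */
    static <K, V, K2> List<Map.Entry<K2, V>> mapKeys(List<Map.Entry<K, V>> table,
                                                     Function<K, K2> fn) {
        List<Map.Entry<K2, V>> out = new ArrayList<>();
        for (Map.Entry<K, V> e : table) {
            out.add(Map.entry(fn.apply(e.getKey()), e.getValue()));
        }
        return out;
    }

    /** Keeps every key and replaces each value with fn(value). */
    static <K, V, U> List<Map.Entry<K, U>> mapValues(List<Map.Entry<K, V>> table,
                                                     Function<V, U> fn) {
        List<Map.Entry<K, U>> out = new ArrayList<>();
        for (Map.Entry<K, V> e : table) {
            out.add(Map.entry(e.getKey(), fn.apply(e.getValue())));
        }
        return out;
    }

    public static void main(String[] args) {
        List<Map.Entry<String, Integer>> table =
            List.of(Map.entry("apple", 1), Map.entry("kiwi", 2));
        List<Map.Entry<Integer, Integer>> byLength = mapKeys(table, String::length);
        List<Map.Entry<String, Integer>> scaled = mapValues(table, v -> v * 10);
        System.out.println(byLength); // prints [5=1, 4=2]
        System.out.println(scaled);   // prints [apple=10, kiwi=20]
    }
}
```

Because the model is a list of pairs rather than a map, a key function that sends two different keys to the same result simply yields two pairs with that key.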
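public PTable<K,V> top(int count)
Description copied from interface: PTable
Returns a PTable made up of the pairs in this PTable with the largest value field.
Specified by: top in interface PTable<K,V>
Parameters:
count - The number of pairs to return

public PTable<K,V> bottom(int count)
Description copied from interface: PTable
Returns a PTable made up of the pairs in this PTable with the smallest value field.
Specified by: bottom in interface PTable<K,V>
Parameters:
count - The number of pairs to return

public PTable<K,Collection<V>> collectValues()
Description copied from interface: PTable
Aggregate all of the values with the same key into a single key-value pair in the returned PTable.
Specified by: collectValues in interface PTable<K,V>
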
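As an illustration of what collectValues produces, the following plain-Java sketch groups the values of a list of pairs by key. It is a model of the semantics, not Crunch code: `CollectValuesSketch` is a hypothetical name, and the order of values within each collection (input order here) is an assumption of the sketch.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Plain-Java model of collectValues; this is not Crunch code. */
final class CollectValuesSketch {

    /** Gathers all values that share a key into one collection per key. */
    static <K, V> Map<K, Collection<V>> collectValues(List<Map.Entry<K, V>> table) {
        Map<K, Collection<V>> out = new LinkedHashMap<>();
        for (Map.Entry<K, V> e : table) {
            out.computeIfAbsent(e.getKey(), k -> new ArrayList<>()).add(e.getValue());
        }
        return out;
    }

    public static void main(String[] args) {
        List<Map.Entry<String, Integer>> table =
            List.of(Map.entry("a", 1), Map.entry("b", 2), Map.entry("a", 3));
        System.out.println(collectValues(table)); // prints {a=[1, 3], b=[2]}
    }
}
```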
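public <U> PTable<K,Pair<V,U>> join(PTable<K,U> other)
Description copied from interface: PTable
Perform an inner join on this table and the one passed in as an argument on their common keys.
Specified by: join in interface PTable<K,V>
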
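The shape of the join result, one `Pair<V,U>` per key, can be sketched in plain Java as follows. This models a table as a list of pairs and uses nested `Map.Entry` objects in place of Crunch's `Pair`; `InnerJoinSketch` is a hypothetical name. It assumes the usual inner-join behavior: keys present in only one table are dropped, and a key with several values on either side yields every combination.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Plain-Java model of an inner join on common keys; this is not Crunch code. */
final class InnerJoinSketch {

    static <K, V, U> List<Map.Entry<K, Map.Entry<V, U>>> join(List<Map.Entry<K, V>> left,
                                                              List<Map.Entry<K, U>> right) {
        // Index the right-hand table by key so each left pair can look up its matches.
        Map<K, List<U>> rightByKey = new HashMap<>();
        for (Map.Entry<K, U> r : right) {
            rightByKey.computeIfAbsent(r.getKey(), k -> new ArrayList<>()).add(r.getValue());
        }
        List<Map.Entry<K, Map.Entry<V, U>>> out = new ArrayList<>();
        for (Map.Entry<K, V> l : left) {
            // Keys missing from the right table produce no output (inner join).
            for (U u : rightByKey.getOrDefault(l.getKey(), Collections.emptyList())) {
                out.add(Map.entry(l.getKey(), Map.entry(l.getValue(), u)));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<Map.Entry<String, Integer>> left =
            List.of(Map.entry("a", 1), Map.entry("b", 2), Map.entry("c", 3));
        List<Map.Entry<String, String>> right =
            List.of(Map.entry("a", "x"), Map.entry("a", "y"), Map.entry("d", "z"));
        for (Map.Entry<String, Map.Entry<Integer, String>> e : join(left, right)) {
            System.out.println(e.getKey() + " -> (" + e.getValue().getKey()
                + ", " + e.getValue().getValue() + ")");
        }
    }
}
```

In this example only key `a` appears in both inputs, so the result holds two pairs for `a` and nothing for `b`, `c`, or `d`.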
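public <U> PTable<K,Pair<Collection<V>,Collection<U>>> cogroup(PTable<K,U> other)
Description copied from interface: PTable
Co-group operation with the given table on common keys.
Specified by: cogroup in interface PTable<K,V>
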
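The return type of cogroup pairs, for each key, the collection of values from this table with the collection of values from the other table. The plain-Java sketch below models that shape; it is not Crunch code, and `CogroupSketch` is a hypothetical name. It assumes that a key present in only one input still appears in the output with an empty collection on the other side, which this page does not state explicitly.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Collection;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Plain-Java model of a co-group of two tables; this is not Crunch code. */
final class CogroupSketch {

    static <K, V, U> Map<K, Map.Entry<Collection<V>, Collection<U>>> cogroup(
            List<Map.Entry<K, V>> left, List<Map.Entry<K, U>> right) {
        Map<K, Map.Entry<Collection<V>, Collection<U>>> out = new LinkedHashMap<>();
        for (Map.Entry<K, V> l : left) {
            // Left values go into the first collection of the key's pair.
            out.computeIfAbsent(l.getKey(), k -> CogroupSketch.<V, U>emptyPair())
               .getKey().add(l.getValue());
        }
        for (Map.Entry<K, U> r : right) {
            // Right values go into the second collection of the key's pair.
            out.computeIfAbsent(r.getKey(), k -> CogroupSketch.<V, U>emptyPair())
               .getValue().add(r.getValue());
        }
        return out;
    }

    /** Creates a pair of empty, mutable collections. */
    private static <V, U> Map.Entry<Collection<V>, Collection<U>> emptyPair() {
        Collection<V> vs = new ArrayList<>();
        Collection<U> us = new ArrayList<>();
        return new AbstractMap.SimpleImmutableEntry<>(vs, us);
    }

    public static void main(String[] args) {
        List<Map.Entry<String, Integer>> left =
            List.of(Map.entry("a", 1), Map.entry("a", 2), Map.entry("b", 3));
        List<Map.Entry<String, String>> right =
            List.of(Map.entry("a", "x"), Map.entry("c", "y"));
        System.out.println(cogroup(left, right));
    }
}
```

Unlike the inner join, the co-group emits one record per key rather than one record per matching combination of values.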
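public PCollection<K> keys()
Description copied from interface: PTable
Returns a PCollection made up of the keys in this PTable.
Specified by: keys in interface PTable<K,V>

public PCollection<V> values()
Description copied from interface: PTable
Returns a PCollection made up of the values in this PTable.
Specified by: values in interface PTable<K,V>
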
public Map<K,V> materializeToMap()
Returns a Map<K,V> made up of the keys and values in this PTable.
Specified by: materializeToMap in interface PTable<K,V>

public PObject<Map<K,V>> asMap()
Returns a PObject encapsulating a Map made up of the keys and values in this PTable.
Note: The contents of the returned map may not be exactly the same as this PTable, as a PTable is a multi-map (i.e. can contain multiple values for a single key).
Specified by: asMap in interface PTable<K,V>
Returns: a PObject encapsulating a Map made up of the keys and values in this PTable.
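The note about multi-maps can be made concrete with a small plain-Java sketch. It models a table as a list of pairs and copies it into a `Map`; the class name `AsMapSketch` is hypothetical and no Crunch API is used. Which value survives for a repeated key (the last one seen, in this sketch) is an assumption of the model, since the documentation only says the map may differ from the table.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Plain-Java model of collapsing a multi-map into a Map; this is not Crunch code. */
final class AsMapSketch {

    /** Copies the pairs into a Map; a repeated key keeps only one of its values. */
    static <K, V> Map<K, V> toMap(List<Map.Entry<K, V>> table) {
        Map<K, V> out = new LinkedHashMap<>();
        for (Map.Entry<K, V> e : table) {
            out.put(e.getKey(), e.getValue()); // sketch assumption: a later pair replaces an earlier one
        }
        return out;
    }

    public static void main(String[] args) {
        List<Map.Entry<String, Integer>> table =
            List.of(Map.entry("a", 1), Map.entry("a", 2), Map.entry("b", 3));
        Map<String, Integer> map = toMap(table);
        System.out.println(table.size()); // prints 3
        System.out.println(map.size());   // prints 2
    }
}
```

When every value per key is needed, grouping the values by key first avoids this loss.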