K
- key type for this tableV
- value type for this tablepublic interface LGroupedTable<K,V> extends LCollection<Pair<K,Iterable<V>>>
PGroupedTable
interface, allowing distributed operations to be expressed in
terms of lambda expressions and method references, instead of creating a new class implementation for each operation.Modifier and Type | Method and Description |
---|---|
default LTable<K,Collection<V>> |
collectAllValues()
Collect all values for each key into a
Collection |
default LTable<K,Collection<V>> |
collectUniqueValues()
Collect all unique values for each key into a
Collection (note that the value type must have a correctly-
defined equals() and hashcode(). |
default <C> LTable<K,C> |
collectValues(SSupplier<C> emptySupplier,
SBiConsumer<C,V> addFn,
PType<C> pType)
Collect the values into an aggregate type.
|
default LTable<K,V> |
combineValues(Aggregator<V> aggregator)
Combine the value part of the table using the provided Crunch
Aggregator . |
default <A> LTable<K,V> |
combineValues(SSupplier<A> initialSupplier,
SBiFunction<A,V,A> combineFn,
SFunction<A,Iterable<V>> outputFn)
Combine the value part of the table using the given functions.
|
default PType<K> |
keyType()
Get a
PType which can be used to serialize the key part of this grouped table |
default <T> LTable<K,T> |
mapValues(SFunction<java.util.stream.Stream<V>,T> fn,
PType<T> pType)
Map the values in this LGroupedTable using a custom function.
|
default LTable<K,V> |
reduceValues(SBinaryOperator<V> operator)
Reduce the values for each key using the an associative binary operator.
|
PGroupedTable<K,V> |
underlying()
Get the underlying
PGroupedTable for this LGroupedTable |
default LTable<K,V> |
ungroup()
Ungroup this LGroupedTable back into an
LTable . |
default PType<V> |
valueType()
Get a
PType which can be used to serialize the value part of this grouped table |
by, cache, cache, count, factory, filter, filterMap, filterMap, flatMap, flatMap, increment, increment, incrementIf, incrementIf, map, map, materialize, parallelDo, parallelDo, parallelDo, parallelDo, ptf, pType, union, union, write, write
PGroupedTable<K,V> underlying()
PGroupedTable
for this LGroupedTableunderlying
in interface LCollection<Pair<K,Iterable<V>>>
default LTable<K,V> combineValues(Aggregator<V> aggregator)
Aggregator
. This will be optimised into
both a combine and reduce in the MapReduce implementation, with similar optimisations available for other
implementations.default <A> LTable<K,V> combineValues(SSupplier<A> initialSupplier, SBiFunction<A,V,A> combineFn, SFunction<A,Iterable<V>> outputFn)
myGroupedTable.combineValues(() -> 0, (sum, value) -> sum + value, Collections::singleton)
This will be optimised into both a combine and reduce in the MapReduce implementation, with similar optimizations *available for other implementations.
default <T> LTable<K,T> mapValues(SFunction<java.util.stream.Stream<V>,T> fn, PType<T> pType)
Note that in serialization systems which heavily reuse objects (such as Avro), you may in fact get given the same object multiple times with different data as you consume the stream, meaning it may be necessary to detach values.
default <C> LTable<K,C> collectValues(SSupplier<C> emptySupplier, SBiConsumer<C,V> addFn, PType<C> pType)
The supplier provides an "empty" object, then the consumer is called with each value. For example, to collect
all values into a Collection
, one can do this:
lgt.collectValues(ArrayList::new, Collection::add, lgt.ptf().collections(lgt.valueType()))
This is in fact the default implementation for the collectAllValues() method.
Note that in serialization systems which heavily reuse objects (such as Avro), you may in fact get given the same object multiple times with different data as you consume the stream, meaning it may be necessary to detach values.
default LTable<K,Collection<V>> collectAllValues()
Collection
default LTable<K,Collection<V>> collectUniqueValues()
Collection
(note that the value type must have a correctly-
defined equals() and hashcode().default LTable<K,V> reduceValues(SBinaryOperator<V> operator)
reduceValues((a, b) -> a + b)
for summation, reduceValues((a, b) -> a + ", " + b
for comma-separated string concatenation and reduceValues((a, b) -> a > b ? a : b
for maximum value.default LTable<K,V> ungroup()
LTable
. This will still trigger a "reduce" operation, so is
usually only used in special cases like producing a globally-ordered list by feeding the everything through
a single reducers.default PType<K> keyType()
PType
which can be used to serialize the key part of this grouped tableCopyright © 2016 The Apache Software Foundation. All rights reserved.