public class Aggregate extends Object

Methods for performing various types of aggregations over PCollection instances.

Modifier and Type | Class and Description |
---|---|
static class | Aggregate.PairValueComparator<K,V> |
static class | Aggregate.TopKCombineFn<K,V> |
static class | Aggregate.TopKFn<K,V> |

Constructor and Description |
---|
Aggregate() |
Modifier and Type | Method and Description |
---|---|
static <S> PCollection<S> | aggregate(PCollection<S> collect, Aggregator<S> aggregator) |
static <K,V> PTable<K,Collection<V>> | collectValues(PTable<K,V> collect) |
static <S> PTable<S,Long> | count(PCollection<S> collect): Returns a PTable that contains the unique elements of this collection mapped to a count of their occurrences. |
static <S> PTable<S,Long> | count(PCollection<S> collect, int numPartitions): Returns a PTable that contains the unique elements of this collection mapped to a count of their occurrences. |
static <S> PObject<Long> | length(PCollection<S> collect): Returns the number of elements in the provided PCollection. |
static <S> PObject<S> | max(PCollection<S> collect): Returns the largest numerical element from the input collection. |
static <S> PObject<S> | min(PCollection<S> collect): Returns the smallest numerical element from the input collection. |
static <K,V> PTable<K,V> | top(PTable<K,V> ptable, int limit, boolean maximize): Selects the top N pairs from the given table, with sorting being performed on the values. |

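The semantics of `count` and `length` summarized above can be illustrated with ordinary in-memory collections. The following is a minimal sketch in plain Java, not Crunch code: the class `CountSketch` and its methods are hypothetical names introduced for illustration, and a `List` and `Map` stand in for `PCollection` and `PTable`.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Plain-Java analogy only; this does not use or reproduce the Crunch API.
// CountSketch and its method names are hypothetical.
class CountSketch {

    // Maps each distinct element to the number of times it occurs,
    // mirroring the description of Aggregate.count.
    static <S> Map<S, Long> count(List<S> elements) {
        Map<S, Long> counts = new HashMap<>();
        for (S element : elements) {
            // Start at 1 for a new element, otherwise add 1 to the running count.
            counts.merge(element, 1L, Long::sum);
        }
        return counts;
    }

    // Total number of elements, duplicates included,
    // mirroring the description of Aggregate.length.
    static <S> long length(List<S> elements) {
        return elements.size();
    }
}
```

Under these assumptions, `count` yields one entry per distinct element, whereas `length` counts every element, so the two results differ whenever the input contains duplicates.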
public static <S> PTable<S,Long> count(PCollection<S> collect)

Returns a PTable that contains the unique elements of this collection mapped to a count of their occurrences.

public static <S> PTable<S,Long> count(PCollection<S> collect, int numPartitions)

Returns a PTable that contains the unique elements of this collection mapped to a count of their occurrences.

public static <S> PObject<Long> length(PCollection<S> collect)

Type Parameters:
S - The type of the PCollection.
Parameters:
collect - The PCollection whose elements should be counted.
Returns:
A PObject containing the number of elements in the PCollection.

public static <K,V> PTable<K,V> top(PTable<K,V> ptable, int limit, boolean maximize)

Parameters:
ptable - table containing the pairs from which the top N is to be selected
limit - number of top elements to select
maximize - if true, the maximum N values from the table will be selected, otherwise the minimal N values will be selected

public static <S> PObject<S> max(PCollection<S> collect)

public static <S> PObject<S> min(PCollection<S> collect)

public static <K,V> PTable<K,Collection<V>> collectValues(PTable<K,V> collect)

public static <S> PCollection<S> aggregate(PCollection<S> collect, Aggregator<S> aggregator)

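The effect of the `limit` and `maximize` parameters of `top` can be sketched with in-memory collections. This is a plain-Java analogy under stated assumptions, not the Crunch implementation: `TopSketch` is a hypothetical class, a list of `Map.Entry` objects stands in for a `PTable`, and returning the selected pairs in sorted order is a choice made by this sketch.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;

// Plain-Java analogy only; this does not use or reproduce the Crunch API.
// TopSketch is a hypothetical name.
class TopSketch {

    // Selects up to `limit` pairs, ordered by value.
    // maximize == true keeps the largest values, false keeps the smallest.
    static <K, V extends Comparable<V>> List<Map.Entry<K, V>> top(
            List<Map.Entry<K, V>> pairs, int limit, boolean maximize) {
        Comparator<Map.Entry<K, V>> byValue = Map.Entry.comparingByValue();
        if (maximize) {
            byValue = byValue.reversed();
        }
        // Sort a copy so the caller's list is left untouched.
        List<Map.Entry<K, V>> sorted = new ArrayList<>(pairs);
        sorted.sort(byValue);
        // Clamp the cut-off to the range [0, size].
        int end = Math.max(0, Math.min(limit, sorted.size()));
        return new ArrayList<>(sorted.subList(0, end));
    }
}
```

Sorting the whole input is the simplest way to show the selection rule. For large inputs, a bounded priority queue holding only `limit` entries would avoid the full sort.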
Copyright © 2016 The Apache Software Foundation. All rights reserved.