java.lang.Object
  org.apache.crunch.fn.Aggregators
public final class Aggregators
A collection of pre-defined Aggregators.
The factory methods of this class return Aggregator instances that you can use to combine the values of a PGroupedTable.
In most cases, they turn a multimap (multiple entries per key) into a map (one entry per key).
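The multimap-to-map idea can be illustrated without Crunch at all. The sketch below uses plain Java collections; the class and method names are hypothetical and it is not how Crunch implements aggregation, only a picture of what "one entry per key" means for a sum.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Plain-Java illustration only; it does not use or reproduce Crunch classes.
class MultimapToMapSketch {

    // Collapse every key's list of values into a single sum.
    static Map<String, Long> sumPerKey(Map<String, List<Long>> grouped) {
        Map<String, Long> result = new LinkedHashMap<>();
        for (Map.Entry<String, List<Long>> entry : grouped.entrySet()) {
            long sum = 0L;
            for (long value : entry.getValue()) {
                sum += value;
            }
            result.put(entry.getKey(), sum);
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, List<Long>> grouped = new LinkedHashMap<>();
        grouped.put("a", Arrays.asList(1L, 2L, 3L)); // three entries for key "a"
        grouped.put("b", Arrays.asList(10L));        // one entry for key "b"
        System.out.println(sumPerKey(grouped));      // prints {a=6, b=10}
    }
}
```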
Note: When using composed aggregators, such as those built by the pairAggregator() factory method, you typically should not pass the same child aggregator instance more than once, even if all child aggregators have the same type. In most cases, this is what you want, with a separate factory call for each component:
PTable<K, Long> result = groupedTable.combineValues( pairAggregator(SUM_LONGS(), SUM_LONGS()) );
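The note does not say why reusing a child aggregator is a problem. One plausible reading, assuming aggregators hold state while they process the values of a key, is that a shared instance would mix the components together. The sketch below shows that effect with hypothetical stand-in types (SumLongs and aggregatePairs are invented for illustration and are not Crunch classes).

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-ins; these are not Crunch's Aggregator or Pair types.
class SharedChildSketch {

    // A stateful aggregator: it accumulates values until result() is read.
    static final class SumLongs {
        private long sum;
        void update(long value) { sum += value; }
        long result() { return sum; }
    }

    // Feed the first component of each pair to 'first' and the second to 'second'.
    static long[] aggregatePairs(List<long[]> pairs, SumLongs first, SumLongs second) {
        for (long[] pair : pairs) {
            first.update(pair[0]);
            second.update(pair[1]);
        }
        return new long[] { first.result(), second.result() };
    }

    public static void main(String[] args) {
        List<long[]> pairs = Arrays.asList(new long[] {1, 10}, new long[] {2, 20});

        // Two separate instances: each component is summed on its own.
        long[] separate = aggregatePairs(pairs, new SumLongs(), new SumLongs());
        System.out.println(Arrays.toString(separate)); // prints [3, 30]

        // One instance passed twice: both components land in the same sum.
        SumLongs shared = new SumLongs();
        long[] mixed = aggregatePairs(pairs, shared, shared);
        System.out.println(Arrays.toString(mixed)); // prints [33, 33]
    }
}
```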
Nested Class Summary | |
---|---|
static class | Aggregators.SimpleAggregator<T>: Base class for aggregators that do not require any initialization. |
Method Summary | | |
---|---|---|
static <V> Aggregator<V> | FIRST_N(int n) | Return the first n values (or fewer if there are fewer values than n). |
static <V> Aggregator<V> | LAST_N(int n) | Return the last n values (or fewer if there are fewer values than n). |
static Aggregator<BigInteger> | MAX_BIGINTS() | Return the maximum of all given BigInteger values. |
static Aggregator<BigInteger> | MAX_BIGINTS(int n) | Return the n largest BigInteger values (or fewer if there are fewer values than n). |
static Aggregator<Double> | MAX_DOUBLES() | Return the maximum of all given double values. |
static Aggregator<Double> | MAX_DOUBLES(int n) | Return the n largest double values (or fewer if there are fewer values than n). |
static Aggregator<Float> | MAX_FLOATS() | Return the maximum of all given float values. |
static Aggregator<Float> | MAX_FLOATS(int n) | Return the n largest float values (or fewer if there are fewer values than n). |
static Aggregator<Integer> | MAX_INTS() | Return the maximum of all given int values. |
static Aggregator<Integer> | MAX_INTS(int n) | Return the n largest int values (or fewer if there are fewer values than n). |
static Aggregator<Long> | MAX_LONGS() | Return the maximum of all given long values. |
static Aggregator<Long> | MAX_LONGS(int n) | Return the n largest long values (or fewer if there are fewer values than n). |
static <V extends Comparable<V>> Aggregator<V> | MAX_N(int n, Class<V> cls) | Return the n largest values (or fewer if there are fewer values than n). |
static Aggregator<BigInteger> | MIN_BIGINTS() | Return the minimum of all given BigInteger values. |
static Aggregator<BigInteger> | MIN_BIGINTS(int n) | Return the n smallest BigInteger values (or fewer if there are fewer values than n). |
static Aggregator<Double> | MIN_DOUBLES() | Return the minimum of all given double values. |
static Aggregator<Double> | MIN_DOUBLES(int n) | Return the n smallest double values (or fewer if there are fewer values than n). |
static Aggregator<Float> | MIN_FLOATS() | Return the minimum of all given float values. |
static Aggregator<Float> | MIN_FLOATS(int n) | Return the n smallest float values (or fewer if there are fewer values than n). |
static Aggregator<Integer> | MIN_INTS() | Return the minimum of all given int values. |
static Aggregator<Integer> | MIN_INTS(int n) | Return the n smallest int values (or fewer if there are fewer values than n). |
static Aggregator<Long> | MIN_LONGS() | Return the minimum of all given long values. |
static Aggregator<Long> | MIN_LONGS(int n) | Return the n smallest long values (or fewer if there are fewer values than n). |
static <V extends Comparable<V>> Aggregator<V> | MIN_N(int n, Class<V> cls) | Return the n smallest values (or fewer if there are fewer values than n). |
static <V1,V2> Aggregator<Pair<V1,V2>> | pairAggregator(Aggregator<V1> a1, Aggregator<V2> a2) | Apply separate aggregators to each component of a Pair. |
static <V1,V2,V3,V4> Aggregator<Tuple4<V1,V2,V3,V4>> | quadAggregator(Aggregator<V1> a1, Aggregator<V2> a2, Aggregator<V3> a3, Aggregator<V4> a4) | Apply separate aggregators to each component of a Tuple4. |
static <V> Aggregator<V> | SAMPLE_UNIQUE_ELEMENTS(int maximumSampleSize) | Collect a sample of unique elements from the input, where 'unique' is defined by the equals method for the input objects. |
static Aggregator<String> | STRING_CONCAT(String separator, boolean skipNull) | Concatenate strings, with a separator between strings. |
static Aggregator<String> | STRING_CONCAT(String separator, boolean skipNull, long maxOutputLength, long maxInputLength) | Concatenate strings, with a separator between strings. |
static Aggregator<BigInteger> | SUM_BIGINTS() | Sum up all BigInteger values. |
static Aggregator<Double> | SUM_DOUBLES() | Sum up all double values. |
static Aggregator<Float> | SUM_FLOATS() | Sum up all float values. |
static Aggregator<Integer> | SUM_INTS() | Sum up all int values. |
static Aggregator<Long> | SUM_LONGS() | Sum up all long values. |
static <K,V> CombineFn<K,V> | toCombineFn(Aggregator<V> aggregator) | Wrap a CombineFn adapter around the given aggregator. |
static <V1,V2,V3> Aggregator<Tuple3<V1,V2,V3>> | tripAggregator(Aggregator<V1> a1, Aggregator<V2> a2, Aggregator<V3> a3) | Apply separate aggregators to each component of a Tuple3. |
static Aggregator<TupleN> | tupleAggregator(Aggregator<?>... aggregators) | Apply separate aggregators to each component of a Tuple. |
static <V> Aggregator<V> | UNIQUE_ELEMENTS() | Collect the unique elements of the input, as defined by the equals method for the input objects. |
Methods inherited from class java.lang.Object |
---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
Method Detail |
---|
public static Aggregator<Long> SUM_LONGS()
Sum up all long values.
public static Aggregator<Integer> SUM_INTS()
Sum up all int values.
public static Aggregator<Float> SUM_FLOATS()
Sum up all float values.
public static Aggregator<Double> SUM_DOUBLES()
Sum up all double values.
public static Aggregator<BigInteger> SUM_BIGINTS()
Sum up all BigInteger values.
public static Aggregator<Long> MAX_LONGS()
Return the maximum of all given long values.
public static Aggregator<Long> MAX_LONGS(int n)
Return the n largest long values (or fewer if there are fewer values than n).
Parameters:
n - The number of values to return
public static Aggregator<Integer> MAX_INTS()
Return the maximum of all given int values.
public static Aggregator<Integer> MAX_INTS(int n)
Return the n largest int values (or fewer if there are fewer values than n).
Parameters:
n - The number of values to return
public static Aggregator<Float> MAX_FLOATS()
Return the maximum of all given float values.
public static Aggregator<Float> MAX_FLOATS(int n)
Return the n largest float values (or fewer if there are fewer values than n).
Parameters:
n - The number of values to return
public static Aggregator<Double> MAX_DOUBLES()
Return the maximum of all given double values.
public static Aggregator<Double> MAX_DOUBLES(int n)
Return the n largest double values (or fewer if there are fewer values than n).
Parameters:
n - The number of values to return
public static Aggregator<BigInteger> MAX_BIGINTS()
Return the maximum of all given BigInteger values.
public static Aggregator<BigInteger> MAX_BIGINTS(int n)
Return the n largest BigInteger values (or fewer if there are fewer values than n).
Parameters:
n - The number of values to return
public static <V extends Comparable<V>> Aggregator<V> MAX_N(int n, Class<V> cls)
Return the n largest values (or fewer if there are fewer values than n).
Parameters:
n - The number of values to return
cls - The type of the values to aggregate (must implement Comparable!)
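The "n largest values" behaviour described for the MAX_* variants can be pictured with a bounded min-heap. This is a minimal sketch in plain Java, not Crunch's implementation; MaxNSketch and maxN are hypothetical names, and keeping duplicate values is an assumption that the documentation above does not confirm either way. MIN_N would be the mirror image with a max-heap.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.PriorityQueue;

// Plain-Java illustration only; not the Crunch implementation.
class MaxNSketch {

    // Return up to n of the largest values, largest first.
    // Assumption: duplicate values are kept as separate results.
    static <V extends Comparable<V>> List<V> maxN(int n, Iterable<V> values) {
        PriorityQueue<V> heap = new PriorityQueue<>(); // min-heap: smallest kept value on top
        for (V value : values) {
            heap.add(value);
            if (heap.size() > n) {
                heap.poll(); // drop the smallest so at most n values remain
            }
        }
        List<V> result = new ArrayList<>(heap);
        Collections.sort(result, Collections.reverseOrder());
        return result;
    }
}
```

For example, `MaxNSketch.maxN(2, Arrays.asList(5, 1, 9, 3))` yields the list `[9, 5]`, and asking for more values than exist simply returns all of them.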
public static Aggregator<Long> MIN_LONGS()
Return the minimum of all given long values.
public static Aggregator<Long> MIN_LONGS(int n)
Return the n smallest long values (or fewer if there are fewer values than n).
Parameters:
n - The number of values to return
public static Aggregator<Integer> MIN_INTS()
Return the minimum of all given int values.
public static Aggregator<Integer> MIN_INTS(int n)
Return the n smallest int values (or fewer if there are fewer values than n).
Parameters:
n - The number of values to return
public static Aggregator<Float> MIN_FLOATS()
Return the minimum of all given float values.
public static Aggregator<Float> MIN_FLOATS(int n)
Return the n smallest float values (or fewer if there are fewer values than n).
Parameters:
n - The number of values to return
public static Aggregator<Double> MIN_DOUBLES()
Return the minimum of all given double values.
public static Aggregator<Double> MIN_DOUBLES(int n)
Return the n smallest double values (or fewer if there are fewer values than n).
Parameters:
n - The number of values to return
public static Aggregator<BigInteger> MIN_BIGINTS()
Return the minimum of all given BigInteger values.
public static Aggregator<BigInteger> MIN_BIGINTS(int n)
Return the n smallest BigInteger values (or fewer if there are fewer values than n).
Parameters:
n - The number of values to return
public static <V extends Comparable<V>> Aggregator<V> MIN_N(int n, Class<V> cls)
Return the n smallest values (or fewer if there are fewer values than n).
Parameters:
n - The number of values to return
cls - The type of the values to aggregate (must implement Comparable!)
public static <V> Aggregator<V> FIRST_N(int n)
Return the first n values (or fewer if there are fewer values than n).
Parameters:
n - The number of values to return
public static <V> Aggregator<V> LAST_N(int n)
Return the last n values (or fewer if there are fewer values than n).
Parameters:
n - The number of values to return
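As a rough picture of "first n" and "last n", the sketch below walks a sequence once and keeps either the leading values or a sliding window of the trailing ones. It uses plain Java only; FirstLastNSketch, firstN, and lastN are hypothetical names and nothing here is taken from Crunch's source.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Plain-Java illustration only; not the Crunch implementation.
class FirstLastNSketch {

    // Keep the first n values in encounter order.
    static <V> List<V> firstN(int n, Iterable<V> values) {
        List<V> result = new ArrayList<>();
        for (V value : values) {
            if (result.size() >= n) {
                break; // already have n values, the rest is ignored
            }
            result.add(value);
        }
        return result;
    }

    // Keep the last n values in encounter order using a sliding window.
    // Note: ArrayDeque rejects null elements, so this sketch assumes non-null input.
    static <V> List<V> lastN(int n, Iterable<V> values) {
        Deque<V> window = new ArrayDeque<>();
        for (V value : values) {
            window.addLast(value);
            if (window.size() > n) {
                window.removeFirst(); // evict the oldest value
            }
        }
        return new ArrayList<>(window);
    }
}
```

Both results depend on the order in which values arrive, so they are only as deterministic as that order.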
public static Aggregator<String> STRING_CONCAT(String separator, boolean skipNull)
Concatenate strings, with a separator between strings.
Note: String concatenation is not commutative, which means the result of the aggregation is not deterministic!
Parameters:
separator - the separator which will be appended between each string
skipNull - whether to skip null values. If set to false and there is a null value, a NullPointerException is thrown.
public static Aggregator<String> STRING_CONCAT(String separator, boolean skipNull, long maxOutputLength, long maxInputLength)
Concatenate strings, with a separator between strings. Any string that is too large (or that would make the output too large) is silently discarded.
Note: String concatenation is not commutative, which means the result of the aggregation is not deterministic!
Parameters:
separator - the separator which will be appended between each string
skipNull - whether to skip null values. If set to false and there is a null value, a NullPointerException is thrown.
maxOutputLength - the maximum length of the output string. If it's set <= 0, there is no limit. The number of characters of the output string will be < maxOutputLength.
maxInputLength - the maximum length of the input strings. If it's set <= 0, there is no limit. An input string is concatenated only if its number of characters is < maxInputLength.
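The parameter rules for the four-argument STRING_CONCAT can be read as a small filter-then-append loop. The following is a minimal sketch of that reading in plain Java, not Crunch's code: StringConcatSketch and concat are hypothetical names, and counting the separator toward the output limit is an assumption the documentation does not spell out.

```java
// Plain-Java illustration only; not the Crunch implementation.
class StringConcatSketch {

    static String concat(Iterable<String> values, String separator, boolean skipNull,
                         long maxOutputLength, long maxInputLength) {
        StringBuilder out = new StringBuilder();
        boolean first = true;
        for (String value : values) {
            if (value == null) {
                if (skipNull) {
                    continue; // nulls are ignored when skipNull is true
                }
                throw new NullPointerException("null value and skipNull is false");
            }
            // Input limit: a value is used only if its length is < maxInputLength.
            if (maxInputLength > 0 && value.length() >= maxInputLength) {
                continue; // silently discarded
            }
            // Output limit (assumption: the separator counts toward the length).
            long newLength = out.length() + (first ? 0 : separator.length()) + value.length();
            if (maxOutputLength > 0 && newLength >= maxOutputLength) {
                continue; // silently discarded, later shorter values may still fit
            }
            if (!first) {
                out.append(separator);
            }
            out.append(value);
            first = false;
        }
        return out.toString();
    }
}
```

With the inputs "abc", "def", "g", a separator of "-", and maxOutputLength set to 6, this sketch returns "abc-g": adding "def" would reach 7 characters and is dropped, while "g" still fits.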
public static <V> Aggregator<V> UNIQUE_ELEMENTS()
Collect the unique elements of the input, as defined by the equals method for the input objects. No guarantees are made about the order in which the final elements will be returned.
public static <V> Aggregator<V> SAMPLE_UNIQUE_ELEMENTS(int maximumSampleSize)
Collect a sample of unique elements from the input, where 'unique' is defined by the equals method for the input objects. No guarantees are made about which elements will be returned, simply that there will not be any more than the given sample size for any key.
Parameters:
maximumSampleSize - The maximum number of unique elements to return per key
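A capped set is one simple way to picture a bounded sample of unique elements. The sketch below is plain Java with hypothetical names (UniqueSampleSketch, sampleUnique) and makes no claim about how Crunch selects its sample; it also relies on hashCode being consistent with equals, as HashSet requires.

```java
import java.util.HashSet;
import java.util.Set;

// Plain-Java illustration only; not the Crunch implementation.
class UniqueSampleSketch {

    // Collect distinct values until the sample holds maximumSampleSize elements.
    static <V> Set<V> sampleUnique(int maximumSampleSize, Iterable<V> values) {
        Set<V> sample = new HashSet<>();
        for (V value : values) {
            if (sample.size() >= maximumSampleSize) {
                break; // the sample is full, remaining values are ignored
            }
            sample.add(value); // duplicates (by equals/hashCode) are not added twice
        }
        return sample;
    }
}
```

Which elements end up in the sample depends on the order of the input, which matches the "no guarantees" wording above.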
public static <V1,V2> Aggregator<Pair<V1,V2>> pairAggregator(Aggregator<V1> a1, Aggregator<V2> a2)
Apply separate aggregators to each component of a Pair.
public static <V1,V2,V3> Aggregator<Tuple3<V1,V2,V3>> tripAggregator(Aggregator<V1> a1, Aggregator<V2> a2, Aggregator<V3> a3)
Apply separate aggregators to each component of a Tuple3.
public static <V1,V2,V3,V4> Aggregator<Tuple4<V1,V2,V3,V4>> quadAggregator(Aggregator<V1> a1, Aggregator<V2> a2, Aggregator<V3> a3, Aggregator<V4> a4)
Apply separate aggregators to each component of a Tuple4.
public static Aggregator<TupleN> tupleAggregator(Aggregator<?>... aggregators)
Apply separate aggregators to each component of a Tuple.
public static final <K,V> CombineFn<K,V> toCombineFn(Aggregator<V> aggregator)
Wrap a CombineFn adapter around the given aggregator.
Parameters:
aggregator - The instance to wrap
Returns:
A CombineFn delegating to aggregator