This project has retired. For details please refer to its Attic page.
CombineFn (Apache Crunch 0.3.0-incubating API)

org.apache.crunch
Class CombineFn<S,T>

java.lang.Object
  extended by org.apache.crunch.DoFn<Pair<S,Iterable<T>>,Pair<S,T>>
      extended by org.apache.crunch.CombineFn<S,T>
All Implemented Interfaces:
Serializable
Direct Known Subclasses:
Aggregate.TopKCombineFn, CombineFn.AggregatorCombineFn

public abstract class CombineFn<S,T>
extends DoFn<Pair<S,Iterable<T>>,Pair<S,T>>

A special DoFn implementation that converts an Iterable of values into a single value. If a CombineFn instance is used on a PGroupedTable, the function is applied to the output of the map stage before the data is passed to the reducer. Because this shrinks the data that has to be shuffled between the two stages, it can improve the runtime of jobs whose aggregation can be computed from partial results.
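The effect of combining on the map side can be illustrated with a small standalone model. The sketch below does not use the Crunch API at all: the class `CombinerSketch` and its methods `combine` and `reduce` are hypothetical names invented for illustration. It only shows that, for an associative and commutative operation such as a sum, collapsing each map task's values first and merging the partial results afterwards gives the same answer as aggregating everything at once.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BinaryOperator;

// Hypothetical standalone model; none of these names exist in Crunch.
final class CombinerSketch {
    /** Collapses each key's values into one value; stands in for one combine pass on a map task. */
    static <K, V> Map<K, V> combine(Map<K, List<V>> grouped, BinaryOperator<V> op) {
        Map<K, V> out = new LinkedHashMap<>();
        for (Map.Entry<K, List<V>> e : grouped.entrySet()) {
            V acc = null;
            for (V v : e.getValue()) {
                acc = (acc == null) ? v : op.apply(acc, v);
            }
            // Keys with no values produce no output.
            if (acc != null) {
                out.put(e.getKey(), acc);
            }
        }
        return out;
    }

    /** Merges the already-combined outputs of several map tasks, as the reduce side would. */
    static <K, V> Map<K, V> reduce(List<Map<K, V>> partials, BinaryOperator<V> op) {
        Map<K, V> out = new LinkedHashMap<>();
        for (Map<K, V> partial : partials) {
            for (Map.Entry<K, V> e : partial.entrySet()) {
                // Insert the value for a new key, otherwise fold it into the existing one.
                out.merge(e.getKey(), e.getValue(), op);
            }
        }
        return out;
    }
}
```

Calling `combine(task, Long::sum)` on each task's grouped values and then `reduce(partials, Long::sum)` moves one value per key per task instead of every raw value. An operation such as a plain average would not survive this two-step treatment unchanged, because the average of partial averages is not, in general, the overall average.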

See Also:
Serialized Form

Nested Class Summary
static interface CombineFn.Aggregator<T>
           
static class CombineFn.AggregatorCombineFn<K,V>
          A CombineFn that delegates all of the actual work to an Aggregator instance.
static interface CombineFn.AggregatorFactory<T>
          Interface for constructing new aggregator instances.
static class CombineFn.FirstNAggregator<V>
           
static class CombineFn.LastNAggregator<V>
           
static class CombineFn.MaxBigInts
           
static class CombineFn.MaxDoubles
           
static class CombineFn.MaxFloats
           
static class CombineFn.MaxInts
           
static class CombineFn.MaxLongs
           
static class CombineFn.MaxNAggregator<V extends Comparable<V>>
           
static class CombineFn.MinBigInts
           
static class CombineFn.MinDoubles
           
static class CombineFn.MinFloats
           
static class CombineFn.MinInts
           
static class CombineFn.MinLongs
           
static class CombineFn.MinNAggregator<V extends Comparable<V>>
           
static class CombineFn.PairAggregator<V1,V2>
           
static class CombineFn.QuadAggregator<A,B,C,D>
           
static class CombineFn.StringConcatAggregator
           
static class CombineFn.SumBigInts
           
static class CombineFn.SumDoubles
           
static class CombineFn.SumFloats
           
static class CombineFn.SumInts
           
static class CombineFn.SumLongs
           
static class CombineFn.TripAggregator<A,B,C>
           
static class CombineFn.TupleNAggregator
           
 
Field Summary
static CombineFn.AggregatorFactory<BigInteger> MAX_BIGINTS
           
static CombineFn.AggregatorFactory<Double> MAX_DOUBLES
           
static CombineFn.AggregatorFactory<Float> MAX_FLOATS
           
static CombineFn.AggregatorFactory<Integer> MAX_INTS
           
static CombineFn.AggregatorFactory<Long> MAX_LONGS
           
static CombineFn.AggregatorFactory<BigInteger> MIN_BIGINTS
           
static CombineFn.AggregatorFactory<Double> MIN_DOUBLES
           
static CombineFn.AggregatorFactory<Float> MIN_FLOATS
           
static CombineFn.AggregatorFactory<Integer> MIN_INTS
           
static CombineFn.AggregatorFactory<Long> MIN_LONGS
           
static CombineFn.AggregatorFactory<BigInteger> SUM_BIGINTS
           
static CombineFn.AggregatorFactory<Double> SUM_DOUBLES
           
static CombineFn.AggregatorFactory<Float> SUM_FLOATS
           
static CombineFn.AggregatorFactory<Integer> SUM_INTS
           
static CombineFn.AggregatorFactory<Long> SUM_LONGS
           
 
Constructor Summary
CombineFn()
           
 
Method Summary
static
<K,V> CombineFn<K,V>
aggregator(CombineFn.Aggregator<V> aggregator)
           
static
<K,V> CombineFn<K,V>
aggregatorFactory(CombineFn.AggregatorFactory<V> aggregator)
           
static
<K,V> CombineFn<K,V>
FIRST_N(int n)
           
static
<K,V> CombineFn<K,V>
LAST_N(int n)
           
static
<K> CombineFn<K,BigInteger>
MAX_BIGINTS()
           
static
<K> CombineFn<K,BigInteger>
MAX_BIGINTS(int n)
           
static
<K> CombineFn<K,Double>
MAX_DOUBLES()
           
static
<K> CombineFn<K,Double>
MAX_DOUBLES(int n)
           
static
<K> CombineFn<K,Float>
MAX_FLOATS()
           
static
<K> CombineFn<K,Float>
MAX_FLOATS(int n)
           
static
<K> CombineFn<K,Integer>
MAX_INTS()
           
static
<K> CombineFn<K,Integer>
MAX_INTS(int n)
           
static
<K> CombineFn<K,Long>
MAX_LONGS()
           
static
<K> CombineFn<K,Long>
MAX_LONGS(int n)
           
static
<K> CombineFn<K,BigInteger>
MIN_BIGINTS()
           
static
<K> CombineFn<K,BigInteger>
MIN_BIGINTS(int n)
           
static
<K> CombineFn<K,Double>
MIN_DOUBLES()
           
static
<K> CombineFn<K,Double>
MIN_DOUBLES(int n)
           
static
<K> CombineFn<K,Float>
MIN_FLOATS()
           
static
<K> CombineFn<K,Float>
MIN_FLOATS(int n)
           
static
<K> CombineFn<K,Integer>
MIN_INTS()
           
static
<K> CombineFn<K,Integer>
MIN_INTS(int n)
           
static
<K> CombineFn<K,Long>
MIN_LONGS()
           
static
<K> CombineFn<K,Long>
MIN_LONGS(int n)
           
static
<K,V1,V2> CombineFn<K,Pair<V1,V2>>
pairAggregator(CombineFn.AggregatorFactory<V1> a1, CombineFn.AggregatorFactory<V2> a2)
           
static
<K,A,B,C,D>
CombineFn<K,Tuple4<A,B,C,D>>
quadAggregator(CombineFn.AggregatorFactory<A> a1, CombineFn.AggregatorFactory<B> a2, CombineFn.AggregatorFactory<C> a3, CombineFn.AggregatorFactory<D> a4)
           
static
<K> CombineFn<K,String>
STRING_CONCAT(String separator, boolean skipNull)
          Concatenates strings, with a separator between each pair of adjacent strings.
static
<K> CombineFn<K,String>
STRING_CONCAT(String separator, boolean skipNull, long maxOutputLength, long maxInputLength)
          Concatenates strings, with a separator between each pair of adjacent strings.
static
<K> CombineFn<K,BigInteger>
SUM_BIGINTS()
           
static
<K> CombineFn<K,Double>
SUM_DOUBLES()
           
static
<K> CombineFn<K,Float>
SUM_FLOATS()
           
static
<K> CombineFn<K,Integer>
SUM_INTS()
           
static
<K> CombineFn<K,Long>
SUM_LONGS()
           
static
<K,A,B,C> CombineFn<K,Tuple3<A,B,C>>
tripAggregator(CombineFn.AggregatorFactory<A> a1, CombineFn.AggregatorFactory<B> a2, CombineFn.AggregatorFactory<C> a3)
           
static
<K> CombineFn<K,TupleN>
tupleAggregator(CombineFn.AggregatorFactory<?>... factories)
           
 
Methods inherited from class org.apache.crunch.DoFn
cleanup, configure, initialize, process, scaleFactor, setConfigurationForTest, setContext
 
Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

SUM_LONGS

public static CombineFn.AggregatorFactory<Long> SUM_LONGS

SUM_INTS

public static CombineFn.AggregatorFactory<Integer> SUM_INTS

SUM_FLOATS

public static CombineFn.AggregatorFactory<Float> SUM_FLOATS

SUM_DOUBLES

public static CombineFn.AggregatorFactory<Double> SUM_DOUBLES

SUM_BIGINTS

public static CombineFn.AggregatorFactory<BigInteger> SUM_BIGINTS

MAX_LONGS

public static CombineFn.AggregatorFactory<Long> MAX_LONGS

MAX_INTS

public static CombineFn.AggregatorFactory<Integer> MAX_INTS

MAX_FLOATS

public static CombineFn.AggregatorFactory<Float> MAX_FLOATS

MAX_DOUBLES

public static CombineFn.AggregatorFactory<Double> MAX_DOUBLES

MAX_BIGINTS

public static CombineFn.AggregatorFactory<BigInteger> MAX_BIGINTS

MIN_LONGS

public static CombineFn.AggregatorFactory<Long> MIN_LONGS

MIN_INTS

public static CombineFn.AggregatorFactory<Integer> MIN_INTS

MIN_FLOATS

public static CombineFn.AggregatorFactory<Float> MIN_FLOATS

MIN_DOUBLES

public static CombineFn.AggregatorFactory<Double> MIN_DOUBLES

MIN_BIGINTS

public static CombineFn.AggregatorFactory<BigInteger> MIN_BIGINTS
Constructor Detail

CombineFn

public CombineFn()
Method Detail

aggregator

public static final <K,V> CombineFn<K,V> aggregator(CombineFn.Aggregator<V> aggregator)

aggregatorFactory

public static final <K,V> CombineFn<K,V> aggregatorFactory(CombineFn.AggregatorFactory<V> aggregator)
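The nested class summary describes AggregatorCombineFn as delegating all of the actual work to an Aggregator instance, and AggregatorFactory as an interface for constructing new aggregator instances. This page does not list the methods of either interface, so the following is only a standalone sketch of that delegation pattern. The names `SimpleAggregator`, `SimpleAggregatorFactory`, `SumLongs`, and the `reset`/`update`/`results`/`create` methods are assumptions made for illustration, not the documented Crunch signatures.

```java
import java.util.Collections;

// Hypothetical standalone model of a delegating combine step; not the Crunch API.
final class AggregatorSketch {
    /** Assumed aggregator shape: clear state, accumulate values, then report results. */
    interface SimpleAggregator<T> {
        void reset();
        void update(T value);
        Iterable<T> results();
    }

    /** Assumed factory shape: hands out a fresh aggregator instance on each call. */
    interface SimpleAggregatorFactory<T> {
        SimpleAggregator<T> create();
    }

    /** Sum-of-longs aggregator; it holds mutable state, so instances should not be shared. */
    static final class SumLongs implements SimpleAggregator<Long> {
        private long sum;

        @Override public void reset() { sum = 0L; }
        @Override public void update(Long value) { sum += value; }
        @Override public Iterable<Long> results() { return Collections.singletonList(sum); }
    }

    /** Delegating combine step: create, reset, feed every value, return the aggregator's results. */
    static <T> Iterable<T> combine(SimpleAggregatorFactory<T> factory, Iterable<T> values) {
        SimpleAggregator<T> agg = factory.create();
        agg.reset();
        for (T v : values) {
            agg.update(v);
        }
        return agg.results();
    }
}
```

In this model a factory is just something that yields a new aggregator, so a constructor reference such as `AggregatorSketch.SumLongs::new` can serve as one. Going through a factory rather than a shared instance keeps the mutable running total private to each caller.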

pairAggregator

public static final <K,V1,V2> CombineFn<K,Pair<V1,V2>> pairAggregator(CombineFn.AggregatorFactory<V1> a1,
                                                                      CombineFn.AggregatorFactory<V2> a2)

tripAggregator

public static final <K,A,B,C> CombineFn<K,Tuple3<A,B,C>> tripAggregator(CombineFn.AggregatorFactory<A> a1,
                                                                        CombineFn.AggregatorFactory<B> a2,
                                                                        CombineFn.AggregatorFactory<C> a3)

quadAggregator

public static final <K,A,B,C,D> CombineFn<K,Tuple4<A,B,C,D>> quadAggregator(CombineFn.AggregatorFactory<A> a1,
                                                                            CombineFn.AggregatorFactory<B> a2,
                                                                            CombineFn.AggregatorFactory<C> a3,
                                                                            CombineFn.AggregatorFactory<D> a4)

tupleAggregator

public static final <K> CombineFn<K,TupleN> tupleAggregator(CombineFn.AggregatorFactory<?>... factories)

SUM_LONGS

public static final <K> CombineFn<K,Long> SUM_LONGS()

SUM_INTS

public static final <K> CombineFn<K,Integer> SUM_INTS()

SUM_FLOATS

public static final <K> CombineFn<K,Float> SUM_FLOATS()

SUM_DOUBLES

public static final <K> CombineFn<K,Double> SUM_DOUBLES()

SUM_BIGINTS

public static final <K> CombineFn<K,BigInteger> SUM_BIGINTS()

MAX_LONGS

public static final <K> CombineFn<K,Long> MAX_LONGS()

MAX_LONGS

public static final <K> CombineFn<K,Long> MAX_LONGS(int n)

MAX_INTS

public static final <K> CombineFn<K,Integer> MAX_INTS()

MAX_INTS

public static final <K> CombineFn<K,Integer> MAX_INTS(int n)

MAX_FLOATS

public static final <K> CombineFn<K,Float> MAX_FLOATS()

MAX_FLOATS

public static final <K> CombineFn<K,Float> MAX_FLOATS(int n)

MAX_DOUBLES

public static final <K> CombineFn<K,Double> MAX_DOUBLES()

MAX_DOUBLES

public static final <K> CombineFn<K,Double> MAX_DOUBLES(int n)

MAX_BIGINTS

public static final <K> CombineFn<K,BigInteger> MAX_BIGINTS()

MAX_BIGINTS

public static final <K> CombineFn<K,BigInteger> MAX_BIGINTS(int n)

MIN_LONGS

public static final <K> CombineFn<K,Long> MIN_LONGS()

MIN_LONGS

public static final <K> CombineFn<K,Long> MIN_LONGS(int n)

MIN_INTS

public static final <K> CombineFn<K,Integer> MIN_INTS()

MIN_INTS

public static final <K> CombineFn<K,Integer> MIN_INTS(int n)

MIN_FLOATS

public static final <K> CombineFn<K,Float> MIN_FLOATS()

MIN_FLOATS

public static final <K> CombineFn<K,Float> MIN_FLOATS(int n)

MIN_DOUBLES

public static final <K> CombineFn<K,Double> MIN_DOUBLES()

MIN_DOUBLES

public static final <K> CombineFn<K,Double> MIN_DOUBLES(int n)

MIN_BIGINTS

public static final <K> CombineFn<K,BigInteger> MIN_BIGINTS()

MIN_BIGINTS

public static final <K> CombineFn<K,BigInteger> MIN_BIGINTS(int n)

FIRST_N

public static final <K,V> CombineFn<K,V> FIRST_N(int n)

LAST_N

public static final <K,V> CombineFn<K,V> LAST_N(int n)

STRING_CONCAT

public static final <K> CombineFn<K,String> STRING_CONCAT(String separator,
                                                          boolean skipNull)
Concatenates strings, with a separator between each pair of adjacent strings. There is no limit on the length of the concatenated string.

Parameters:
separator - the separator which will be appended between each string
skipNull - whether null values should be skipped. If set to false, a NullPointerException is thrown when a null value is encountered.
Returns:
a CombineFn that concatenates the String values for each key
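The documented handling of the separator and of null values can be modeled in plain Java. The sketch below is not the Crunch implementation: `StringConcatSketch.concat` is a hypothetical helper that follows only the description above (separator between strings, nulls either skipped or rejected with a NullPointerException).

```java
// Hypothetical standalone model of the documented behavior; not the Crunch implementation.
final class StringConcatSketch {
    static String concat(Iterable<String> values, String separator, boolean skipNull) {
        StringBuilder sb = new StringBuilder();
        boolean first = true;
        for (String v : values) {
            if (v == null) {
                if (skipNull) {
                    continue; // silently ignore null values
                }
                throw new NullPointerException("null value with skipNull=false");
            }
            // The separator goes between strings, never before the first one.
            if (!first) {
                sb.append(separator);
            }
            sb.append(v);
            first = false;
        }
        return sb.toString();
    }
}
```

With the values "a", null, "b" and the separator ",", this model returns "a,b" when skipNull is true and throws when skipNull is false.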

STRING_CONCAT

public static final <K> CombineFn<K,String> STRING_CONCAT(String separator,
                                                          boolean skipNull,
                                                          long maxOutputLength,
                                                          long maxInputLength)
Concatenates strings, with a separator between each pair of adjacent strings. The maximum length of the output string and of each input string can be specified. A limit applies only if its value is > 0; a value <= 0 means there is no limit. Any string that is too long, or that would make the output too long, is silently discarded.

Parameters:
separator - the separator which will be appended between each string
skipNull - whether null values should be skipped. If set to false, a NullPointerException is thrown when a null value is encountered.
maxOutputLength - the maximum length of the output string. If set to a value <= 0, there is no limit. Otherwise the number of characters in the output string will be < maxOutputLength.
maxInputLength - the maximum length of each input string. If set to a value <= 0, there is no limit. Otherwise an input string must have a number of characters < maxInputLength to be concatenated.
Returns:
a CombineFn that concatenates the String values for each key, subject to the given length limits
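The length limits can be modeled the same way. This sketch is a standalone reading of the description above, not the Crunch implementation, and `BoundedConcatSketch.concat` is a hypothetical helper. Two points are assumptions: the bounds are treated as strict (a length must be < the limit, as the parameter descriptions say), and the separator is counted toward the output length. The real boundary behavior may differ.

```java
// Hypothetical standalone model of the documented limits; not the Crunch implementation.
final class BoundedConcatSketch {
    static String concat(Iterable<String> values, String separator, boolean skipNull,
                         long maxOutputLength, long maxInputLength) {
        StringBuilder sb = new StringBuilder();
        boolean first = true;
        for (String v : values) {
            if (v == null) {
                if (skipNull) {
                    continue;
                }
                throw new NullPointerException("null value with skipNull=false");
            }
            // Assumption: an input is kept only if it is strictly shorter than the input limit.
            if (maxInputLength > 0 && v.length() >= maxInputLength) {
                continue;
            }
            // Assumption: the separator counts toward the output length.
            int added = v.length() + (first ? 0 : separator.length());
            // Skip any input that would bring the output to or past the output limit.
            if (maxOutputLength > 0 && sb.length() + added >= maxOutputLength) {
                continue;
            }
            if (!first) {
                sb.append(separator);
            }
            sb.append(v);
            first = false;
        }
        return sb.toString();
    }
}
```

Note that a string skipped for exceeding the output limit does not end the loop in this model, so a later, shorter string can still be appended. Whether the real function behaves that way is not stated on this page.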


Copyright © 2012 The Apache Software Foundation. All Rights Reserved.