This project has retired. For details please refer to its Attic page.
Aggregate (Apache Crunch 0.8.0 API)

org.apache.crunch.lib
Class Aggregate

java.lang.Object
  extended by org.apache.crunch.lib.Aggregate

public class Aggregate
extends Object

Methods for performing various types of aggregations over PCollection instances.


Nested Class Summary
static class Aggregate.PairValueComparator<K,V>
static class Aggregate.TopKCombineFn<K,V>
static class Aggregate.TopKFn<K,V>

Constructor Summary
Aggregate()

Method Summary
static <K,V> PTable<K,Collection<V>> collectValues(PTable<K,V> collect)
static <S> PTable<S,Long> count(PCollection<S> collect)
          Returns a PTable that contains the unique elements of this collection mapped to a count of their occurrences.
static <S> PTable<S,Long> count(PCollection<S> collect, int numPartitions)
          Returns a PTable that contains the unique elements of this collection mapped to a count of their occurrences.
static <S> PObject<Long> length(PCollection<S> collect)
          Returns the number of elements in the provided PCollection.
static <S> PObject<S> max(PCollection<S> collect)
          Returns the largest numerical element from the input collection.
static <S> PObject<S> min(PCollection<S> collect)
          Returns the smallest numerical element from the input collection.
static <K,V> PTable<K,V> top(PTable<K,V> ptable, int limit, boolean maximize)

Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

Aggregate

public Aggregate()

Method Detail

count

public static <S> PTable<S,Long> count(PCollection<S> collect)
Returns a PTable that contains the unique elements of this collection mapped to a count of their occurrences.


count

public static <S> PTable<S,Long> count(PCollection<S> collect,
                                       int numPartitions)
Returns a PTable that contains the unique elements of this collection mapped to a count of their occurrences.
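
The result of count can be pictured with a small plain-Java analogue. This is only a sketch of the semantics described above, not Crunch code: it uses java.util collections in place of PCollection and PTable, and the class name CountSketch is hypothetical. The numPartitions argument of the second overload has no counterpart in this in-memory version.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory analogue of Aggregate.count; not part of Crunch.
final class CountSketch {
    // Maps each distinct element to the number of times it occurs.
    static <S> Map<S, Long> count(Iterable<S> elements) {
        Map<S, Long> counts = new LinkedHashMap<>();
        for (S element : elements) {
            // Start at 1 for a new element, otherwise add 1 to the running total.
            counts.merge(element, 1L, Long::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("a", "b", "a", "c", "a");
        System.out.println(count(words)); // prints {a=3, b=1, c=1}
    }
}
```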


length

public static <S> PObject<Long> length(PCollection<S> collect)
Returns the number of elements in the provided PCollection.

Type Parameters:
S - The type of the PCollection.
Parameters:
collect - The PCollection whose elements should be counted.
Returns:
A PObject containing the number of elements in the PCollection.
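
Unlike count, length counts every element, duplicates included, and yields a single value. The sketch below illustrates that with plain Java. It assumes the PObject can be thought of as a deferred holder and models it with java.util.function.Supplier; that modelling choice and the class name LengthSketch are illustrative assumptions, not Crunch's implementation.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Supplier;

// Hypothetical in-memory analogue of Aggregate.length; not part of Crunch.
final class LengthSketch {
    // Returns a holder whose value is the total number of elements.
    // The count is computed only when get() is called on the holder.
    static <S> Supplier<Long> length(Iterable<S> elements) {
        return () -> {
            long n = 0;
            for (S ignored : elements) {
                n++;
            }
            return n;
        };
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("a", "b", "a");
        System.out.println(length(words).get()); // prints 3
    }
}
```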

top

public static <K,V> PTable<K,V> top(PTable<K,V> ptable,
                                    int limit,
                                    boolean maximize)
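
No description is given for top. Judging from the signature and from the nested PairValueComparator class, a reasonable reading is that it keeps at most limit entries, ranked by value, taking the largest values when maximize is true and the smallest otherwise. The plain-Java sketch below illustrates that reading only; the ordering-by-value assumption, the requirement that values be Comparable, and the class name TopSketch are all assumptions rather than documented behaviour.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory analogue of Aggregate.top; not part of Crunch.
final class TopSketch {
    // Keeps at most `limit` pairs, ranked by value (assumed ordering).
    static <K, V extends Comparable<V>> List<Map.Entry<K, V>> top(
            List<Map.Entry<K, V>> pairs, int limit, boolean maximize) {
        Comparator<Map.Entry<K, V>> byValue = Map.Entry.comparingByValue();
        if (maximize) {
            byValue = byValue.reversed(); // largest values first
        }
        List<Map.Entry<K, V>> sorted = new ArrayList<>(pairs);
        sorted.sort(byValue);
        int n = Math.max(0, Math.min(limit, sorted.size()));
        return new ArrayList<>(sorted.subList(0, n));
    }

    public static void main(String[] args) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        pairs.add(new SimpleEntry<>("a", 1));
        pairs.add(new SimpleEntry<>("b", 5));
        pairs.add(new SimpleEntry<>("c", 3));
        System.out.println(top(pairs, 2, true));  // prints [b=5, c=3]
        System.out.println(top(pairs, 1, false)); // prints [a=1]
    }
}
```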

max

public static <S> PObject<S> max(PCollection<S> collect)
Returns the largest numerical element from the input collection.


min

public static <S> PObject<S> min(PCollection<S> collect)
Returns the smallest numerical element from the input collection.
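
As a rough illustration of max and min, the plain-Java sketch below reduces a collection to its largest or smallest element. It is an in-memory analogue, not Crunch's implementation: it accepts any Comparable type rather than only numeric ones, and it returns an empty Optional for an empty input because the documentation above does not say what happens in that case. The class name ExtremaSketch is hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

// Hypothetical in-memory analogue of Aggregate.max and Aggregate.min; not part of Crunch.
final class ExtremaSketch {
    // Returns the largest element, or empty if there are no elements.
    static <S extends Comparable<S>> Optional<S> max(Iterable<S> elements) {
        S best = null;
        for (S e : elements) {
            if (best == null || e.compareTo(best) > 0) {
                best = e;
            }
        }
        return Optional.ofNullable(best);
    }

    // Returns the smallest element, or empty if there are no elements.
    static <S extends Comparable<S>> Optional<S> min(Iterable<S> elements) {
        S best = null;
        for (S e : elements) {
            if (best == null || e.compareTo(best) < 0) {
                best = e;
            }
        }
        return Optional.ofNullable(best);
    }

    public static void main(String[] args) {
        List<Integer> values = Arrays.asList(3, 9, 4);
        System.out.println(max(values).get()); // prints 9
        System.out.println(min(values).get()); // prints 3
    }
}
```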


collectValues

public static <K,V> PTable<K,Collection<V>> collectValues(PTable<K,V> collect)
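
No description is given for collectValues, but the signature suggests that it gathers all values sharing a key into one Collection per key. The plain-Java sketch below shows that grouping under that assumption. It is an in-memory analogue, not Crunch code; the class name CollectValuesSketch and the choice of ArrayList for the per-key collection are illustrative.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.Collection;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory analogue of Aggregate.collectValues; not part of Crunch.
final class CollectValuesSketch {
    // Groups the values of all pairs that share a key into one collection per key.
    static <K, V> Map<K, Collection<V>> collectValues(List<Map.Entry<K, V>> pairs) {
        Map<K, Collection<V>> grouped = new LinkedHashMap<>();
        for (Map.Entry<K, V> pair : pairs) {
            grouped.computeIfAbsent(pair.getKey(), k -> new ArrayList<>())
                   .add(pair.getValue());
        }
        return grouped;
    }

    public static void main(String[] args) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        pairs.add(new SimpleEntry<>("x", 1));
        pairs.add(new SimpleEntry<>("y", 2));
        pairs.add(new SimpleEntry<>("x", 3));
        System.out.println(collectValues(pairs)); // prints {x=[1, 3], y=[2]}
    }
}
```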


Copyright © 2013 The Apache Software Foundation. All Rights Reserved.