This project has retired. For details please refer to its Attic page.
SecondarySort (Apache Crunch 0.4.0-incubating API)

org.apache.crunch.lib
Class SecondarySort

java.lang.Object
  extended by org.apache.crunch.lib.SecondarySort

public class SecondarySort
extends Object

Utilities for performing a secondary sort on a PTable<K, Pair<V1, V2>> collection.

Secondary sorts are usually performed during sessionization: given a collection of events, we want to group them by a key (such as a user ID), then sort the grouped records by an auxiliary key (such as a timestamp), and then perform some additional processing on the sorted records.
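
The sessionization pattern described above can be illustrated with a small in-memory sketch. It uses only the Java standard library and is not Crunch's distributed implementation; the SecondarySortSketch class, the Event type, and the sessionize method are hypothetical names introduced for illustration. The sketch assumes the user ID plays the role of K, the timestamp the role of V1 (the auxiliary sort key), and the action the role of V2.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical, in-memory illustration of secondary-sort semantics.
// It does not use or reproduce the Apache Crunch API.
final class SecondarySortSketch {

    /** One event: grouping key (userId), auxiliary sort key (timestamp), payload (action). */
    static final class Event {
        final String userId;
        final long timestamp;
        final String action;

        Event(String userId, long timestamp, String action) {
            this.userId = userId;
            this.timestamp = timestamp;
            this.action = action;
        }
    }

    /**
     * Groups events by userId, sorts each group by timestamp, then applies
     * the "additional processing" step: joining the actions in time order.
     */
    static Map<String, String> sessionize(List<Event> events) {
        // Step 1: group by the primary key.
        Map<String, List<Event>> groups = new TreeMap<>();
        for (Event e : events) {
            groups.computeIfAbsent(e.userId, k -> new ArrayList<>()).add(e);
        }

        Map<String, String> sessions = new TreeMap<>();
        for (Map.Entry<String, List<Event>> entry : groups.entrySet()) {
            // Step 2: sort the grouped records by the auxiliary key.
            List<Event> group = entry.getValue();
            group.sort(Comparator.comparingLong((Event ev) -> ev.timestamp));

            // Step 3: process the sorted records for this key.
            StringBuilder session = new StringBuilder();
            for (Event ev : group) {
                if (session.length() > 0) {
                    session.append(" > ");
                }
                session.append(ev.action);
            }
            sessions.put(entry.getKey(), session.toString());
        }
        return sessions;
    }

    public static void main(String[] args) {
        List<Event> events = Arrays.asList(
                new Event("alice", 30L, "logout"),
                new Event("bob", 5L, "view"),
                new Event("alice", 10L, "login"),
                new Event("alice", 20L, "search"));
        // prints {alice=login > search > logout, bob=view}
        System.out.println(sessionize(events));
    }
}
```

In a real pipeline the grouping and sorting would be carried out by the framework during the shuffle rather than in memory, and only the third step would be supplied by the caller as a DoFn.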


Constructor Summary
SecondarySort()
           
 
Method Summary
static <K,V1,V2,U,V> PTable<U,V> sortAndApply(PTable<K,Pair<V1,V2>> input, DoFn<Pair<K,Iterable<Pair<V1,V2>>>,Pair<U,V>> doFn, PTableType<U,V> ptype)
          Perform a secondary sort on the given PTable instance and then apply a DoFn to the resulting sorted data to yield an output PTable<U, V>.
static <K,V1,V2,T> PCollection<T> sortAndApply(PTable<K,Pair<V1,V2>> input, DoFn<Pair<K,Iterable<Pair<V1,V2>>>,T> doFn, PType<T> ptype)
          Perform a secondary sort on the given PTable instance and then apply a DoFn to the resulting sorted data to yield an output PCollection<T>.
 
Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

SecondarySort

public SecondarySort()

Method Detail

sortAndApply

public static <K,V1,V2,T> PCollection<T> sortAndApply(PTable<K,Pair<V1,V2>> input,
                                                      DoFn<Pair<K,Iterable<Pair<V1,V2>>>,T> doFn,
                                                      PType<T> ptype)
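Perform a secondary sort on the given PTable instance and then apply a DoFn to the resulting sorted data to yield an output PCollection<T>. For each key K, the DoFn receives the grouped Pair<V1, V2> values as an Iterable ordered by their first element, V1.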


sortAndApply

public static <K,V1,V2,U,V> PTable<U,V> sortAndApply(PTable<K,Pair<V1,V2>> input,
                                                     DoFn<Pair<K,Iterable<Pair<V1,V2>>>,Pair<U,V>> doFn,
                                                     PTableType<U,V> ptype)
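Perform a secondary sort on the given PTable instance and then apply a DoFn to the resulting sorted data to yield an output PTable<U, V>. For each key K, the DoFn receives the grouped Pair<V1, V2> values as an Iterable ordered by their first element, V1.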



Copyright © 2012 The Apache Software Foundation. All Rights Reserved.