This project has retired. For details please refer to its Attic page.
SecondarySort (Apache Crunch 0.6.0 API)

org.apache.crunch.lib
Class SecondarySort

java.lang.Object
  extended by org.apache.crunch.lib.SecondarySort

public class SecondarySort
extends Object

Utilities for performing a secondary sort on a PTable<K, Pair<V1, V2>> collection.

Secondary sorts are commonly used for sessionization: given a collection of events, we group them by a key (such as a user ID), sort each group's records by an auxiliary key (such as a timestamp), and then perform additional processing on the sorted records.
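The group-then-sort pattern described above can be illustrated with a small in-memory sketch. It uses only plain Java collections, not the Crunch API, and it does not reflect how Crunch distributes the work across a cluster. The class SecondarySortSketch, the Event type, and the sessionize method are hypothetical names introduced for this illustration. Here userId plays the role of K, timestamp the role of V1 (assumed to be the element the values are ordered by), and action the role of V2.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class SecondarySortSketch {

    /** Hypothetical event: userId is the grouping key, timestamp the auxiliary sort key. */
    public static final class Event {
        final String userId;
        final long timestamp;
        final String action;

        public Event(String userId, long timestamp, String action) {
            this.userId = userId;
            this.timestamp = timestamp;
            this.action = action;
        }
    }

    /** Groups events by user, then lists each user's actions in timestamp order. */
    public static Map<String, List<String>> sessionize(List<Event> events) {
        // Step 1: group by the primary key (TreeMap gives a deterministic key order).
        Map<String, List<Event>> byUser = new TreeMap<>();
        for (Event e : events) {
            byUser.computeIfAbsent(e.userId, k -> new ArrayList<>()).add(e);
        }

        // Step 2: sort each group by the auxiliary key, then process the sorted
        // records (here the "processing" just collects the actions in order).
        Map<String, List<String>> sessions = new TreeMap<>();
        for (Map.Entry<String, List<Event>> entry : byUser.entrySet()) {
            List<Event> group = entry.getValue();
            group.sort(Comparator.comparingLong((Event e) -> e.timestamp));
            List<String> actions = new ArrayList<>();
            for (Event e : group) {
                actions.add(e.action);
            }
            sessions.put(entry.getKey(), actions);
        }
        return sessions;
    }

    public static void main(String[] args) {
        List<Event> events = Arrays.asList(
            new Event("alice", 30L, "logout"),
            new Event("bob", 5L, "login"),
            new Event("alice", 10L, "login"),
            new Event("alice", 20L, "search"));
        // prints {alice=[login, search, logout], bob=[login]}
        System.out.println(sessionize(events));
    }
}
```

In a Crunch pipeline, the per-group loop in step 2 would correspond roughly to the DoFn passed to sortAndApply, which receives each key together with an Iterable of its already sorted values.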


Constructor Summary
SecondarySort()
           
 
Method Summary
static <K,V1,V2,U,V> PTable<U,V>
sortAndApply(PTable<K,Pair<V1,V2>> input, DoFn<Pair<K,Iterable<Pair<V1,V2>>>,Pair<U,V>> doFn, PTableType<U,V> ptype)
          Perform a secondary sort on the given PTable instance and then apply a DoFn to the resulting sorted data to yield an output PTable<U, V>.

static <K,V1,V2,T> PCollection<T>
sortAndApply(PTable<K,Pair<V1,V2>> input, DoFn<Pair<K,Iterable<Pair<V1,V2>>>,T> doFn, PType<T> ptype)
          Perform a secondary sort on the given PTable instance and then apply a DoFn to the resulting sorted data to yield an output PCollection<T>.
 
Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

SecondarySort

public SecondarySort()

Method Detail

sortAndApply

public static <K,V1,V2,T> PCollection<T> sortAndApply(PTable<K,Pair<V1,V2>> input,
                                                      DoFn<Pair<K,Iterable<Pair<V1,V2>>>,T> doFn,
                                                      PType<T> ptype)
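Perform a secondary sort on the given PTable instance and then apply a DoFn to the resulting sorted data to yield an output PCollection<T>. The input is grouped by its key K, and the Pair<V1, V2> values within each group are ordered by their first element (V1) before being passed to the DoFn; ptype describes the element type of the output.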


sortAndApply

public static <K,V1,V2,U,V> PTable<U,V> sortAndApply(PTable<K,Pair<V1,V2>> input,
                                                     DoFn<Pair<K,Iterable<Pair<V1,V2>>>,Pair<U,V>> doFn,
                                                     PTableType<U,V> ptype)
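Perform a secondary sort on the given PTable instance and then apply a DoFn to the resulting sorted data to yield an output PTable<U, V>. The input is grouped by its key K, and the Pair<V1, V2> values within each group are ordered by their first element (V1) before being passed to the DoFn; ptype describes the key and value types of the output table.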



Copyright © 2013 The Apache Software Foundation. All Rights Reserved.