public class SecondarySort extends Object
PTable<K, Pair<V1, V2>>
collection.
Secondary sorts are usually performed during sessionization: given a collection of events, we want to group them by a key (such as a user ID), then sort the grouped records by an auxillary key (such as a timestamp), and then perform some additional processing on the sorted records.
Constructor and Description |
---|
SecondarySort() |
Modifier and Type | Method and Description |
---|---|
static <K,V1,V2,U,V> |
sortAndApply(PTable<K,Pair<V1,V2>> input,
DoFn<Pair<K,Iterable<Pair<V1,V2>>>,Pair<U,V>> doFn,
PTableType<U,V> ptype)
Perform a secondary sort on the given
PTable instance and then apply a
DoFn to the resulting sorted data to yield an output PTable<U, V> . |
static <K,V1,V2,T> |
sortAndApply(PTable<K,Pair<V1,V2>> input,
DoFn<Pair<K,Iterable<Pair<V1,V2>>>,T> doFn,
PType<T> ptype)
Perform a secondary sort on the given
PTable instance and then apply a
DoFn to the resulting sorted data to yield an output PCollection<T> . |
public static <K,V1,V2,T> PCollection<T> sortAndApply(PTable<K,Pair<V1,V2>> input, DoFn<Pair<K,Iterable<Pair<V1,V2>>>,T> doFn, PType<T> ptype)
PTable
instance and then apply a
DoFn
to the resulting sorted data to yield an output PCollection<T>
.public static <K,V1,V2,U,V> PTable<U,V> sortAndApply(PTable<K,Pair<V1,V2>> input, DoFn<Pair<K,Iterable<Pair<V1,V2>>>,Pair<U,V>> doFn, PTableType<U,V> ptype)
PTable
instance and then apply a
DoFn
to the resulting sorted data to yield an output PTable<U, V>
.Copyright © 2013 The Apache Software Foundation. All Rights Reserved.