public class ExtractKeyFn<K,V> extends MapFn<V,Pair<K,V>>
Wrapper function for converting a MapFn<V, K> into a key-value pair extractor that is used to convert from a PCollection<V> to a PTable<K, V>.

Constructor and Description |
---|
ExtractKeyFn(MapFn<V,K> mapFn) |
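
The wrapping described above can be illustrated with a minimal, self-contained sketch. It does not use the Crunch classes: `SimpleMapFn`, `SimplePair`, and `SimpleExtractKeyFn` are hypothetical stand-ins for `MapFn`, `Pair`, and `ExtractKeyFn`, and the assumption that the extracted key is paired with the unchanged input value is inferred from the class description, not taken from the Crunch source.

```java
import java.util.Arrays;
import java.util.List;

// Demo entry point. Every type in this file is a simplified stand-in, not the Crunch API.
class ExtractKeySketch {
    public static void main(String[] args) {
        // Hypothetical key function: use each word's length as its key.
        SimpleMapFn<String, Integer> lengthOf = new SimpleMapFn<String, Integer>() {
            @Override
            public Integer map(String input) {
                return input.length();
            }
        };

        // Wrap the V -> K function so that it yields (K, V) pairs instead of bare keys.
        SimpleExtractKeyFn<Integer, String> byLength =
                new SimpleExtractKeyFn<Integer, String>(lengthOf);

        List<String> words = Arrays.asList("crunch", "pig", "hive");
        for (String word : words) {
            SimplePair<Integer, String> kv = byLength.map(word);
            System.out.println(kv.first + " -> " + kv.second);
        }
    }
}

// Stand-in for MapFn<S, T>: maps one input to exactly one output.
abstract class SimpleMapFn<S, T> {
    public abstract T map(S input);
}

// Stand-in for Pair<K, V>: an immutable two-element tuple.
final class SimplePair<K, V> {
    final K first;
    final V second;

    SimplePair(K first, V second) {
        this.first = first;
        this.second = second;
    }
}

// Stand-in for ExtractKeyFn<K, V>: wraps a V -> K function and
// returns the extracted key together with the original value (assumed behavior).
final class SimpleExtractKeyFn<K, V> extends SimpleMapFn<V, SimplePair<K, V>> {
    private final SimpleMapFn<V, K> mapFn;

    SimpleExtractKeyFn(SimpleMapFn<V, K> mapFn) {
        this.mapFn = mapFn;
    }

    @Override
    public SimplePair<K, V> map(V input) {
        return new SimplePair<K, V>(mapFn.map(input), input);
    }
}
```

In this sketch, a collection of values becomes a keyed table by applying the wrapper to each element, which mirrors the PCollection<V> to PTable<K, V> conversion at a very small scale.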
Modifier and Type | Method and Description |
---|---|
void | configure(org.apache.hadoop.conf.Configuration conf): Configure this DoFn. |
void | initialize(): Initialize this DoFn. |
Pair<K,V> | map(V input): Maps the given input into an instance of the output type. |
float | scaleFactor(): Returns an estimate of how applying this function to a PCollection will cause it to change in size. |
void | setConfiguration(org.apache.hadoop.conf.Configuration conf): Called during the setup of an initialized PType that relies on this instance. |
void | setContext(org.apache.hadoop.mapreduce.TaskInputOutputContext<?,?,?,?> context): Called during setup to pass the TaskInputOutputContext to this DoFn instance. |
Inherited methods: cleanup, disableDeepCopy
public void setConfiguration(org.apache.hadoop.conf.Configuration conf)
Description copied from class: DoFn
Called during the setup of an initialized PType that relies on this instance.

public void setContext(org.apache.hadoop.mapreduce.TaskInputOutputContext<?,?,?,?> context)
Description copied from class: DoFn
Called during setup to pass the TaskInputOutputContext to this DoFn instance. The specified TaskInputOutputContext must not be null.

public void configure(org.apache.hadoop.conf.Configuration conf)
Description copied from class: DoFn
Configure this DoFn. Called during the job planning phase by the crunch-client.

public void initialize()
Description copied from class: DoFn
Initialize this DoFn. This initialization happens before DoFn.process(Object, Emitter) is triggered. Subclasses may override this method to do appropriate initialization.
Called during the setup of the job instance this DoFn is associated with.

public float scaleFactor()
Description copied from class: DoFn
Returns an estimate of how applying this function to a PCollection will cause it to change in size. The optimizer uses these estimates to decide where to break up dependent MR jobs into separate Map and Reduce phases in order to minimize I/O.
Subclasses of DoFn that will substantially alter the size of the resulting PCollection should override this method.
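
As a rough illustration of the override advice above, the following self-contained sketch shows a function type that reports a size estimate and a subclass that overrides it. `SketchFn` is a hypothetical stand-in, not the Crunch DoFn class. The default of 1.0f and the `estimateOutputBytes` helper are assumptions made for the example; how the Crunch optimizer actually consumes these estimates is not described here.

```java
// Demo entry point. SketchFn is a simplified stand-in, not the Crunch DoFn class.
class ScaleFactorSketch {
    public static void main(String[] args) {
        // A function that leaves the collection size roughly unchanged keeps the default.
        SketchFn passThrough = new SketchFn() { };

        // A hypothetical filter expected to keep about a quarter of its input
        // overrides the estimate so that a planner could account for the shrinkage.
        SketchFn selectiveFilter = new SketchFn() {
            @Override
            public float scaleFactor() {
                return 0.25f;
            }
        };

        long inputBytes = 1000000L;
        System.out.println(estimateOutputBytes(inputBytes, passThrough));
        System.out.println(estimateOutputBytes(inputBytes, selectiveFilter));
    }

    // Hypothetical planner helper: scale the input size by the function's estimate.
    static long estimateOutputBytes(long inputBytes, SketchFn fn) {
        return (long) (inputBytes * fn.scaleFactor());
    }
}

// Stand-in base class with an assumed default estimate of "no size change".
abstract class SketchFn {
    public float scaleFactor() {
        return 1.0f;
    }
}
```

A function that expands its input, such as one emitting several records per input record, would return a value greater than 1 in the same way.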
Copyright © 2016 The Apache Software Foundation. All rights reserved.