public class CompositeMapFn<R,S,T> extends MapFn<R,T>
Constructor and Description |
---|
CompositeMapFn(MapFn<R,S> first, MapFn<S,T> second) |
Modifier and Type | Method and Description |
---|---|
void | cleanup(Emitter<T> emitter): Called during the cleanup of the MapReduce job this DoFn is associated with. |
void | configure(org.apache.hadoop.conf.Configuration conf): Configure this DoFn. |
MapFn<R,S> | getFirst() |
MapFn<S,T> | getSecond() |
void | initialize(): Initialize this DoFn. |
T | map(R input): Maps the given input into an instance of the output type. |
float | scaleFactor(): Returns an estimate of how applying this function to a PCollection will cause it to change in size. |
void | setConfiguration(org.apache.hadoop.conf.Configuration conf): Called during the setup of an initialized PType that relies on this instance. |
void | setContext(org.apache.hadoop.mapreduce.TaskInputOutputContext<?,?,?,?> context): Called during setup to pass the TaskInputOutputContext to this DoFn instance. |
disableDeepCopy

public void setConfiguration(org.apache.hadoop.conf.Configuration conf)
Description copied from class: DoFn
Called during the setup of an initialized PType that relies on this instance.
Overrides: setConfiguration in class DoFn<R,T>
Parameters: conf - The non-null configuration for the PType being initialized

public void setContext(org.apache.hadoop.mapreduce.TaskInputOutputContext<?,?,?,?> context)
Description copied from class: DoFn
Called during setup to pass the TaskInputOutputContext to this DoFn instance. The specified TaskInputOutputContext must not be null.
Overrides: setContext in class DoFn<R,T>

public void initialize()
Description copied from class: DoFn
Initialize this DoFn. This initialization will happen before the actual DoFn.process(Object, Emitter) is triggered. Subclasses may override this method to do appropriate initialization.
Called during the setup of the job instance this DoFn is associated with.
Overrides: initialize in class DoFn<R,T>

public T map(R input)
Description copied from class: MapFn
Maps the given input into an instance of the output type.
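The type parameters suggest that `map` feeds the output of `first` (type `S`) into `second` to produce `T`. The sketch below illustrates that reading with hypothetical stand-in classes (`SketchMapFn`, `SketchCompositeMapFn`, `CompositeSketch`); none of them are part of Crunch, and the page does not show the actual implementation.

```java
// Hypothetical stand-in for a mapping function; NOT the Crunch MapFn class.
abstract class SketchMapFn<S, T> {
  abstract T map(S input);
}

// Assumed composition order: apply `first`, then pass its result to `second`.
class SketchCompositeMapFn<R, S, T> extends SketchMapFn<R, T> {
  private final SketchMapFn<R, S> first;
  private final SketchMapFn<S, T> second;

  SketchCompositeMapFn(SketchMapFn<R, S> first, SketchMapFn<S, T> second) {
    this.first = first;
    this.second = second;
  }

  SketchMapFn<R, S> getFirst() { return first; }

  SketchMapFn<S, T> getSecond() { return second; }

  @Override
  T map(R input) {
    return second.map(first.map(input));
  }
}

class CompositeSketch {
  // Builds String -> Integer -> Boolean: "is this word longer than five characters?"
  static SketchCompositeMapFn<String, Integer, Boolean> isLongWord() {
    SketchMapFn<String, Integer> length = new SketchMapFn<String, Integer>() {
      @Override Integer map(String s) { return s.length(); }
    };
    SketchMapFn<Integer, Boolean> overFive = new SketchMapFn<Integer, Boolean>() {
      @Override Boolean map(Integer n) { return n > 5; }
    };
    return new SketchCompositeMapFn<>(length, overFive);
  }

  public static void main(String[] args) {
    SketchCompositeMapFn<String, Integer, Boolean> fn = isLongWord();
    System.out.println(fn.map("crunch")); // six characters
    System.out.println(fn.map("map"));    // three characters
  }
}
```

With the real library, the equivalent call would presumably be `new CompositeMapFn<>(first, second)` with two `MapFn` instances whose intermediate type `S` matches.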
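
public void cleanup(Emitter<T> emitter)
Description copied from class: DoFn
Called during the cleanup of the MapReduce job this DoFn is associated with. Subclasses may override this method to do appropriate cleanup.

public void configure(org.apache.hadoop.conf.Configuration conf)
Description copied from class: DoFn
Configure this DoFn. Called during the job planning phase by the crunch-client.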

public float scaleFactor()
Description copied from class: DoFn
Returns an estimate of how applying this function to a PCollection will cause it to change in size. The optimizer uses these estimates to decide where to break up dependent MR jobs into separate Map and Reduce phases in order to minimize I/O.
Subclasses of DoFn that will substantially alter the size of the resulting PCollection should override this method.
Overrides: scaleFactor in class MapFn<R,T>
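This page does not say how a composite function derives its own estimate. If each estimate is read as a ratio of output size to input size, one plausible approach is to multiply the two stages' factors. The sketch below shows only that arithmetic; `SizeEstimating` and `ScaleFactorSketch` are hypothetical names, not Crunch types, and the multiplication rule is an assumption.

```java
// Hypothetical stand-in exposing only a size estimate; NOT the Crunch DoFn class.
interface SizeEstimating {
  float scaleFactor();
}

class ScaleFactorSketch {
  // Assumption: each factor is an output/input size ratio, so chained stages multiply.
  static float combined(SizeEstimating first, SizeEstimating second) {
    return first.scaleFactor() * second.scaleFactor();
  }

  public static void main(String[] args) {
    SizeEstimating halves = () -> 0.5f;      // e.g. a stage that shrinks records
    SizeEstimating quadruples = () -> 4.0f;  // e.g. a stage that expands records
    System.out.println(combined(halves, quadruples)); // prints 2.0
  }
}
```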
Copyright © 2016 The Apache Software Foundation. All rights reserved.