public abstract class BloomFilterFn<S> extends DoFn<S,Pair<String,org.apache.hadoop.util.bloom.BloomFilter>>
Modifier and Type | Field and Description |
---|---|
static String | CRUNCH_FILTER_NAME |
static String | CRUNCH_FILTER_SIZE |
Constructor and Description |
---|
BloomFilterFn() |
Modifier and Type | Method and Description |
---|---|
void | cleanup(Emitter<Pair<String,org.apache.hadoop.util.bloom.BloomFilter>> emitter) Called during the cleanup of the MapReduce job this DoFn is associated with. |
abstract Collection<org.apache.hadoop.util.bloom.Key> | generateKeys(S input) |
void | initialize() Initialize this DoFn. |
void | process(S input, Emitter<Pair<String,org.apache.hadoop.util.bloom.BloomFilter>> emitter) Processes the records from a PCollection. |
Methods inherited from class DoFn: configure, disableDeepCopy, scaleFactor, setConfiguration, setContext
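The method summary above suggests a three-phase lifecycle: `initialize()` sets up state, `process(...)` is called once per record, and `cleanup(...)` receives the emitter at the end of the job. The sketch below models that lifecycle in plain Java, assuming the filter is filled during `process` and emitted once from `cleanup`, which the signatures suggest but this page does not state. Every type here (`SketchBloomFilter`, `SketchBloomFilterFn`, `WordKeysFn`, `LifecycleDemo`) is a hypothetical stand-in, not part of Crunch or Hadoop, and string keys replace `org.apache.hadoop.util.bloom.Key`.

```java
import java.util.Arrays;
import java.util.BitSet;
import java.util.Collection;
import java.util.List;
import java.util.function.BiConsumer;

// Hypothetical stand-in; NOT org.apache.hadoop.util.bloom.BloomFilter.
class SketchBloomFilter {
    private final int size;
    private final BitSet bits;

    SketchBloomFilter(int size) {
        this.size = size;
        this.bits = new BitSet(size);
    }

    // Two cheap bit positions per key; real filters use stronger hashing.
    private int[] positions(String key) {
        int h = key.hashCode();
        return new int[] { Math.floorMod(h, size), Math.floorMod(h * 31 + 17, size) };
    }

    void add(String key) {
        for (int p : positions(key)) {
            bits.set(p);
        }
    }

    // True for every added key; may also be true for keys never added.
    boolean mightContain(String key) {
        for (int p : positions(key)) {
            if (!bits.get(p)) {
                return false;
            }
        }
        return true;
    }
}

// Hypothetical model of the initialize/process/cleanup lifecycle.
abstract class SketchBloomFilterFn<S> {
    private final String filterName;
    private final int filterSize;
    private SketchBloomFilter filter;

    SketchBloomFilterFn(String filterName, int filterSize) {
        this.filterName = filterName;
        this.filterSize = filterSize;
    }

    // Assumed: allocate the filter before any record is processed.
    void initialize() {
        filter = new SketchBloomFilter(filterSize);
    }

    // Assumed: each record only contributes keys; nothing is emitted here.
    void process(S input) {
        for (String key : generateKeys(input)) {
            filter.add(key);
        }
    }

    // Assumed: the single (name, filter) pair is emitted at the end.
    void cleanup(BiConsumer<String, SketchBloomFilter> emitter) {
        emitter.accept(filterName, filter);
    }

    abstract Collection<String> generateKeys(S input);
}

// Example subclass: one key per whitespace-separated word.
class WordKeysFn extends SketchBloomFilterFn<String> {
    WordKeysFn() {
        super("words", 1024);
    }

    @Override
    Collection<String> generateKeys(String line) {
        return Arrays.asList(line.split("\\s+"));
    }
}

class LifecycleDemo {
    static SketchBloomFilter run(List<String> records) {
        WordKeysFn fn = new WordKeysFn();
        SketchBloomFilter[] emitted = new SketchBloomFilter[1];
        fn.initialize();
        for (String record : records) {
            fn.process(record);
        }
        fn.cleanup((name, f) -> emitted[0] = f);
        return emitted[0];
    }

    public static void main(String[] args) {
        SketchBloomFilter f = run(Arrays.asList("alpha beta", "gamma"));
        System.out.println(f.mightContain("beta")); // prints true
    }
}
```

In this model a subclass only supplies `generateKeys`, mirroring the one abstract method listed in the summary. A membership test can return a false positive for a key that was never added, but never a false negative for one that was.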
public static final String CRUNCH_FILTER_SIZE
public static final String CRUNCH_FILTER_NAME
public void initialize()
Description copied from class: DoFn
Initialize this DoFn. This initialization happens before DoFn.process(Object, Emitter) is triggered. Subclasses may override this method to do appropriate initialization. Called during the setup of the job instance this DoFn is associated with.
Overrides: initialize in class DoFn<S,Pair<String,org.apache.hadoop.util.bloom.BloomFilter>>
public void process(S input, Emitter<Pair<String,org.apache.hadoop.util.bloom.BloomFilter>> emitter)
Description copied from class: DoFn
Processes the records from a PCollection.
Note: Crunch can reuse a single input record object whose content changes on each DoFn.process(Object, Emitter) method call. This functionality is imposed by Hadoop's Reducer implementation: the framework reuses the key and value objects that are passed into the reduce, so the application should clone any objects it wants to keep a copy of.
public abstract Collection<org.apache.hadoop.util.bloom.Key> generateKeys(S input)
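The object-reuse note in the `process` description is easy to overlook, so here is a minimal illustration in plain Java. `ReusedRecord` and `ReuseDemo` are hypothetical names that simulate a framework refilling one mutable instance on every call; they are not Crunch or Hadoop classes.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical mutable record, standing in for a reused key/value object.
class ReusedRecord {
    String value;

    ReusedRecord copy() {
        ReusedRecord r = new ReusedRecord();
        r.value = value;
        return r;
    }
}

class ReuseDemo {
    // Simulates a framework that overwrites one shared instance per call.
    static List<ReusedRecord> collect(String[] inputs, boolean copy) {
        List<ReusedRecord> kept = new ArrayList<>();
        ReusedRecord shared = new ReusedRecord();
        for (String in : inputs) {
            shared.value = in;                        // same object, new content
            kept.add(copy ? shared.copy() : shared);  // keep a copy or the alias
        }
        return kept;
    }

    public static void main(String[] args) {
        String[] inputs = {"a", "b", "c"};
        System.out.println(collect(inputs, false).get(0).value); // prints c
        System.out.println(collect(inputs, true).get(0).value);  // prints a
    }
}
```

Without the copy, every retained reference points at the same instance and sees only the last record. The same caution would apply to anything a `generateKeys` implementation derives from its input and holds onto across calls.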
Copyright © 2017 The Apache Software Foundation. All rights reserved.