This project has retired. For details, please refer to its Attic page.
JoinFn (Apache Crunch 0.3.0-incubating API)
org.apache.crunch.lib.join
Class JoinFn<K,U,V>
java.lang.Object
org.apache.crunch.DoFn<Pair<Pair<K,Integer>,Iterable<Pair<U,V>>>,Pair<K,Pair<U,V>>>
org.apache.crunch.lib.join.JoinFn<K,U,V>
Type Parameters:
K - Type of the keys.
U - Type of the first PTable's values.
V - Type of the second PTable's values.
All Implemented Interfaces: Serializable
Direct Known Subclasses: FullOuterJoinFn, InnerJoinFn, LeftOuterJoinFn, RightOuterJoinFn
public abstract class JoinFn<K,U,V> extends DoFn<Pair<Pair<K,Integer>,Iterable<Pair<U,V>>>,Pair<K,Pair<U,V>>>
Represents a DoFn for performing joins.
See Also: Serialized Form
Constructor Summary
JoinFn(PType<K> keyType, PType<U> leftValueType)
    Instantiate with the PType of the join key and the PType of the values on the left side of the join (the latter is used for creating deep copies of values).
Method Summary
abstract String getJoinType()
abstract void join(K key, int id, Iterable<Pair<U,V>> pairs, Emitter<Pair<K,Pair<U,V>>> emitter)
    Performs the actual joining.
void process(Pair<Pair<K,Integer>,Iterable<Pair<U,V>>> input, Emitter<Pair<K,Pair<U,V>>> emitter)
    Split up the input record to make coding a bit more manageable.
JoinFn
public JoinFn(PType<K> keyType, PType<U> leftValueType)
Instantiate with the PType of the join key and the PType of the values on the left side of the join (the latter is used for creating deep copies of values).
Parameters:
keyType - The PType of the value used as the key of the join.
leftValueType - The PType of the values on the left side of the join.
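The constructor description mentions deep copies without saying why they are needed. One common reason in reduce-side joins is that the framework may reuse a single object instance while iterating over values, so buffered references end up aliasing the last value. That motivation is an assumption, not something this page states. The sketch below illustrates the aliasing problem in plain Java; `DeepCopySketch`, `Holder`, `reusing`, and `buffer` are hypothetical names, not Crunch classes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical illustration; none of these types are Crunch classes.
class DeepCopySketch {

    /** Mutable holder standing in for a value object that a framework reuses. */
    static final class Holder {
        String value;
        Holder(String value) { this.value = value; }
        Holder copy() { return new Holder(value); }
    }

    /** Iterable whose iterator returns the SAME Holder every time, overwriting its contents. */
    static Iterable<Holder> reusing(final List<String> data) {
        return new Iterable<Holder>() {
            public Iterator<Holder> iterator() {
                final Holder shared = new Holder(null);
                final Iterator<String> it = data.iterator();
                return new Iterator<Holder>() {
                    public boolean hasNext() { return it.hasNext(); }
                    public Holder next() { shared.value = it.next(); return shared; }
                    public void remove() { throw new UnsupportedOperationException(); }
                };
            }
        };
    }

    /** Buffers every value, optionally copying it first, then reads the buffer back. */
    static List<String> buffer(Iterable<Holder> values, boolean deepCopy) {
        List<Holder> kept = new ArrayList<Holder>();
        for (Holder h : values) {
            kept.add(deepCopy ? h.copy() : h);
        }
        List<String> out = new ArrayList<String>();
        for (Holder h : kept) {
            out.add(h.value);
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> data = Arrays.asList("a", "b", "c");
        System.out.println(buffer(reusing(data), false)); // [c, c, c]  (all references alias one object)
        System.out.println(buffer(reusing(data), true));  // [a, b, c]  (copies keep distinct values)
    }
}
```

In this sketch the copy is made by a hand-written `copy()` method; in Crunch that role would presumably fall to the `PType` passed to the constructor.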
getJoinType
public abstract String getJoinType()
Returns: The name of this join type (e.g. innerJoin, leftOuterJoin).
join
public abstract void join(K key, int id, Iterable<Pair<U,V>> pairs, Emitter<Pair<K,Pair<U,V>>> emitter)
Performs the actual joining.
Parameters:
key - The key for this grouping of values.
id - The side that this group of values is from (0 -> left, 1 -> right).
pairs - The group of values associated with this key and id pair.
emitter - The emitter to send the output to.
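To make the `join` contract concrete, here is a minimal sketch of inner-join logic written against the parameter list above. It is not the source of `InnerJoinFn`. It assumes that groups arrive ordered by key and then by id, so that the left side (id 0) of a key is seen before its right side (id 1); that ordering is an assumption not stated on this page. `Pair` and `Emitter` below are small stand-ins defined inside the sketch, not the `org.apache.crunch` classes, and `InnerJoinSketch`, `Joiner`, and `demo` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Stand-in types for illustration only; NOT the org.apache.crunch classes.
class InnerJoinSketch {

    static final class Pair<A, B> {
        final A first;
        final B second;
        Pair(A first, B second) { this.first = first; this.second = second; }
        static <A, B> Pair<A, B> of(A a, B b) { return new Pair<A, B>(a, b); }
        @Override public String toString() { return "[" + first + "," + second + "]"; }
    }

    interface Emitter<T> { void emit(T value); }

    /** Hypothetical inner-join logic following the join(key, id, pairs, emitter) shape. */
    static final class Joiner<K, U, V> {
        private K lastKey;
        private final List<U> leftValues = new ArrayList<U>();

        void join(K key, int id, Iterable<Pair<U, V>> pairs,
                  Emitter<Pair<K, Pair<U, V>>> emitter) {
            if (!key.equals(lastKey)) {      // new key: drop values buffered for the old one
                lastKey = key;
                leftValues.clear();
            }
            if (id == 0) {                   // left side: buffer the values
                for (Pair<U, V> p : pairs) {
                    leftValues.add(p.first);
                }
            } else {                         // right side: emit every left/right combination
                for (Pair<U, V> p : pairs) {
                    for (U u : leftValues) {
                        emitter.emit(Pair.of(key, Pair.of(u, p.second)));
                    }
                }
            }
        }
    }

    /** Feeds three groups through the joiner and returns the emitted records as strings. */
    static List<String> demo() {
        final List<String> out = new ArrayList<String>();
        Emitter<Pair<String, Pair<String, Integer>>> collect =
            new Emitter<Pair<String, Pair<String, Integer>>>() {
                public void emit(Pair<String, Pair<String, Integer>> v) { out.add(v.toString()); }
            };
        Joiner<String, String, Integer> joiner = new Joiner<String, String, Integer>();
        joiner.join("k1", 0, Arrays.asList(Pair.<String, Integer>of("a", null),
                                           Pair.<String, Integer>of("b", null)), collect);
        joiner.join("k1", 1, Arrays.asList(Pair.<String, Integer>of(null, 1)), collect);
        // "k2" has no left-side group, so an inner join emits nothing for it.
        joiner.join("k2", 1, Arrays.asList(Pair.<String, Integer>of(null, 2)), collect);
        return out;
    }

    public static void main(String[] args) {
        System.out.println(demo()); // [[k1,[a,1]], [k1,[b,1]]]
    }
}
```

The sketch buffers left-side values by reference for brevity; an implementation running on a framework that reuses value objects would copy them first, which is the role the constructor's `leftValueType` appears to serve.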
process
public void process(Pair<Pair<K,Integer>,Iterable<Pair<U,V>>> input, Emitter<Pair<K,Pair<U,V>>> emitter)
Split up the input record to make coding a bit more manageable.
Specified by: process in class DoFn<Pair<Pair<K,Integer>,Iterable<Pair<U,V>>>,Pair<K,Pair<U,V>>>
Parameters:
input - The input record.
emitter - The emitter to send the output to.
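The nested input type is easier to read once unpacked. Judging by the signatures of `process` and `join`, the input record carries the key and the side id in its first element and the grouped values in its second, and `process` presumably hands those three pieces to `join`. The sketch below shows that unpacking under this assumption; it uses stand-in `Pair` and `Emitter` types defined in the block, not the Crunch classes, and `ProcessSketch`, `SketchJoinFn`, and `demo` are hypothetical names.

```java
import java.util.Arrays;

// Stand-in types for illustration only; NOT the org.apache.crunch classes.
class ProcessSketch {

    static final class Pair<A, B> {
        final A first;
        final B second;
        Pair(A first, B second) { this.first = first; this.second = second; }
        static <A, B> Pair<A, B> of(A a, B b) { return new Pair<A, B>(a, b); }
    }

    interface Emitter<T> { void emit(T value); }

    /** Hypothetical base class mirroring the process/join split described on this page. */
    static abstract class SketchJoinFn<K, U, V> {

        abstract void join(K key, int id, Iterable<Pair<U, V>> pairs,
                           Emitter<Pair<K, Pair<U, V>>> emitter);

        /** Unpacks ((key, id), values) and delegates to join. */
        void process(Pair<Pair<K, Integer>, Iterable<Pair<U, V>>> input,
                     Emitter<Pair<K, Pair<U, V>>> emitter) {
            K key = input.first.first;       // the join key
            int id = input.first.second;     // 0 -> left, 1 -> right
            join(key, id, input.second, emitter);
        }
    }

    /** Records what join receives when process is given one right-side group of two values. */
    static String demo() {
        final StringBuilder seen = new StringBuilder();
        SketchJoinFn<String, String, Integer> fn = new SketchJoinFn<String, String, Integer>() {
            void join(String key, int id, Iterable<Pair<String, Integer>> pairs,
                      Emitter<Pair<String, Pair<String, Integer>>> emitter) {
                int count = 0;
                for (Pair<String, Integer> ignored : pairs) {
                    count++;
                }
                seen.append(key).append('/').append(id).append('/').append(count);
            }
        };
        Iterable<Pair<String, Integer>> values = Arrays.asList(
            Pair.<String, Integer>of(null, 7), Pair.<String, Integer>of(null, 8));
        fn.process(Pair.of(Pair.of("k", 1), values), null); // the emitter is unused in this demo
        return seen.toString();
    }

    public static void main(String[] args) {
        System.out.println(demo()); // k/1/2
    }
}
```

Keeping the unpacking in one place means each subclass only has to implement `join` with flat arguments.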
Copyright © 2012 The Apache Software Foundation. All Rights Reserved.