java.lang.Object org.apache.crunch.lib.join.MapsideJoinStrategy<K,U,V>
public class MapsideJoinStrategy<K,U,V>
Utility for doing map side joins on a common key between two PTables.
A map side join is an optimized join which doesn't use a reducer; instead, one side of the join is loaded into memory and the join is performed in a mapper. An important implication of this style of join is that its output is not sorted, unlike the output of a conventional (reducer-based) join.
Instances of this class should be created via the create() or create(boolean) factory methods, or optionally via the deprecated public constructors for backwards compatibility with older versions of Crunch where the right-side table was loaded into memory. The public constructors will be removed in a future release.
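The idea behind a map side join can be sketched in plain Java, without any Crunch classes. This is an illustration under stated assumptions, not Crunch's implementation: the class `MapsideJoinSketch` and its `innerJoin` method are hypothetical names, `Map.Entry` stands in for table rows and for `Pair`, and only an inner join is shown. The left side is indexed in memory and the right side is streamed past it, so the result follows the order of the right-side input rather than key order.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration of an in-memory (map side) inner join.
// Not part of Crunch; plain collections stand in for PTable and Pair.
class MapsideJoinSketch {

    static <K, U, V> List<Map.Entry<K, Map.Entry<U, V>>> innerJoin(
            List<Map.Entry<K, U>> left, List<Map.Entry<K, V>> right) {
        // Load the (smaller) left side into memory, grouped by key.
        Map<K, List<U>> index = new HashMap<>();
        for (Map.Entry<K, U> row : left) {
            index.computeIfAbsent(row.getKey(), k -> new ArrayList<>()).add(row.getValue());
        }

        // Stream the right side and emit one row per matching left value.
        // No sort or shuffle happens, so output order follows the right side.
        List<Map.Entry<K, Map.Entry<U, V>>> joined = new ArrayList<>();
        for (Map.Entry<K, V> row : right) {
            List<U> matches = index.get(row.getKey());
            if (matches == null) {
                continue; // inner join: drop right rows with no left match
            }
            for (U leftValue : matches) {
                joined.add(Map.entry(row.getKey(), Map.entry(leftValue, row.getValue())));
            }
        }
        return joined;
    }

    public static void main(String[] args) {
        List<Map.Entry<String, Integer>> left = List.of(Map.entry("b", 2), Map.entry("a", 1));
        List<Map.Entry<String, String>> right = List.of(
                Map.entry("c", "z"), Map.entry("b", "y"), Map.entry("a", "x"), Map.entry("b", "w"));

        // Keys come out in right-side order (b, a, b), not sorted by key.
        for (Map.Entry<String, Map.Entry<Integer, String>> row : innerJoin(left, right)) {
            System.out.println(row);
        }
    }
}
```

The sketch indexes the left side because that matches the behaviour documented for the create() factory methods, which is also why the smaller table should be passed as the left side.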
Constructor Summary | |
---|---|
MapsideJoinStrategy() | Deprecated. Use the create() factory method instead. |
MapsideJoinStrategy(boolean materialize) | Deprecated. Use the create(boolean) factory method instead. |
Method Summary | ||
---|---|---|
static <K,U,V> MapsideJoinStrategy<K,U,V> | create() | Create a new MapsideJoinStrategy instance that will load its left-side table into memory, and will materialize the contents of the left-side table to disk before running the in-memory join. |
static <K,U,V> MapsideJoinStrategy<K,U,V> | create(boolean materialize) | Create a new MapsideJoinStrategy instance that will load its left-side table into memory. |
PTable<K,Pair<U,V>> | join(PTable<K,U> left, PTable<K,V> right, JoinType joinType) | Join two tables with the given join type. |
Methods inherited from class java.lang.Object |
---|
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
Constructor Detail |
---|
@Deprecated public MapsideJoinStrategy()
Deprecated. Use the create() factory method instead.
Constructs a new MapsideJoinStrategy, materializing the right-side join table to disk before the join is performed.
@Deprecated public MapsideJoinStrategy(boolean materialize)
Deprecated. Use the create(boolean) factory method instead.
Constructs a new MapsideJoinStrategy. If the materialize argument is true, then the right-side join PTable will be materialized to disk before the in-memory join is performed. If it is false, then Crunch can optionally read and process the data from the right-side table without having to run a job to materialize the data to disk first.
materialize - Whether or not to materialize the right-side table before the join
Method Detail |
---|
public static <K,U,V> MapsideJoinStrategy<K,U,V> create()
Create a new MapsideJoinStrategy instance that will load its left-side table into memory, and will materialize the contents of the left-side table to disk before running the in-memory join.
The smaller of the two tables to be joined should be provided as the left-side table of the created join strategy instance.
public static <K,U,V> MapsideJoinStrategy<K,U,V> create(boolean materialize)
Create a new MapsideJoinStrategy instance that will load its left-side table into memory.
If the materialize parameter is true, then the left-side PTable will be materialized to disk before the in-memory join is performed. If it is false, then Crunch can optionally read and process the data from the left-side table without having to run a job to materialize the data to disk first.
materialize - Whether or not to materialize the left-side table before the join
public PTable<K,Pair<U,V>> join(PTable<K,U> left, PTable<K,V> right, JoinType joinType)
Join two tables with the given join type. (Description copied from interface JoinStrategy.)
Specified by: join in interface JoinStrategy<K,U,V>
left - left table to be joined
right - right table to be joined
joinType - type of join to perform