|
|||||||||
| PREV CLASS NEXT CLASS | FRAMES NO FRAMES | ||||||||
| SUMMARY: NESTED | FIELD | CONSTR | METHOD | DETAIL: FIELD | CONSTR | METHOD | ||||||||
java.lang.Objectorg.apache.crunch.impl.dist.DistributedPipeline
org.apache.crunch.impl.mr.MRPipeline
public class MRPipeline
Pipeline implementation that is executed within Hadoop MapReduce.
| Constructor Summary | |
|---|---|
MRPipeline(Class<?> jarClass)
Instantiate with a default Configuration and name. |
|
MRPipeline(Class<?> jarClass,
org.apache.hadoop.conf.Configuration conf)
Instantiate with a custom configuration and default naming. |
|
MRPipeline(Class<?> jarClass,
String name)
Instantiate with a custom pipeline name. |
|
MRPipeline(Class<?> jarClass,
String name,
org.apache.hadoop.conf.Configuration conf)
Instantiate with a custom name and configuration. |
|
| Method Summary | ||
|---|---|---|
|
cache(PCollection<T> pcollection,
CachingOptions options)
Caches the given PCollection so that it will be processed at most once during pipeline execution. |
|
|
materialize(PCollection<T> pcollection)
Create the given PCollection and read the data it contains into the returned Collection instance for client use. |
|
org.apache.crunch.impl.mr.exec.MRExecutor |
plan()
|
|
PipelineResult |
run()
Constructs and executes a series of MapReduce jobs in order to write data to the output targets. |
|
MRPipelineExecution |
runAsync()
Constructs and starts a series of MapReduce jobs in order ot write data to the output targets, but returns a ListenableFuture to allow clients to control
job execution. |
|
| Methods inherited from class org.apache.crunch.impl.dist.DistributedPipeline |
|---|
cleanup, createIntermediateOutput, createTempPath, done, emptyPCollection, emptyPTable, enableDebug, getConfiguration, getFactory, getMaterializeSourceTarget, getName, getNextAnonymousStageId, read, read, readTextFile, setConfiguration, write, write, writeTextFile |
| Methods inherited from class java.lang.Object |
|---|
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
| Constructor Detail |
|---|
public MRPipeline(Class<?> jarClass)
jarClass - Class containing the main driver method for running the pipeline
public MRPipeline(Class<?> jarClass,
String name)
jarClass - Class containing the main driver method for running the pipelinename - Display name of the pipeline
public MRPipeline(Class<?> jarClass,
org.apache.hadoop.conf.Configuration conf)
jarClass - Class containing the main driver method for running the pipelineconf - Configuration to be used within all MapReduce jobs run in the pipeline
public MRPipeline(Class<?> jarClass,
String name,
org.apache.hadoop.conf.Configuration conf)
jarClass - Class containing the main driver method for running the pipelinename - Display name of the pipelineconf - Configuration to be used within all MapReduce jobs run in the pipeline| Method Detail |
|---|
public org.apache.crunch.impl.mr.exec.MRExecutor plan()
public PipelineResult run()
Pipeline
public MRPipelineExecution runAsync()
PipelineListenableFuture to allow clients to control
job execution.
public <T> Iterable<T> materialize(PCollection<T> pcollection)
Pipeline
pcollection - The PCollection to materialize
public <T> void cache(PCollection<T> pcollection,
CachingOptions options)
Pipeline
pcollection - The PCollection to cacheoptions - The options for how the cached data is stored
|
|||||||||
| PREV CLASS NEXT CLASS | FRAMES NO FRAMES | ||||||||
| SUMMARY: NESTED | FIELD | CONSTR | METHOD | DETAIL: FIELD | CONSTR | METHOD | ||||||||