public interface Pipeline
Manages the state of a pipeline execution.
| Method Summary | | |
|---|---|---|
| PipelineResult | done() | Run any remaining jobs required to generate outputs and then clean up any intermediate data files that were created in this run or previous calls to run. |
| void | enableDebug() | Turn on debug logging for jobs that are run from this pipeline. |
| org.apache.hadoop.conf.Configuration | getConfiguration() | Returns the Configuration instance associated with this pipeline. |
| String | getName() | Returns the name of this pipeline. |
| <T> Iterable<T> | materialize(PCollection<T> pcollection) | Create the given PCollection and read the data it contains into the returned Collection instance for client use. |
| <T> PCollection<T> | read(Source<T> source) | Converts the given Source into a PCollection that is available to jobs run using this Pipeline instance. |
| <K,V> PTable<K,V> | read(TableSource<K,V> tableSource) | A version of the read method for TableSource instances that map to PTables. |
| PCollection<String> | readTextFile(String pathName) | A convenience method for reading a text file. |
| PipelineResult | run() | Constructs and executes a series of MapReduce jobs in order to write data to the output targets. |
| PipelineExecution | runAsync() | Constructs and starts a series of MapReduce jobs in order to write data to the output targets, but returns a ListenableFuture to allow clients to control job execution. |
| void | setConfiguration(org.apache.hadoop.conf.Configuration conf) | Set the Configuration to use with this pipeline. |
| void | write(PCollection<?> collection, Target target) | Write the given collection to the given target on the next pipeline run. |
| void | write(PCollection<?> collection, Target target, Target.WriteMode writeMode) | Write the contents of the PCollection to the given Target, using the storage format specified by the target and the given WriteMode for cases where the referenced Target already exists. |
| <T> void | writeTextFile(PCollection<T> collection, String pathName) | A convenience method for writing a text file. |
| Method Detail |
|---|
void setConfiguration(org.apache.hadoop.conf.Configuration conf)
Set the Configuration to use with this pipeline.
String getName()
Returns the name of this pipeline.
org.apache.hadoop.conf.Configuration getConfiguration()
Returns the Configuration instance associated with this pipeline.
<T> PCollection<T> read(Source<T> source)
Converts the given Source into a PCollection that is
available to jobs run using this Pipeline instance.
source - The source of data
<K,V> PTable<K,V> read(TableSource<K,V> tableSource)
A version of the read method for TableSource instances that map to
PTables.
tableSource - The source of the data
void write(PCollection<?> collection,
Target target)
Write the given collection to the given target on the next pipeline run, applying the WriteMode.DEFAULT rule for the given Target.
collection - The collection
target - The output target
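The phrase "on the next pipeline run" implies that write only registers an output, and that no data moves until run() or done() is called. The sketch below models that deferred behaviour with a small in-memory stand-in. DeferredWriteSketch, its nested SketchPipeline class, and the map used as "storage" are all hypothetical names introduced for illustration; they are not part of this API, and a real pipeline would plan and launch MapReduce jobs instead.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class DeferredWriteSketch {

    /** Hypothetical stand-in for a pipeline; not the real implementation. */
    static class SketchPipeline {
        // Outputs registered by write() but not yet produced.
        private final Map<String, List<String>> pending = new HashMap<>();
        // Simulated storage, keyed by target name.
        final Map<String, List<String>> storage = new HashMap<>();

        /** Registers an output; nothing is written yet. */
        void write(List<String> collection, String target) {
            pending.put(target, collection);
        }

        /** Produces every registered output and returns how many were written. */
        int run() {
            int written = pending.size();
            for (Map.Entry<String, List<String>> e : pending.entrySet()) {
                storage.put(e.getKey(), new ArrayList<>(e.getValue()));
            }
            pending.clear();
            return written;
        }
    }

    public static void main(String[] args) {
        SketchPipeline pipeline = new SketchPipeline();
        pipeline.write(Arrays.asList("a", "b"), "out");
        System.out.println(pipeline.storage.containsKey("out")); // false
        pipeline.run();
        System.out.println(pipeline.storage.get("out"));         // [a, b]
    }
}
```

The point of the model is the ordering: the target is empty after write and populated only after run.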
void write(PCollection<?> collection,
Target target,
Target.WriteMode writeMode)
Write the contents of the PCollection to the given Target,
using the storage format specified by the target and the given
WriteMode for cases where the referenced Target
already exists.
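collection - The collection
target - The target to write to
writeMode - The strategy to use for handling existing outputs
<T> Iterable<T> materialize(PCollection<T> pcollection)
Create the given PCollection and read the data it contains into the returned Collection instance for client use.
pcollection - The PCollection to materialize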
PipelineResult run()
Constructs and executes a series of MapReduce jobs in order to write data to the output targets.
PipelineExecution runAsync()
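Constructs and starts a series of MapReduce jobs in order to write data to the output targets, but returns a ListenableFuture to allow clients to control
job execution.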
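The difference from run() is that the caller gets control back while the jobs are still in progress. A minimal sketch of that pattern follows. It uses the JDK's CompletableFuture as a stand-in for ListenableFuture, and RunAsyncSketch, SketchResult, and the latch that simulates job completion are hypothetical names for illustration only, not part of this API.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

public class RunAsyncSketch {

    /** Hypothetical stand-in for PipelineResult. */
    static class SketchResult {
        final boolean succeeded;
        SketchResult(boolean succeeded) { this.succeeded = succeeded; }
    }

    /**
     * Starts the simulated jobs on a background thread and returns at once.
     * The latch stands in for job completion so the example is deterministic.
     */
    static CompletableFuture<SketchResult> runAsync(CountDownLatch jobsFinished) {
        return CompletableFuture.supplyAsync(() -> {
            try {
                jobsFinished.await();
                return new SketchResult(true);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return new SketchResult(false);
            }
        });
    }

    public static void main(String[] args) throws Exception {
        CountDownLatch jobsFinished = new CountDownLatch(1);
        CompletableFuture<SketchResult> execution = runAsync(jobsFinished);

        // The caller regains control while the jobs are still running.
        System.out.println("done yet? " + execution.isDone());   // done yet? false

        jobsFinished.countDown();  // let the simulated jobs finish
        SketchResult result = execution.get(5, TimeUnit.SECONDS);
        System.out.println("succeeded: " + result.succeeded);    // succeeded: true
    }
}
```

A client that does not need to do other work in the meantime would normally call run() instead and block until the result is available.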
PipelineResult done()
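Run any remaining jobs required to generate outputs and then clean up any intermediate data files that were created in this run or previous calls to run.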
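The description of done() combines two steps: a final run, then removal of intermediate files left by every run so far. The following sketch models that lifecycle with temporary files on the local filesystem. DoneCleanupSketch, SketchPipeline, and the single-file-per-run behaviour are assumptions made for illustration; the real implementation manages intermediate data on the cluster.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

public class DoneCleanupSketch {

    /** Hypothetical stand-in for a pipeline that tracks its intermediate files. */
    static class SketchPipeline {
        // Intermediate files from every run so far, kept until done().
        final List<Path> intermediates = new ArrayList<>();
        private boolean pending = false;

        /** Registers work for the next run. */
        void write() {
            pending = true;
        }

        /** Runs pending work, leaving one intermediate file behind. */
        void run() {
            if (!pending) {
                return;
            }
            try {
                intermediates.add(Files.createTempFile("sketch-intermediate", ".tmp"));
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
            pending = false;
        }

        /** Runs whatever is left, then deletes intermediates from all runs. */
        int done() {
            run();
            int deleted = 0;
            try {
                for (Path p : intermediates) {
                    if (Files.deleteIfExists(p)) {
                        deleted++;
                    }
                }
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
            intermediates.clear();
            return deleted;
        }
    }

    public static void main(String[] args) {
        SketchPipeline pipeline = new SketchPipeline();
        pipeline.write();
        pipeline.run();
        Path first = pipeline.intermediates.get(0);
        System.out.println(Files.exists(first)); // true: run() does not clean up

        pipeline.write();
        System.out.println(pipeline.done());     // 2: files from both runs removed
        System.out.println(Files.exists(first)); // false
    }
}
```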
PCollection<String> readTextFile(String pathName)
A convenience method for reading a text file.
<T> void writeTextFile(PCollection<T> collection,
String pathName)
A convenience method for writing a text file.
void enableDebug()
Turn on debug logging for jobs that are run from this pipeline.