| Modifier and Type | Method and Description |
|---|---|
| `static <T> PCollection<T>` | `collectionOf(Iterable<T> collect)` |
| `static <T> PCollection<T>` | `collectionOf(T... ts)` |
| `PipelineResult` | `done()`: Run any remaining jobs required to generate outputs and then clean up any intermediate data files that were created in this run or previous calls to run. |
| `void` | `enableDebug()`: Turn on debug logging for jobs that are run from this pipeline. |
| `org.apache.hadoop.conf.Configuration` | `getConfiguration()`: Returns the Configuration instance associated with this pipeline. |
| `static org.apache.hadoop.mapreduce.Counters` | `getCounters()` |
| `static Pipeline` | `getInstance()` |
| `String` | `getName()`: Returns the name of this pipeline. |
| `<T> Iterable<T>` | `materialize(PCollection<T> pcollection)`: Create the given PCollection and read the data it contains into the returned Collection instance for client use. |
| `<T> PCollection<T>` | `read(Source<T> source)`: Converts the given Source into a PCollection that is available to jobs run using this Pipeline instance. |
| `<K,V> PTable<K,V>` | `read(TableSource<K,V> source)`: A version of the read method for TableSource instances that map to PTables. |
| `PCollection<String>` | `readTextFile(String pathName)`: A convenience method for reading a text file. |
| `PipelineResult` | `run()`: Constructs and executes a series of MapReduce jobs in order to write data to the output targets. |
| `void` | `setConfiguration(org.apache.hadoop.conf.Configuration conf)`: Set the Configuration to use with this pipeline. |
| `static <S,T> PTable<S,T>` | `tableOf(Iterable<Pair<S,T>> pairs)` |
| `static <S,T> PTable<S,T>` | `tableOf(S s, T t, Object... more)` |
| `static <T> PCollection<T>` | `typedCollectionOf(PType<T> ptype, Iterable<T> collect)` |
| `static <T> PCollection<T>` | `typedCollectionOf(PType<T> ptype, T... ts)` |
| `static <S,T> PTable<S,T>` | `typedTableOf(PTableType<S,T> ptype, Iterable<Pair<S,T>> pairs)` |
| `static <S,T> PTable<S,T>` | `typedTableOf(PTableType<S,T> ptype, S s, T t, Object... more)` |
| `void` | `write(PCollection<?> collection, Target target)`: Write the given collection to the given target on the next pipeline run. |
| `void` | `write(PCollection<?> collection, Target target, Target.WriteMode writeMode)`: Write the contents of the PCollection to the given Target, using the storage format specified by the target and the given WriteMode for cases where the referenced Target already exists. |
| `<T> void` | `writeTextFile(PCollection<T> collection, String pathName)`: A convenience method for writing a text file. |

public static org.apache.hadoop.mapreduce.Counters getCounters()
public static Pipeline getInstance()
public static <T> PCollection<T> collectionOf(T... ts)
public static <T> PCollection<T> collectionOf(Iterable<T> collect)
public static <T> PCollection<T> typedCollectionOf(PType<T> ptype, T... ts)
public static <T> PCollection<T> typedCollectionOf(PType<T> ptype, Iterable<T> collect)
public static <S,T> PTable<S,T> typedTableOf(PTableType<S,T> ptype, S s, T t, Object... more)
public static <S,T> PTable<S,T> typedTableOf(PTableType<S,T> ptype, Iterable<Pair<S,T>> pairs)
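
The `Object... more` parameter of `typedTableOf` (and of `tableOf` in the summary) is not described on this page. One plausible reading, assumed here rather than confirmed by the source, is that it carries further keys and values in alternating order after the first `s, t` pair. The sketch below illustrates that pairing convention using only the JDK. `VarargsPairs` and `pairsOf` are hypothetical names, and `Map.Entry` stands in for Crunch's `Pair`.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; not part of Crunch.
public class VarargsPairs {

    // Builds the pair (s, t) plus any further pairs taken from `more`,
    // assuming `more` alternates key, value, key, value, ...
    @SuppressWarnings("unchecked")
    public static <S, T> List<Map.Entry<S, T>> pairsOf(S s, T t, Object... more) {
        if (more.length % 2 != 0) {
            throw new IllegalArgumentException("more must hold complete key/value pairs");
        }
        List<Map.Entry<S, T>> pairs = new ArrayList<>();
        pairs.add(new SimpleEntry<>(s, t));
        for (int i = 0; i < more.length; i += 2) {
            // Unchecked casts: the compiler cannot verify the types inside Object...
            pairs.add(new SimpleEntry<>((S) more[i], (T) more[i + 1]));
        }
        return pairs;
    }

    public static void main(String[] args) {
        System.out.println(pairsOf("a", 1, "b", 2)); // prints [a=1, b=2]
    }
}
```

Because the extra arguments are typed as `Object`, a key or value of the wrong type is not caught at compile time. The `Iterable<Pair<S,T>>` overloads keep the element types checked by the compiler.
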
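
public void setConfiguration(org.apache.hadoop.conf.Configuration conf)
Description copied from interface: Pipeline
Set the Configuration to use with this pipeline.
Specified by: setConfiguration in interface Pipeline

public org.apache.hadoop.conf.Configuration getConfiguration()
Description copied from interface: Pipeline
Returns the Configuration instance associated with this pipeline.
Specified by: getConfiguration in interface Pipeline

public <T> PCollection<T> read(Source<T> source)
Description copied from interface: Pipeline
Converts the given Source into a PCollection that is available to jobs run using this Pipeline instance.

public <K,V> PTable<K,V> read(TableSource<K,V> source)
Description copied from interface: Pipeline
A version of the read method for TableSource instances that map to PTables.

public void write(PCollection<?> collection, Target target)
Description copied from interface: Pipeline
Write the given collection to the given target on the next pipeline run. If the target already exists, it is handled according to the WriteMode.DEFAULT rule for the given Target.

public void write(PCollection<?> collection, Target target, Target.WriteMode writeMode)
Description copied from interface: Pipeline
Write the contents of the PCollection to the given Target, using the storage format specified by the target and the given WriteMode for cases where the referenced Target already exists.

public PCollection<String> readTextFile(String pathName)
Description copied from interface: Pipeline
A convenience method for reading a text file.
Specified by: readTextFile in interface Pipeline

public <T> void writeTextFile(PCollection<T> collection, String pathName)
Description copied from interface: Pipeline
A convenience method for writing a text file.
Specified by: writeTextFile in interface Pipeline

public <T> Iterable<T> materialize(PCollection<T> pcollection)
Description copied from interface: Pipeline
Create the given PCollection and read the data it contains into the returned Collection instance for client use.
Specified by: materialize in interface Pipeline
Parameters: pcollection - The PCollection to materialize

public PipelineResult run()
Description copied from interface: Pipeline
Constructs and executes a series of MapReduce jobs in order to write data to the output targets.

public PipelineResult done()
Description copied from interface: Pipeline
Run any remaining jobs required to generate outputs and then clean up any intermediate data files that were created in this run or previous calls to run.

public void enableDebug()
Description copied from interface: Pipeline
Turn on debug logging for jobs that are run from this pipeline.
Specified by: enableDebug in interface Pipeline

Copyright © 2013 The Apache Software Foundation. All Rights Reserved.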