public class To extends Object
Static factory methods for creating common Target types.
The To class is intended to be used as part of a literate API
for writing the output of Crunch pipelines to common file types. We can use
the Target objects created by the factory methods in the To
class with either the write method on the Pipeline class or
the convenience write method on PCollection and PTable
instances.
Pipeline pipeline = new MRPipeline(this.getClass());
...
// Write a PCollection<String> to a text file:
PCollection<String> words = ...;
pipeline.write(words, To.textFile("/put/my/words/here"));
// Write a PTable<Text, Text> to a sequence file:
PTable<Text, Text> textToText = ...;
textToText.write(To.sequenceFile("/words/to/words"));
// Write a PCollection<MyAvroObject> to an Avro data file:
PCollection<MyAvroObject> objects = ...;
objects.write(To.avroFile("/my/avro/files"));
// Write a PTable to a custom FileOutputFormat:
PTable<KeyWritable, ValueWritable> custom = ...;
pipeline.write(custom, To.formattedFile("/custom", MyFileFormat.class));
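Each factory method above comes in two overloads, one taking a path name and one taking a Path. The following is a minimal, self-contained sketch of that overload pattern, not Crunch code. The names `SketchTarget`, `SketchFileTarget`, and `ToSketch` are hypothetical stand-ins, and `java.nio.file.Path` is used in place of `org.apache.hadoop.fs.Path` so the sketch compiles without Hadoop on the classpath. Having the String overload delegate to the Path overload is an assumption made for illustration, not a statement about how To is implemented.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical stand-in for Crunch's Target; NOT the real interface.
interface SketchTarget {
    // Returns a short description of the output format and location.
    String describe();
}

// Hypothetical target that only records the format and path it was given.
final class SketchFileTarget implements SketchTarget {
    private final String format;
    private final Path path;

    SketchFileTarget(String format, Path path) {
        this.format = format;
        this.path = path;
    }

    @Override
    public String describe() {
        return format + ":" + path;
    }
}

// Illustrative factory class written in the style of To (assumed design).
final class ToSketch {
    // String overload: converts the path name and delegates to the Path overload.
    static SketchTarget textFile(String pathName) {
        return textFile(Paths.get(pathName));
    }

    // Path overload: the single place where the target is constructed.
    static SketchTarget textFile(Path path) {
        return new SketchFileTarget("text", path);
    }

    // Same pattern for a second output format.
    static SketchTarget sequenceFile(String pathName) {
        return sequenceFile(Paths.get(pathName));
    }

    static SketchTarget sequenceFile(Path path) {
        return new SketchFileTarget("sequence", path);
    }
}
```

In this sketch, `ToSketch.textFile("words")` and `ToSketch.textFile(Paths.get("words"))` produce equivalent targets, so callers can pass whichever form they already have.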
| Constructor and Description |
|---|
| To() |
| Modifier and Type | Method and Description |
|---|---|
| `static Target` | `avroFile(org.apache.hadoop.fs.Path path)`: Creates a Target at the given Path that writes data to Avro files. |
| `static Target` | `avroFile(String pathName)`: Creates a Target at the given path name that writes data to Avro files. |
| `static <K extends org.apache.hadoop.io.Writable,V extends org.apache.hadoop.io.Writable> Target` | `formattedFile(org.apache.hadoop.fs.Path path, Class<? extends org.apache.hadoop.mapreduce.lib.output.FileOutputFormat<K,V>> formatClass)`: Creates a Target at the given Path that writes data to a custom FileOutputFormat. |
| `static <K extends org.apache.hadoop.io.Writable,V extends org.apache.hadoop.io.Writable> Target` | `formattedFile(String pathName, Class<? extends org.apache.hadoop.mapreduce.lib.output.FileOutputFormat<K,V>> formatClass)`: Creates a Target at the given path name that writes data to a custom FileOutputFormat. |
| `static Target` | `sequenceFile(org.apache.hadoop.fs.Path path)`: Creates a Target at the given Path that writes data to SequenceFiles. |
| `static Target` | `sequenceFile(String pathName)`: Creates a Target at the given path name that writes data to SequenceFiles. |
| `static Target` | `textFile(org.apache.hadoop.fs.Path path)`: Creates a Target at the given Path that writes data to text files. |
| `static Target` | `textFile(String pathName)`: Creates a Target at the given path name that writes data to text files. |
public static <K extends org.apache.hadoop.io.Writable,V extends org.apache.hadoop.io.Writable> Target formattedFile(String pathName, Class<? extends org.apache.hadoop.mapreduce.lib.output.FileOutputFormat<K,V>> formatClass)
Creates a Target at the given path name that writes data to a custom FileOutputFormat.
Parameters:
pathName - The name of the path to write the data to on the filesystem
formatClass - The FileOutputFormat<K, V> to write the data to
Returns:
A Target instance

public static <K extends org.apache.hadoop.io.Writable,V extends org.apache.hadoop.io.Writable> Target formattedFile(org.apache.hadoop.fs.Path path, Class<? extends org.apache.hadoop.mapreduce.lib.output.FileOutputFormat<K,V>> formatClass)
Creates a Target at the given Path that writes data to a custom FileOutputFormat.
Parameters:
path - The Path to write the data to
formatClass - The FileOutputFormat to write the data to
Returns:
A Target instance

public static Target avroFile(String pathName)
Creates a Target at the given path name that writes data to Avro files. The PType for the written data must be for Avro records.
Parameters:
pathName - The name of the path to write the data to on the filesystem
Returns:
A Target instance

public static Target avroFile(org.apache.hadoop.fs.Path path)
Creates a Target at the given Path that writes data to Avro files. The PType for the written data must be for Avro records.
Parameters:
path - The Path to write the data to
Returns:
A Target instance

public static Target sequenceFile(String pathName)
Creates a Target at the given path name that writes data to SequenceFiles.
Parameters:
pathName - The name of the path to write the data to on the filesystem
Returns:
A Target instance

public static Target sequenceFile(org.apache.hadoop.fs.Path path)
Creates a Target at the given Path that writes data to SequenceFiles.
Parameters:
path - The Path to write the data to
Returns:
A Target instance

public static Target textFile(String pathName)
Creates a Target at the given path name that writes data to text files.
Parameters:
pathName - The name of the path to write the data to on the filesystem
Returns:
A Target instance

public static Target textFile(org.apache.hadoop.fs.Path path)
Creates a Target at the given Path that writes data to text files.
Parameters:
path - The Path to write the data to
Returns:
A Target instance

Copyright © 2015 The Apache Software Foundation. All Rights Reserved.