java.lang.Object
  org.apache.crunch.io.At
public class At
Static factory methods for creating common SourceTarget types, which may be treated as both a Source and a Target.
The At methods are analogous to the From and To factory methods, but are used for storing intermediate outputs that need to be passed from one run of a MapReduce pipeline to another run. The SourceTarget object acts as both a Source and a Target, which enables it to provide this functionality.
Pipeline pipeline = new MRPipeline(this.getClass());
// Create our intermediate storage location
SourceTarget
The SourceTarget abstraction is useful when we care about reading the intermediate outputs of a pipeline as well as the final results.
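The write-then-read pattern described in the class description can be sketched without a cluster. The following is a minimal, self-contained model and not Crunch code: `SketchSource`, `SketchTarget`, `SketchSourceTarget`, `InMemorySourceTarget`, and `TwoPhaseSketch` are hypothetical stand-ins declared here only to show why a single object that is both readable and writable lets a second run pick up where the first one stopped.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical stand-ins for illustration only; these are NOT the Crunch interfaces.
interface SketchSource<T> { List<T> read(); }
interface SketchTarget<T> { void write(List<T> records); }
interface SketchSourceTarget<T> extends SketchSource<T>, SketchTarget<T> { }

// Keeps records in memory where a real SourceTarget would point at files.
final class InMemorySourceTarget<T> implements SketchSourceTarget<T> {
    private final List<T> stored = new ArrayList<>();

    @Override
    public void write(List<T> records) {
        stored.clear();
        stored.addAll(records);
    }

    @Override
    public List<T> read() {
        return new ArrayList<>(stored);
    }
}

final class TwoPhaseSketch {
    // Phase 1: produce intermediate output and store it through the target side.
    static void phaseOne(List<String> input, SketchTarget<String> out) {
        List<String> cleaned = new ArrayList<>();
        for (String line : input) {
            String trimmed = line.trim();
            if (!trimmed.isEmpty()) {
                cleaned.add(trimmed);
            }
        }
        out.write(cleaned);
    }

    // Phase 2: a later run consumes the same location through the source side.
    static List<String> phaseTwo(SketchSource<String> in) {
        List<String> upper = new ArrayList<>();
        for (String line : in.read()) {
            upper.add(line.toUpperCase(Locale.ROOT));
        }
        return upper;
    }

    // The same object is handed to both phases, first as a target, then as a source.
    static List<String> run(List<String> input) {
        SketchSourceTarget<String> intermediate = new InMemorySourceTarget<>();
        phaseOne(input, intermediate);
        return phaseTwo(intermediate);
    }
}
```

In an actual Crunch pipeline the intermediate object would come from one of the factory methods on this class, for example `At.textFile(...)`, and the two phases would typically go through the pipeline's own write, run, and read calls rather than the toy `read()` and `write()` methods assumed here.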
Constructor Summary | |
---|---|
At()
|
Method Summary | | |
---|---|---|
static SourceTarget<org.apache.avro.generic.GenericData.Record> | avroFile(org.apache.hadoop.fs.Path path) | Creates a SourceTarget<GenericData.Record> by reading the schema of the Avro file at the given path. |
static <T extends org.apache.avro.specific.SpecificRecord> SourceTarget<T> | avroFile(org.apache.hadoop.fs.Path path, Class<T> avroClass) | Creates a SourceTarget<T> instance from the Avro file(s) at the given Path. |
static SourceTarget<org.apache.avro.generic.GenericData.Record> | avroFile(org.apache.hadoop.fs.Path path, org.apache.hadoop.conf.Configuration conf) | Creates a SourceTarget<GenericData.Record> by reading the schema of the Avro file at the given path using the FileSystem information contained in the given Configuration instance. |
static <T> SourceTarget<T> | avroFile(org.apache.hadoop.fs.Path path, PType<T> ptype) | Creates a SourceTarget<T> instance from the Avro file(s) at the given Path. |
static SourceTarget<org.apache.avro.generic.GenericData.Record> | avroFile(String pathName) | Creates a SourceTarget<GenericData.Record> by reading the schema of the Avro file at the given path. |
static <T extends org.apache.avro.specific.SpecificRecord> SourceTarget<T> | avroFile(String pathName, Class<T> avroClass) | Creates a SourceTarget<T> instance from the Avro file(s) at the given path name. |
static <T> SourceTarget<T> | avroFile(String pathName, PType<T> ptype) | Creates a SourceTarget<T> instance from the Avro file(s) at the given path name. |
static <K extends org.apache.hadoop.io.Writable,V extends org.apache.hadoop.io.Writable> TableSourceTarget<K,V> | sequenceFile(org.apache.hadoop.fs.Path path, Class<K> keyClass, Class<V> valueClass) | Creates a TableSourceTarget<K, V> instance from the SequenceFile(s) at the given Path from the key-value pairs in the SequenceFile(s). |
static <T extends org.apache.hadoop.io.Writable> SourceTarget<T> | sequenceFile(org.apache.hadoop.fs.Path path, Class<T> valueClass) | Creates a SourceTarget<T> instance from the SequenceFile(s) at the given Path from the value field of each key-value pair in the SequenceFile(s). |
static <K,V> TableSourceTarget<K,V> | sequenceFile(org.apache.hadoop.fs.Path path, PType<K> keyType, PType<V> valueType) | Creates a TableSourceTarget<K, V> instance from the SequenceFile(s) at the given Path from the key-value pairs in the SequenceFile(s). |
static <T> SourceTarget<T> | sequenceFile(org.apache.hadoop.fs.Path path, PType<T> ptype) | Creates a SourceTarget<T> instance from the SequenceFile(s) at the given Path from the value field of each key-value pair in the SequenceFile(s). |
static <K extends org.apache.hadoop.io.Writable,V extends org.apache.hadoop.io.Writable> TableSourceTarget<K,V> | sequenceFile(String pathName, Class<K> keyClass, Class<V> valueClass) | Creates a TableSourceTarget<K, V> instance from the SequenceFile(s) at the given path name from the key-value pairs in the SequenceFile(s). |
static <T extends org.apache.hadoop.io.Writable> SourceTarget<T> | sequenceFile(String pathName, Class<T> valueClass) | Creates a SourceTarget<T> instance from the SequenceFile(s) at the given path name from the value field of each key-value pair in the SequenceFile(s). |
static <K,V> TableSourceTarget<K,V> | sequenceFile(String pathName, PType<K> keyType, PType<V> valueType) | Creates a TableSourceTarget<K, V> instance from the SequenceFile(s) at the given path name from the key-value pairs in the SequenceFile(s). |
static <T> SourceTarget<T> | sequenceFile(String pathName, PType<T> ptype) | Creates a SourceTarget<T> instance from the SequenceFile(s) at the given path name from the value field of each key-value pair in the SequenceFile(s). |
static SourceTarget<String> | textFile(org.apache.hadoop.fs.Path path) | Creates a SourceTarget<String> instance for the text file(s) at the given Path. |
static <T> SourceTarget<T> | textFile(org.apache.hadoop.fs.Path path, PType<T> ptype) | Creates a SourceTarget<T> instance for the text file(s) at the given Path using the provided PType<T> to convert the input text. |
static SourceTarget<String> | textFile(String pathName) | Creates a SourceTarget<String> instance for the text file(s) at the given path name. |
static <T> SourceTarget<T> | textFile(String pathName, PType<T> ptype) | Creates a SourceTarget<T> instance for the text file(s) at the given path name using the provided PType<T> to convert the input text. |
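The summary distinguishes two kinds of `sequenceFile` overloads: those taking a single value type expose only the value field of each key-value pair, while those taking both a key type and a value type expose the pairs themselves. A toy model of that difference follows. It uses plain Java collections instead of Hadoop SequenceFiles, and the class and method names (`SequenceViewSketch`, `sampleRecords`, `valuesOnly`, `keyValuePairs`) are hypothetical.

```java
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

final class SequenceViewSketch {
    // Stand-in for the records stored in a SequenceFile: ordered key-value pairs.
    static List<Map.Entry<String, Integer>> sampleRecords() {
        List<Map.Entry<String, Integer>> records = new ArrayList<>();
        records.add(new SimpleImmutableEntry<>("apple", 3));
        records.add(new SimpleImmutableEntry<>("pear", 5));
        return records;
    }

    // Models the single-type overloads: keys are dropped, values are kept.
    static <K, V> List<V> valuesOnly(List<Map.Entry<K, V>> records) {
        List<V> values = new ArrayList<>();
        for (Map.Entry<K, V> record : records) {
            values.add(record.getValue());
        }
        return values;
    }

    // Models the key/value overloads: each pair is preserved unchanged.
    static <K, V> List<Map.Entry<K, V>> keyValuePairs(List<Map.Entry<K, V>> records) {
        return new ArrayList<>(records);
    }
}
```

With the sample data, `valuesOnly` yields the two integers and `keyValuePairs` yields both entries, which corresponds to choosing between a `SourceTarget<T>` and a `TableSourceTarget<K, V>` in the table above.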
Methods inherited from class java.lang.Object |
---|
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
Constructor Detail |
---|
public At()
Method Detail |
---|
public static <T extends org.apache.avro.specific.SpecificRecord> SourceTarget<T> avroFile(String pathName, Class<T> avroClass)
Creates a SourceTarget<T> instance from the Avro file(s) at the given path name.
Parameters:
pathName - The name of the path to the data on the filesystem
avroClass - The subclass of SpecificRecord to use for the Avro file
Returns:
A new SourceTarget<T> instance

public static <T extends org.apache.avro.specific.SpecificRecord> SourceTarget<T> avroFile(org.apache.hadoop.fs.Path path, Class<T> avroClass)
Creates a SourceTarget<T> instance from the Avro file(s) at the given Path.
Parameters:
path - The Path to the data
avroClass - The subclass of SpecificRecord to use for the Avro file
Returns:
A new SourceTarget<T> instance

public static SourceTarget<org.apache.avro.generic.GenericData.Record> avroFile(String pathName)
Creates a SourceTarget<GenericData.Record> by reading the schema of the Avro file at the given path. If the path is a directory, the schema of a file in the directory will be used to determine the schema to use.
Parameters:
pathName - The name of the path to the data on the filesystem
Returns:
A new SourceTarget<GenericData.Record> instance

public static SourceTarget<org.apache.avro.generic.GenericData.Record> avroFile(org.apache.hadoop.fs.Path path)
Creates a SourceTarget<GenericData.Record> by reading the schema of the Avro file at the given path. If the path is a directory, the schema of a file in the directory will be used to determine the schema to use.
Parameters:
path - The path to the data on the filesystem
Returns:
A new SourceTarget<GenericData.Record> instance

public static SourceTarget<org.apache.avro.generic.GenericData.Record> avroFile(org.apache.hadoop.fs.Path path, org.apache.hadoop.conf.Configuration conf)
Creates a SourceTarget<GenericData.Record> by reading the schema of the Avro file at the given path using the FileSystem information contained in the given Configuration instance. If the path is a directory, the schema of a file in the directory will be used to determine the schema to use.
Parameters:
path - The path to the data on the filesystem
conf - The configuration information
Returns:
A new SourceTarget<GenericData.Record> instance

public static <T> SourceTarget<T> avroFile(String pathName, PType<T> ptype)
Creates a SourceTarget<T> instance from the Avro file(s) at the given path name.
Parameters:
pathName - The name of the path to the data on the filesystem
ptype - The PType for the Avro records
Returns:
A new SourceTarget<T> instance

public static <T> SourceTarget<T> avroFile(org.apache.hadoop.fs.Path path, PType<T> ptype)
Creates a SourceTarget<T> instance from the Avro file(s) at the given Path.
Parameters:
path - The Path to the data
ptype - The PType for the Avro records
Returns:
A new SourceTarget<T> instance

public static <T extends org.apache.hadoop.io.Writable> SourceTarget<T> sequenceFile(String pathName, Class<T> valueClass)
Creates a SourceTarget<T> instance from the SequenceFile(s) at the given path name from the value field of each key-value pair in the SequenceFile(s).
Parameters:
pathName - The name of the path to the data on the filesystem
valueClass - The Writable type for the value of the SequenceFile entry
Returns:
A new SourceTarget<T> instance

public static <T extends org.apache.hadoop.io.Writable> SourceTarget<T> sequenceFile(org.apache.hadoop.fs.Path path, Class<T> valueClass)
Creates a SourceTarget<T> instance from the SequenceFile(s) at the given Path from the value field of each key-value pair in the SequenceFile(s).
Parameters:
path - The Path to the data
valueClass - The Writable type for the value of the SequenceFile entry
Returns:
A new SourceTarget<T> instance

public static <T> SourceTarget<T> sequenceFile(String pathName, PType<T> ptype)
Creates a SourceTarget<T> instance from the SequenceFile(s) at the given path name from the value field of each key-value pair in the SequenceFile(s).
Parameters:
pathName - The name of the path to the data on the filesystem
ptype - The PType for the value of the SequenceFile entry
Returns:
A new SourceTarget<T> instance

public static <T> SourceTarget<T> sequenceFile(org.apache.hadoop.fs.Path path, PType<T> ptype)
Creates a SourceTarget<T> instance from the SequenceFile(s) at the given Path from the value field of each key-value pair in the SequenceFile(s).
Parameters:
path - The Path to the data
ptype - The PType for the value of the SequenceFile entry
Returns:
A new SourceTarget<T> instance

public static <K extends org.apache.hadoop.io.Writable,V extends org.apache.hadoop.io.Writable> TableSourceTarget<K,V> sequenceFile(String pathName, Class<K> keyClass, Class<V> valueClass)
Creates a TableSourceTarget<K, V> instance from the SequenceFile(s) at the given path name from the key-value pairs in the SequenceFile(s).
Parameters:
pathName - The name of the path to the data on the filesystem
keyClass - The Writable type for the key of the SequenceFile entry
valueClass - The Writable type for the value of the SequenceFile entry
Returns:
A new TableSourceTarget<K, V> instance

public static <K extends org.apache.hadoop.io.Writable,V extends org.apache.hadoop.io.Writable> TableSourceTarget<K,V> sequenceFile(org.apache.hadoop.fs.Path path, Class<K> keyClass, Class<V> valueClass)
Creates a TableSourceTarget<K, V> instance from the SequenceFile(s) at the given Path from the key-value pairs in the SequenceFile(s).
Parameters:
path - The Path to the data
keyClass - The Writable type for the key of the SequenceFile entry
valueClass - The Writable type for the value of the SequenceFile entry
Returns:
A new TableSourceTarget<K, V> instance

public static <K,V> TableSourceTarget<K,V> sequenceFile(String pathName, PType<K> keyType, PType<V> valueType)
Creates a TableSourceTarget<K, V> instance from the SequenceFile(s) at the given path name from the key-value pairs in the SequenceFile(s).
Parameters:
pathName - The name of the path to the data on the filesystem
keyType - The PType for the key of the SequenceFile entry
valueType - The PType for the value of the SequenceFile entry
Returns:
A new TableSourceTarget<K, V> instance

public static <K,V> TableSourceTarget<K,V> sequenceFile(org.apache.hadoop.fs.Path path, PType<K> keyType, PType<V> valueType)
Creates a TableSourceTarget<K, V> instance from the SequenceFile(s) at the given Path from the key-value pairs in the SequenceFile(s).
Parameters:
path - The Path to the data
keyType - The PType for the key of the SequenceFile entry
valueType - The PType for the value of the SequenceFile entry
Returns:
A new TableSourceTarget<K, V> instance

public static SourceTarget<String> textFile(String pathName)
Creates a SourceTarget<String> instance for the text file(s) at the given path name.
Parameters:
pathName - The name of the path to the data on the filesystem
Returns:
A new SourceTarget<String> instance

public static SourceTarget<String> textFile(org.apache.hadoop.fs.Path path)
Creates a SourceTarget<String> instance for the text file(s) at the given Path.
Parameters:
path - The Path to the data
Returns:
A new SourceTarget<String> instance

public static <T> SourceTarget<T> textFile(String pathName, PType<T> ptype)
Creates a SourceTarget<T> instance for the text file(s) at the given path name using the provided PType<T> to convert the input text.
Parameters:
pathName - The name of the path to the data on the filesystem
ptype - The PType<T> to use to process the input text
Returns:
A new SourceTarget<T> instance

public static <T> SourceTarget<T> textFile(org.apache.hadoop.fs.Path path, PType<T> ptype)
Creates a SourceTarget<T> instance for the text file(s) at the given Path using the provided PType<T> to convert the input text.
Parameters:
path - The Path to the data
ptype - The PType<T> to use to process the input text
Returns:
A new SourceTarget<T> instance
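The two `textFile` flavors differ only in whether the input text is handed back as `String` or converted through the supplied `PType<T>`. The sketch below models that distinction in plain Java. It is an illustration under assumptions, not Crunch's implementation: `TextConversionSketch` and `readLines` are hypothetical names, and a `java.util.function.Function<String, T>` stands in for the conversion role that the `PType<T>` plays.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

final class TextConversionSketch {
    // Models textFile(path): each line is surfaced unchanged as a String.
    static List<String> readLines(List<String> lines) {
        return new ArrayList<>(lines);
    }

    // Models textFile(path, ptype): the converter (a stand-in for the PType's
    // conversion step, which is an assumption of this sketch) is applied per line.
    static <T> List<T> readLines(List<String> lines, Function<String, T> converter) {
        List<T> converted = new ArrayList<>();
        for (String line : lines) {
            converted.add(converter.apply(line));
        }
        return converted;
    }
}
```

For example, calling `TextConversionSketch.readLines(lines, Integer::parseInt)` on lines that each hold one integer returns a `List<Integer>`, whereas the one-argument form returns the lines as they are.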