This project has retired. For details please refer to its Attic page.
From (Apache Crunch 0.10.0 API)

org.apache.crunch.io
Class From

java.lang.Object
  extended by org.apache.crunch.io.From

public class From
extends Object

Static factory methods for creating common Source types.

The From class is intended to provide a literate API for creating Crunch pipelines from common input file types.

   Pipeline pipeline = new MRPipeline(this.getClass());

   // Reference the lines of a text file by wrapping the TextInputFormat class.
   PCollection<String> lines = pipeline.read(From.textFile("/path/to/myfiles"));

   // Reference entries from a sequence file where the key is a LongWritable and the
   // value is a custom Writable class.
   PTable<LongWritable, MyWritable> table = pipeline.read(From.sequenceFile(
       "/path/to/seqfiles", LongWritable.class, MyWritable.class));

   // Reference the records from an Avro file, where MyAvroObject implements Avro's
   // SpecificRecord interface.
   PCollection<MyAvroObject> myObjects = pipeline.read(From.avroFile("/path/to/avrofiles",
       MyAvroObject.class));

   // Reference the key-value pairs from a custom extension of FileInputFormat:
   PTable<KeyWritable, ValueWritable> custom = pipeline.read(From.formattedFile(
       "/custom", MyFileInputFormat.class, KeyWritable.class, ValueWritable.class));


Constructor Summary
From()
           
 
Method Summary
static Source<org.apache.avro.generic.GenericData.Record> avroFile(List<org.apache.hadoop.fs.Path> paths)
          Creates a Source<GenericData.Record> by reading the schema of the Avro file at the given paths.
static <T extends org.apache.avro.specific.SpecificRecord> Source<T> avroFile(List<org.apache.hadoop.fs.Path> paths, Class<T> avroClass)
          Creates a Source<T> instance from the Avro file(s) at the given Paths.
static Source<org.apache.avro.generic.GenericData.Record> avroFile(List<org.apache.hadoop.fs.Path> paths, org.apache.hadoop.conf.Configuration conf)
          Creates a Source<GenericData.Record> by reading the schema of the Avro file at the given paths using the FileSystem information contained in the given Configuration instance.
static <T> Source<T> avroFile(List<org.apache.hadoop.fs.Path> paths, PType<T> ptype)
          Creates a Source<T> instance from the Avro file(s) at the given Paths.
static Source<org.apache.avro.generic.GenericData.Record> avroFile(org.apache.hadoop.fs.Path path)
          Creates a Source<GenericData.Record> by reading the schema of the Avro file at the given path.
static <T extends org.apache.avro.specific.SpecificRecord> Source<T> avroFile(org.apache.hadoop.fs.Path path, Class<T> avroClass)
          Creates a Source<T> instance from the Avro file(s) at the given Path.
static Source<org.apache.avro.generic.GenericData.Record> avroFile(org.apache.hadoop.fs.Path path, org.apache.hadoop.conf.Configuration conf)
          Creates a Source<GenericData.Record> by reading the schema of the Avro file at the given path using the FileSystem information contained in the given Configuration instance.
static <T> Source<T> avroFile(org.apache.hadoop.fs.Path path, PType<T> ptype)
          Creates a Source<T> instance from the Avro file(s) at the given Path.
static Source<org.apache.avro.generic.GenericData.Record> avroFile(String pathName)
          Creates a Source<GenericData.Record> by reading the schema of the Avro file at the given path.
static <T extends org.apache.avro.specific.SpecificRecord> Source<T> avroFile(String pathName, Class<T> avroClass)
          Creates a Source<T> instance from the Avro file(s) at the given path name.
static <T> Source<T> avroFile(String pathName, PType<T> ptype)
          Creates a Source<T> instance from the Avro file(s) at the given path name.
static <K,V> TableSource<K,V> formattedFile(List<org.apache.hadoop.fs.Path> paths, Class<? extends org.apache.hadoop.mapreduce.lib.input.FileInputFormat<?,?>> formatClass, PType<K> keyType, PType<V> valueType)
          Creates a TableSource<K, V> for reading data from files that have custom FileInputFormat implementations not covered by the provided TableSource and Source factory methods.
static <K extends org.apache.hadoop.io.Writable,V extends org.apache.hadoop.io.Writable> TableSource<K,V> formattedFile(List<org.apache.hadoop.fs.Path> paths, Class<? extends org.apache.hadoop.mapreduce.lib.input.FileInputFormat<K,V>> formatClass, Class<K> keyClass, Class<V> valueClass)
          Creates a TableSource<K, V> for reading data from files that have custom FileInputFormat<K, V> implementations not covered by the provided TableSource and Source factory methods.
static <K,V> TableSource<K,V> formattedFile(org.apache.hadoop.fs.Path path, Class<? extends org.apache.hadoop.mapreduce.lib.input.FileInputFormat<?,?>> formatClass, PType<K> keyType, PType<V> valueType)
          Creates a TableSource<K, V> for reading data from files that have custom FileInputFormat implementations not covered by the provided TableSource and Source factory methods.
static <K extends org.apache.hadoop.io.Writable,V extends org.apache.hadoop.io.Writable> TableSource<K,V> formattedFile(org.apache.hadoop.fs.Path path, Class<? extends org.apache.hadoop.mapreduce.lib.input.FileInputFormat<K,V>> formatClass, Class<K> keyClass, Class<V> valueClass)
          Creates a TableSource<K, V> for reading data from files that have custom FileInputFormat<K, V> implementations not covered by the provided TableSource and Source factory methods.
static <K,V> TableSource<K,V> formattedFile(String pathName, Class<? extends org.apache.hadoop.mapreduce.lib.input.FileInputFormat<?,?>> formatClass, PType<K> keyType, PType<V> valueType)
          Creates a TableSource<K, V> for reading data from files that have custom FileInputFormat implementations not covered by the provided TableSource and Source factory methods.
static <K extends org.apache.hadoop.io.Writable,V extends org.apache.hadoop.io.Writable> TableSource<K,V> formattedFile(String pathName, Class<? extends org.apache.hadoop.mapreduce.lib.input.FileInputFormat<K,V>> formatClass, Class<K> keyClass, Class<V> valueClass)
          Creates a TableSource<K, V> for reading data from files that have custom FileInputFormat<K, V> implementations not covered by the provided TableSource and Source factory methods.
static <K extends org.apache.hadoop.io.Writable,V extends org.apache.hadoop.io.Writable> TableSource<K,V> sequenceFile(List<org.apache.hadoop.fs.Path> paths, Class<K> keyClass, Class<V> valueClass)
          Creates a TableSource<K, V> instance for the SequenceFile(s) at the given Paths.
static <T extends org.apache.hadoop.io.Writable> Source<T> sequenceFile(List<org.apache.hadoop.fs.Path> paths, Class<T> valueClass)
          Creates a Source<T> instance that reads the value field of each key-value pair in the SequenceFile(s) at the given Paths.
static <K,V> TableSource<K,V> sequenceFile(List<org.apache.hadoop.fs.Path> paths, PType<K> keyType, PType<V> valueType)
          Creates a TableSource<K, V> instance for the SequenceFile(s) at the given Paths.
static <T> Source<T> sequenceFile(List<org.apache.hadoop.fs.Path> paths, PType<T> ptype)
          Creates a Source<T> instance that reads the value field of each key-value pair in the SequenceFile(s) at the given Paths.
static <K extends org.apache.hadoop.io.Writable,V extends org.apache.hadoop.io.Writable> TableSource<K,V> sequenceFile(org.apache.hadoop.fs.Path path, Class<K> keyClass, Class<V> valueClass)
          Creates a TableSource<K, V> instance for the SequenceFile(s) at the given Path.
static <T extends org.apache.hadoop.io.Writable> Source<T> sequenceFile(org.apache.hadoop.fs.Path path, Class<T> valueClass)
          Creates a Source<T> instance that reads the value field of each key-value pair in the SequenceFile(s) at the given Path.
static <K,V> TableSource<K,V> sequenceFile(org.apache.hadoop.fs.Path path, PType<K> keyType, PType<V> valueType)
          Creates a TableSource<K, V> instance for the SequenceFile(s) at the given Path.
static <T> Source<T> sequenceFile(org.apache.hadoop.fs.Path path, PType<T> ptype)
          Creates a Source<T> instance that reads the value field of each key-value pair in the SequenceFile(s) at the given Path.
static <K extends org.apache.hadoop.io.Writable,V extends org.apache.hadoop.io.Writable> TableSource<K,V> sequenceFile(String pathName, Class<K> keyClass, Class<V> valueClass)
          Creates a TableSource<K, V> instance for the SequenceFile(s) at the given path name.
static <T extends org.apache.hadoop.io.Writable> Source<T> sequenceFile(String pathName, Class<T> valueClass)
          Creates a Source<T> instance that reads the value field of each key-value pair in the SequenceFile(s) at the given path name.
static <K,V> TableSource<K,V> sequenceFile(String pathName, PType<K> keyType, PType<V> valueType)
          Creates a TableSource<K, V> instance for the SequenceFile(s) at the given path name.
static <T> Source<T> sequenceFile(String pathName, PType<T> ptype)
          Creates a Source<T> instance that reads the value field of each key-value pair in the SequenceFile(s) at the given path name.
static Source<String> textFile(List<org.apache.hadoop.fs.Path> paths)
          Creates a Source<String> instance for the text file(s) at the given Paths.
static <T> Source<T> textFile(List<org.apache.hadoop.fs.Path> paths, PType<T> ptype)
          Creates a Source<T> instance for the text file(s) at the given Paths using the provided PType<T> to convert the input text.
static Source<String> textFile(org.apache.hadoop.fs.Path path)
          Creates a Source<String> instance for the text file(s) at the given Path.
static <T> Source<T> textFile(org.apache.hadoop.fs.Path path, PType<T> ptype)
          Creates a Source<T> instance for the text file(s) at the given Path using the provided PType<T> to convert the input text.
static Source<String> textFile(String pathName)
          Creates a Source<String> instance for the text file(s) at the given path name.
static <T> Source<T> textFile(String pathName, PType<T> ptype)
          Creates a Source<T> instance for the text file(s) at the given path name using the provided PType<T> to convert the input text.
 
Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

From

public From()
Method Detail

formattedFile

public static <K extends org.apache.hadoop.io.Writable,V extends org.apache.hadoop.io.Writable> TableSource<K,V> formattedFile(String pathName,
                                                                                                                               Class<? extends org.apache.hadoop.mapreduce.lib.input.FileInputFormat<K,V>> formatClass,
                                                                                                                               Class<K> keyClass,
                                                                                                                               Class<V> valueClass)
Creates a TableSource<K, V> for reading data from files that have custom FileInputFormat<K, V> implementations not covered by the provided TableSource and Source factory methods.

Parameters:
pathName - The name of the path to the data on the filesystem
formatClass - The FileInputFormat implementation
keyClass - The Writable to use for the key
valueClass - The Writable to use for the value
Returns:
A new TableSource<K, V> instance
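
As a rough illustration of this Writable-based overload, the sketch below wraps Hadoop's KeyValueTextInputFormat, which emits Text keys and Text values. The class name FormattedFileWritableExample, its helper methods, and any path passed in are hypothetical, and the sketch assumes the Crunch and Hadoop jars are on the classpath.

```java
import org.apache.crunch.PTable;
import org.apache.crunch.Pipeline;
import org.apache.crunch.TableSource;
import org.apache.crunch.io.From;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.lib.input.KeyValueTextInputFormat;

// Hypothetical helper class, for illustration only.
public class FormattedFileWritableExample {

  // Builds a TableSource over tab-separated key/value text files.
  // KeyValueTextInputFormat extends FileInputFormat<Text, Text>, so the
  // key and value classes passed here must both be Text.
  static TableSource<Text, Text> keyValueSource(String pathName) {
    return From.formattedFile(
        pathName, KeyValueTextInputFormat.class, Text.class, Text.class);
  }

  // Reads the source into a PTable using a caller-supplied Pipeline.
  static PTable<Text, Text> read(Pipeline pipeline, String pathName) {
    return pipeline.read(keyValueSource(pathName));
  }
}
```

Only the source object is constructed here; no data should be read until a pipeline that uses it is run.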

formattedFile

public static <K extends org.apache.hadoop.io.Writable,V extends org.apache.hadoop.io.Writable> TableSource<K,V> formattedFile(org.apache.hadoop.fs.Path path,
                                                                                                                               Class<? extends org.apache.hadoop.mapreduce.lib.input.FileInputFormat<K,V>> formatClass,
                                                                                                                               Class<K> keyClass,
                                                                                                                               Class<V> valueClass)
Creates a TableSource<K, V> for reading data from files that have custom FileInputFormat<K, V> implementations not covered by the provided TableSource and Source factory methods.

Parameters:
path - The Path to the data
formatClass - The FileInputFormat implementation
keyClass - The Writable to use for the key
valueClass - The Writable to use for the value
Returns:
A new TableSource<K, V> instance

formattedFile

public static <K extends org.apache.hadoop.io.Writable,V extends org.apache.hadoop.io.Writable> TableSource<K,V> formattedFile(List<org.apache.hadoop.fs.Path> paths,
                                                                                                                               Class<? extends org.apache.hadoop.mapreduce.lib.input.FileInputFormat<K,V>> formatClass,
                                                                                                                               Class<K> keyClass,
                                                                                                                               Class<V> valueClass)
Creates a TableSource<K, V> for reading data from files that have custom FileInputFormat<K, V> implementations not covered by the provided TableSource and Source factory methods.

Parameters:
paths - A list of Paths to the data
formatClass - The FileInputFormat implementation
keyClass - The Writable to use for the key
valueClass - The Writable to use for the value
Returns:
A new TableSource<K, V> instance

formattedFile

public static <K,V> TableSource<K,V> formattedFile(String pathName,
                                                   Class<? extends org.apache.hadoop.mapreduce.lib.input.FileInputFormat<?,?>> formatClass,
                                                   PType<K> keyType,
                                                   PType<V> valueType)
Creates a TableSource<K, V> for reading data from files that have custom FileInputFormat implementations not covered by the provided TableSource and Source factory methods.

Parameters:
pathName - The name of the path to the data on the filesystem
formatClass - The FileInputFormat implementation
keyType - The PType to use for the key
valueType - The PType to use for the value
Returns:
A new TableSource<K, V> instance
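
The PType-based overload can be sketched in the same way. The example below assumes that the chosen input format produces Text keys and values, which Crunch's Writables.strings() type maps to String; the class name FormattedFilePTypeExample and the path are hypothetical.

```java
import org.apache.crunch.TableSource;
import org.apache.crunch.io.From;
import org.apache.crunch.types.writable.Writables;
import org.apache.hadoop.mapreduce.lib.input.KeyValueTextInputFormat;

// Hypothetical helper class, for illustration only.
public class FormattedFilePTypeExample {

  // Uses PTypes instead of Writable classes, so the resulting table
  // exposes plain Java Strings rather than Hadoop Text objects.
  // Assumption: the PTypes given must be compatible with what the
  // input format actually emits (Text keys and Text values here).
  static TableSource<String, String> keyValueStrings(String pathName) {
    return From.formattedFile(
        pathName,
        KeyValueTextInputFormat.class,
        Writables.strings(),
        Writables.strings());
  }
}
```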

formattedFile

public static <K,V> TableSource<K,V> formattedFile(org.apache.hadoop.fs.Path path,
                                                   Class<? extends org.apache.hadoop.mapreduce.lib.input.FileInputFormat<?,?>> formatClass,
                                                   PType<K> keyType,
                                                   PType<V> valueType)
Creates a TableSource<K, V> for reading data from files that have custom FileInputFormat implementations not covered by the provided TableSource and Source factory methods.

Parameters:
path - The Path to the data
formatClass - The FileInputFormat implementation
keyType - The PType to use for the key
valueType - The PType to use for the value
Returns:
A new TableSource<K, V> instance

formattedFile

public static <K,V> TableSource<K,V> formattedFile(List<org.apache.hadoop.fs.Path> paths,
                                                   Class<? extends org.apache.hadoop.mapreduce.lib.input.FileInputFormat<?,?>> formatClass,
                                                   PType<K> keyType,
                                                   PType<V> valueType)
Creates a TableSource<K, V> for reading data from files that have custom FileInputFormat implementations not covered by the provided TableSource and Source factory methods.

Parameters:
paths - A list of Paths to the data
formatClass - The FileInputFormat implementation
keyType - The PType to use for the key
valueType - The PType to use for the value
Returns:
A new TableSource<K, V> instance

avroFile

public static <T extends org.apache.avro.specific.SpecificRecord> Source<T> avroFile(String pathName,
                                                                                     Class<T> avroClass)
Creates a Source<T> instance from the Avro file(s) at the given path name.

Parameters:
pathName - The name of the path to the data on the filesystem
avroClass - The subclass of SpecificRecord to use for the Avro file
Returns:
A new Source<T> instance

avroFile

public static <T extends org.apache.avro.specific.SpecificRecord> Source<T> avroFile(org.apache.hadoop.fs.Path path,
                                                                                     Class<T> avroClass)
Creates a Source<T> instance from the Avro file(s) at the given Path.

Parameters:
path - The Path to the data
avroClass - The subclass of SpecificRecord to use for the Avro file
Returns:
A new Source<T> instance

avroFile

public static <T extends org.apache.avro.specific.SpecificRecord> Source<T> avroFile(List<org.apache.hadoop.fs.Path> paths,
                                                                                     Class<T> avroClass)
Creates a Source<T> instance from the Avro file(s) at the given Paths.

Parameters:
paths - A list of Paths to the data
avroClass - The subclass of SpecificRecord to use for the Avro file
Returns:
A new Source<T> instance

avroFile

public static <T> Source<T> avroFile(String pathName,
                                     PType<T> ptype)
Creates a Source<T> instance from the Avro file(s) at the given path name.

Parameters:
pathName - The name of the path to the data on the filesystem
ptype - The AvroType for the Avro records
Returns:
A new Source<T> instance
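
A minimal sketch of the PType-based overload follows, using Avros.generics to build the type from an explicit schema rather than a generated SpecificRecord class. The schema, the class name AvroGenericSourceExample, and the path are made up for illustration.

```java
import org.apache.avro.Schema;
import org.apache.avro.generic.GenericData;
import org.apache.crunch.Source;
import org.apache.crunch.io.From;
import org.apache.crunch.types.avro.Avros;

// Hypothetical helper class, for illustration only.
public class AvroGenericSourceExample {

  // Hypothetical record schema; replace with the schema of the real data.
  static final String USER_SCHEMA_JSON =
      "{\"type\":\"record\",\"name\":\"User\",\"fields\":["
          + "{\"name\":\"name\",\"type\":\"string\"}]}";

  // Builds a Source of generic Avro records whose reader schema is
  // supplied up front instead of being discovered from the file.
  static Source<GenericData.Record> users(String pathName) {
    Schema schema = new Schema.Parser().parse(USER_SCHEMA_JSON);
    return From.avroFile(pathName, Avros.generics(schema));
  }
}
```

Supplying the schema explicitly avoids touching the filesystem when the source is created, in contrast to the single-argument avroFile overloads that read the schema from the file itself.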

avroFile

public static <T> Source<T> avroFile(org.apache.hadoop.fs.Path path,
                                     PType<T> ptype)
Creates a Source<T> instance from the Avro file(s) at the given Path.

Parameters:
path - The Path to the data
ptype - The AvroType for the Avro records
Returns:
A new Source<T> instance

avroFile

public static <T> Source<T> avroFile(List<org.apache.hadoop.fs.Path> paths,
                                     PType<T> ptype)
Creates a Source<T> instance from the Avro file(s) at the given Paths.

Parameters:
paths - A list of Paths to the data
ptype - The PType for the Avro records
Returns:
A new Source<T> instance

avroFile

public static Source<org.apache.avro.generic.GenericData.Record> avroFile(String pathName)
Creates a Source<GenericData.Record> by reading the schema of the Avro file at the given path. If the path is a directory, the schema of a file in the directory will be used to determine the schema to use.

Parameters:
pathName - The name of the path to the data on the filesystem
Returns:
A new Source<GenericData.Record> instance

avroFile

public static Source<org.apache.avro.generic.GenericData.Record> avroFile(org.apache.hadoop.fs.Path path)
Creates a Source<GenericData.Record> by reading the schema of the Avro file at the given path. If the path is a directory, the schema of a file in the directory will be used to determine the schema to use.

Parameters:
path - The path to the data on the filesystem
Returns:
A new Source<GenericData.Record> instance

avroFile

public static Source<org.apache.avro.generic.GenericData.Record> avroFile(List<org.apache.hadoop.fs.Path> paths)
Creates a Source<GenericData.Record> by reading the schema of the Avro file at the given paths. If the first path is a directory, the schema of a file in the directory will be used to determine the schema to use.

Parameters:
paths - A list of paths to the data on the filesystem
Returns:
A new Source<GenericData.Record> instance

avroFile

public static Source<org.apache.avro.generic.GenericData.Record> avroFile(org.apache.hadoop.fs.Path path,
                                                                          org.apache.hadoop.conf.Configuration conf)
Creates a Source<GenericData.Record> by reading the schema of the Avro file at the given path using the FileSystem information contained in the given Configuration instance. If the path is a directory, the schema of a file in the directory will be used to determine the schema to use.

Parameters:
path - The path to the data on the filesystem
conf - The configuration information
Returns:
A new Source<GenericData.Record> instance

avroFile

public static Source<org.apache.avro.generic.GenericData.Record> avroFile(List<org.apache.hadoop.fs.Path> paths,
                                                                          org.apache.hadoop.conf.Configuration conf)
Creates a Source<GenericData.Record> by reading the schema of the Avro file at the given paths using the FileSystem information contained in the given Configuration instance. If the first path is a directory, the schema of a file in the directory will be used to determine the schema to use.

Parameters:
paths - A list of paths to the data on the filesystem
conf - The configuration information
Returns:
A new Source<GenericData.Record> instance

sequenceFile

public static <T extends org.apache.hadoop.io.Writable> Source<T> sequenceFile(String pathName,
                                                                               Class<T> valueClass)
Creates a Source<T> instance that reads the value field of each key-value pair in the SequenceFile(s) at the given path name.

Parameters:
pathName - The name of the path to the data on the filesystem
valueClass - The Writable type for the value of the SequenceFile entry
Returns:
A new Source<T> instance
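
The value-only form might be used as sketched below when the keys of a SequenceFile are not needed. The class name SequenceFileValuesExample and the choice of Text as the value type are assumptions for illustration.

```java
import org.apache.crunch.Source;
import org.apache.crunch.io.From;
import org.apache.hadoop.io.Text;

// Hypothetical helper class, for illustration only.
public class SequenceFileValuesExample {

  // Builds a Source that exposes only the value of each entry.
  // Assumption: the SequenceFile values were written as Text.
  static Source<Text> values(String pathName) {
    return From.sequenceFile(pathName, Text.class);
  }
}
```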

sequenceFile

public static <T extends org.apache.hadoop.io.Writable> Source<T> sequenceFile(org.apache.hadoop.fs.Path path,
                                                                               Class<T> valueClass)
Creates a Source<T> instance that reads the value field of each key-value pair in the SequenceFile(s) at the given Path.

Parameters:
path - The Path to the data
valueClass - The Writable type for the value of the SequenceFile entry
Returns:
A new Source<T> instance

sequenceFile

public static <T extends org.apache.hadoop.io.Writable> Source<T> sequenceFile(List<org.apache.hadoop.fs.Path> paths,
                                                                               Class<T> valueClass)
Creates a Source<T> instance that reads the value field of each key-value pair in the SequenceFile(s) at the given Paths.

Parameters:
paths - A list of Paths to the data
valueClass - The Writable type for the value of the SequenceFile entry
Returns:
A new Source<T> instance

sequenceFile

public static <T> Source<T> sequenceFile(String pathName,
                                         PType<T> ptype)
Creates a Source<T> instance that reads the value field of each key-value pair in the SequenceFile(s) at the given path name.

Parameters:
pathName - The name of the path to the data on the filesystem
ptype - The PType for the value of the SequenceFile entry
Returns:
A new Source<T> instance

sequenceFile

public static <T> Source<T> sequenceFile(org.apache.hadoop.fs.Path path,
                                         PType<T> ptype)
Creates a Source<T> instance that reads the value field of each key-value pair in the SequenceFile(s) at the given Path.

Parameters:
path - The Path to the data
ptype - The PType for the value of the SequenceFile entry
Returns:
A new Source<T> instance

sequenceFile

public static <T> Source<T> sequenceFile(List<org.apache.hadoop.fs.Path> paths,
                                         PType<T> ptype)
Creates a Source<T> instance that reads the value field of each key-value pair in the SequenceFile(s) at the given Paths.

Parameters:
paths - A list of Paths to the data
ptype - The PType for the value of the SequenceFile entry
Returns:
A new Source<T> instance

sequenceFile

public static <K extends org.apache.hadoop.io.Writable,V extends org.apache.hadoop.io.Writable> TableSource<K,V> sequenceFile(String pathName,
                                                                                                                              Class<K> keyClass,
                                                                                                                              Class<V> valueClass)
Creates a TableSource<K, V> instance for the SequenceFile(s) at the given path name.

Parameters:
pathName - The name of the path to the data on the filesystem
keyClass - The Writable subclass for the key of the SequenceFile entry
valueClass - The Writable subclass for the value of the SequenceFile entry
Returns:
A new TableSource<K, V> instance
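
When both keys and values are needed, the table form could be used roughly as follows. The class name SequenceFileTableExample and the LongWritable/Text key and value types are illustrative assumptions; they must match how the file was written.

```java
import org.apache.crunch.PTable;
import org.apache.crunch.Pipeline;
import org.apache.crunch.TableSource;
import org.apache.crunch.io.From;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;

// Hypothetical helper class, for illustration only.
public class SequenceFileTableExample {

  // Builds a TableSource over SequenceFile entries, keeping both the
  // key and the value of each entry as Hadoop Writables.
  static TableSource<LongWritable, Text> entries(String pathName) {
    return From.sequenceFile(pathName, LongWritable.class, Text.class);
  }

  // Reads the entries into a PTable using a caller-supplied Pipeline.
  static PTable<LongWritable, Text> read(Pipeline pipeline, String pathName) {
    return pipeline.read(entries(pathName));
  }
}
```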

sequenceFile

public static <K extends org.apache.hadoop.io.Writable,V extends org.apache.hadoop.io.Writable> TableSource<K,V> sequenceFile(org.apache.hadoop.fs.Path path,
                                                                                                                              Class<K> keyClass,
                                                                                                                              Class<V> valueClass)
Creates a TableSource<K, V> instance for the SequenceFile(s) at the given Path.

Parameters:
path - The Path to the data
keyClass - The Writable subclass for the key of the SequenceFile entry
valueClass - The Writable subclass for the value of the SequenceFile entry
Returns:
A new TableSource<K, V> instance

sequenceFile

public static <K extends org.apache.hadoop.io.Writable,V extends org.apache.hadoop.io.Writable> TableSource<K,V> sequenceFile(List<org.apache.hadoop.fs.Path> paths,
                                                                                                                              Class<K> keyClass,
                                                                                                                              Class<V> valueClass)
Creates a TableSource<K, V> instance for the SequenceFile(s) at the given Paths.

Parameters:
paths - A list of Paths to the data
keyClass - The Writable subclass for the key of the SequenceFile entry
valueClass - The Writable subclass for the value of the SequenceFile entry
Returns:
A new TableSource<K, V> instance

sequenceFile

public static <K,V> TableSource<K,V> sequenceFile(String pathName,
                                                  PType<K> keyType,
                                                  PType<V> valueType)
Creates a TableSource<K, V> instance for the SequenceFile(s) at the given path name.

Parameters:
pathName - The name of the path to the data on the filesystem
keyType - The PType for the key of the SequenceFile entry
valueType - The PType for the value of the SequenceFile entry
Returns:
A new TableSource<K, V> instance
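
With PTypes, the same SequenceFile data can be surfaced as ordinary Java types. The sketch below assumes the file holds LongWritable keys and Text values, which Writables.longs() and Writables.strings() map to Long and String; the class name SequenceFilePTypeExample is hypothetical.

```java
import org.apache.crunch.TableSource;
import org.apache.crunch.io.From;
import org.apache.crunch.types.writable.Writables;

// Hypothetical helper class, for illustration only.
public class SequenceFilePTypeExample {

  // Builds a TableSource whose keys and values are converted from
  // Writables to Long and String by the supplied PTypes.
  // Assumption: the file was written with LongWritable keys and Text values.
  static TableSource<Long, String> entries(String pathName) {
    return From.sequenceFile(pathName, Writables.longs(), Writables.strings());
  }
}
```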

sequenceFile

public static <K,V> TableSource<K,V> sequenceFile(org.apache.hadoop.fs.Path path,
                                                  PType<K> keyType,
                                                  PType<V> valueType)
Creates a TableSource<K, V> instance for the SequenceFile(s) at the given Path.

Parameters:
path - The Path to the data
keyType - The PType for the key of the SequenceFile entry
valueType - The PType for the value of the SequenceFile entry
Returns:
A new TableSource<K, V> instance

sequenceFile

public static <K,V> TableSource<K,V> sequenceFile(List<org.apache.hadoop.fs.Path> paths,
                                                  PType<K> keyType,
                                                  PType<V> valueType)
Creates a TableSource<K, V> instance for the SequenceFile(s) at the given Paths.

Parameters:
paths - A list of Paths to the data
keyType - The PType for the key of the SequenceFile entry
valueType - The PType for the value of the SequenceFile entry
Returns:
A new TableSource<K, V> instance

textFile

public static Source<String> textFile(String pathName)
Creates a Source<String> instance for the text file(s) at the given path name.

Parameters:
pathName - The name of the path to the data on the filesystem
Returns:
A new Source<String> instance

textFile

public static Source<String> textFile(org.apache.hadoop.fs.Path path)
Creates a Source<String> instance for the text file(s) at the given Path.

Parameters:
path - The Path to the data
Returns:
A new Source<String> instance

textFile

public static Source<String> textFile(List<org.apache.hadoop.fs.Path> paths)
Creates a Source<String> instance for the text file(s) at the given Paths.

Parameters:
paths - A list of Paths to the data
Returns:
A new Source<String> instance
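
The list-based overload is convenient when input is spread across several locations. A small sketch, with the class name TextFileListExample and the varargs helper being hypothetical conveniences rather than part of the Crunch API:

```java
import java.util.ArrayList;
import java.util.List;

import org.apache.crunch.Source;
import org.apache.crunch.io.From;
import org.apache.hadoop.fs.Path;

// Hypothetical helper class, for illustration only.
public class TextFileListExample {

  // Converts each path name to a Hadoop Path and builds a single
  // Source<String> that covers all of them.
  static Source<String> lines(String... pathNames) {
    List<Path> paths = new ArrayList<Path>();
    for (String name : pathNames) {
      paths.add(new Path(name));
    }
    return From.textFile(paths);
  }
}
```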

textFile

public static <T> Source<T> textFile(String pathName,
                                     PType<T> ptype)
Creates a Source<T> instance for the text file(s) at the given path name using the provided PType<T> to convert the input text.

Parameters:
pathName - The name of the path to the data on the filesystem
ptype - The PType<T> to use to process the input text
Returns:
A new Source<T> instance
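
One plausible use of the PType argument is to pick the type family for the lines that are read, for example so that later stages of the pipeline work with Avro-based types. The sketch below passes Avros.strings(); the class name TextFilePTypeExample is hypothetical, and the exact conversion behavior for other PTypes is not covered here.

```java
import org.apache.crunch.Source;
import org.apache.crunch.io.From;
import org.apache.crunch.types.avro.Avros;

// Hypothetical helper class, for illustration only.
public class TextFilePTypeExample {

  // Builds a text source whose elements are typed with the Avro
  // string PType instead of the default used by textFile(String).
  static Source<String> avroTypedLines(String pathName) {
    return From.textFile(pathName, Avros.strings());
  }
}
```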

textFile

public static <T> Source<T> textFile(org.apache.hadoop.fs.Path path,
                                     PType<T> ptype)
Creates a Source<T> instance for the text file(s) at the given Path using the provided PType<T> to convert the input text.

Parameters:
path - The Path to the data
ptype - The PType<T> to use to process the input text
Returns:
A new Source<T> instance

textFile

public static <T> Source<T> textFile(List<org.apache.hadoop.fs.Path> paths,
                                     PType<T> ptype)
Creates a Source<T> instance for the text file(s) at the given Paths using the provided PType<T> to convert the input text.

Parameters:
paths - A list of Paths to the data
ptype - The PType<T> to use to process the input text
Returns:
A new Source<T> instance


Copyright © 2014 The Apache Software Foundation. All Rights Reserved.