This project has retired. For details, please refer to its Attic page.
NLineFileSource (Apache Crunch 0.9.0 API)
org.apache.crunch.io.text
Class NLineFileSource<T>
java.lang.Object
org.apache.crunch.io.impl.FileSourceImpl<T>
org.apache.crunch.io.text.NLineFileSource<T>
All Implemented Interfaces: ReadableSource<T>, Source<T>
public class NLineFileSource<T> extends FileSourceImpl<T> implements ReadableSource<T>
A Source instance that uses the NLineInputFormat, which gives each map
task a fixed number of lines from a text file as input (set by linesPerTask).
Most useful when running simulations on Hadoop, where each line holds the
configuration for one simulation run.
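The grouping described above can be illustrated with a minimal, in-memory sketch. It is not Crunch or Hadoop code: the class NLineSplitSketch and the method splitByLines are hypothetical names introduced here, and the real NLineInputFormat computes its splits from files on the file system rather than from a List. The sketch only shows how a linesPerTask value partitions the input lines into per-task groups.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; not part of the Crunch or Hadoop APIs.
public class NLineSplitSketch {

    // Groups lines into consecutive chunks of at most linesPerTask lines,
    // one chunk per (hypothetical) map task. The last chunk may be shorter.
    static List<List<String>> splitByLines(List<String> lines, int linesPerTask) {
        if (linesPerTask <= 0) {
            throw new IllegalArgumentException("linesPerTask must be positive");
        }
        List<List<String>> tasks = new ArrayList<>();
        for (int start = 0; start < lines.size(); start += linesPerTask) {
            int end = Math.min(start + linesPerTask, lines.size());
            tasks.add(new ArrayList<>(lines.subList(start, end)));
        }
        return tasks;
    }

    public static void main(String[] args) {
        // Each line stands for the configuration of one simulation run.
        List<String> configs = Arrays.asList(
            "seed=1", "seed=2", "seed=3", "seed=4", "seed=5");
        List<List<String>> tasks = splitByLines(configs, 2);
        for (int i = 0; i < tasks.size(); i++) {
            System.out.println("task " + i + ": " + tasks.get(i));
        }
    }
}
```

With five input lines and two lines per task, the sketch produces three groups, the last of which holds a single line. A smaller linesPerTask therefore means more, shorter-lived tasks, which suits inputs where each line triggers an expensive computation.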
Constructor Summary
NLineFileSource(List<org.apache.hadoop.fs.Path> paths, PType<T> ptype, int linesPerTask)
Create a new NLineFileSource instance.
NLineFileSource(org.apache.hadoop.fs.Path path, PType<T> ptype, int linesPerTask)
Create a new NLineFileSource instance.
NLineFileSource(String path, PType<T> ptype, int linesPerTask)
Create a new NLineFileSource instance.
Methods inherited from class org.apache.crunch.io.impl.FileSourceImpl
configureSource, equals, getBundle, getConverter, getLastModifiedAt, getPath, getPaths, getSize, getType, hashCode, inputConf, pathsAsString, read
NLineFileSource
public NLineFileSource(String path,
PType<T> ptype,
int linesPerTask)
Create a new NLineFileSource instance.
Parameters:
path - The path to the input data, as a String
ptype - The PType to use for processing the data
linesPerTask - The number of lines from the input each map task will process
NLineFileSource
public NLineFileSource(org.apache.hadoop.fs.Path path,
PType<T> ptype,
int linesPerTask)
Create a new NLineFileSource instance.
Parameters:
path - The Path to the input data
ptype - The PType to use for processing the data
linesPerTask - The number of lines from the input each map task will process
NLineFileSource
public NLineFileSource(List<org.apache.hadoop.fs.Path> paths,
PType<T> ptype,
int linesPerTask)
Create a new NLineFileSource instance.
Parameters:
paths - The Paths to the input data
ptype - The PType to use for processing the data
linesPerTask - The number of lines from the input each map task will process
toString
public String toString()
Overrides: toString in class FileSourceImpl<T>
read
public Iterable<T> read(org.apache.hadoop.conf.Configuration conf)
throws IOException
Description copied from interface: ReadableSource
Returns an Iterable that contains the contents of this source.
Specified by: read in interface ReadableSource<T>
Parameters: conf - The current Configuration instance
Returns: the contents of this Source as an Iterable instance
Throws:
IOException
asReadable
public ReadableData<T> asReadable()
Specified by: asReadable in interface ReadableSource<T>
Returns: a ReadableData instance containing the data referenced by this
ReadableSource.
Copyright © 2014 The Apache Software Foundation. All Rights Reserved.