This project has retired. For details please refer to its Attic page.
NLineFileSource (Apache Crunch 0.9.0 API)
org.apache.crunch.io.text
Class NLineFileSource<T>
java.lang.Object
    org.apache.crunch.io.impl.FileSourceImpl<T>
        org.apache.crunch.io.text.NLineFileSource<T>
All Implemented Interfaces: ReadableSource<T>, Source<T>
public class NLineFileSource<T> extends FileSourceImpl<T> implements ReadableSource<T>
A Source instance that uses the NLineInputFormat, which gives each map task a fraction of the lines in a text file as input. Most useful when running simulations on Hadoop, where each line holds the configuration information for one simulation run.
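To make the grouping concrete, the following is a minimal sketch in plain Java of how consecutive lines end up in per-task chunks. It does not call Crunch or Hadoop: the class NLineGroupingSketch, its group method, and the sample lines are hypothetical names invented for illustration, and the real split logic lives in NLineInputFormat.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of Crunch or Hadoop: it only mimics how
// consecutive lines are grouped into per-task chunks.
class NLineGroupingSketch {

    // Groups consecutive lines into chunks of at most linesPerTask lines.
    static List<List<String>> group(List<String> lines, int linesPerTask) {
        if (linesPerTask <= 0) {
            throw new IllegalArgumentException("linesPerTask must be positive");
        }
        List<List<String>> tasks = new ArrayList<>();
        for (int start = 0; start < lines.size(); start += linesPerTask) {
            int end = Math.min(start + linesPerTask, lines.size());
            tasks.add(new ArrayList<>(lines.subList(start, end)));
        }
        return tasks;
    }

    public static void main(String[] args) {
        // Five simulation configurations, one per line (made-up values).
        List<String> lines = Arrays.asList(
            "seed=1", "seed=2", "seed=3", "seed=4", "seed=5");
        List<List<String>> tasks = group(lines, 2);
        for (int i = 0; i < tasks.size(); i++) {
            System.out.println("task " + i + ": " + tasks.get(i));
        }
    }
}
```

With five lines and two lines per task, the sketch produces three chunks, and the last chunk holds the single leftover line.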
Constructor Summary
NLineFileSource(List<org.apache.hadoop.fs.Path> paths, PType<T> ptype, int linesPerTask)
    Create a new NLineFileSource instance.
NLineFileSource(org.apache.hadoop.fs.Path path, PType<T> ptype, int linesPerTask)
    Create a new NLineFileSource instance.
NLineFileSource(String path, PType<T> ptype, int linesPerTask)
    Create a new NLineFileSource instance.
Methods inherited from class org.apache.crunch.io.impl.FileSourceImpl
configureSource, equals, getBundle, getConverter, getLastModifiedAt, getPath, getPaths, getSize, getType, hashCode, inputConf, pathsAsString, read
NLineFileSource
public NLineFileSource(String path, PType<T> ptype, int linesPerTask)
Create a new NLineFileSource instance.
Parameters:
    path - The path to the input data, as a String
    ptype - The PType to use for processing the data
    linesPerTask - The number of lines from the input each map task will process
NLineFileSource
public NLineFileSource(org.apache.hadoop.fs.Path path, PType<T> ptype, int linesPerTask)
Create a new NLineFileSource instance.
Parameters:
    path - The Path to the input data
    ptype - The PType to use for processing the data
    linesPerTask - The number of lines from the input each map task will process
NLineFileSource
public NLineFileSource(List<org.apache.hadoop.fs.Path> paths, PType<T> ptype, int linesPerTask)
Create a new NLineFileSource instance.
Parameters:
    paths - The Paths to the input data
    ptype - The PType to use for processing the data
    linesPerTask - The number of lines from the input each map task will process
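When choosing a value for linesPerTask, it can help to estimate how many map tasks a given input would produce. The sketch below is plain Java arithmetic, not a Crunch or Hadoop call. It assumes that each input file is divided independently, so a task never mixes lines from two files. That assumption is not stated on this page and should be checked against the NLineInputFormat documentation. The class TaskCountSketch and its methods are hypothetical names.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical estimator, not part of Crunch or Hadoop.
class TaskCountSketch {

    // Number of chunks needed for one file: ceil(lineCount / linesPerTask),
    // computed with integer arithmetic.
    static long tasksForFile(long lineCount, int linesPerTask) {
        if (linesPerTask <= 0) {
            throw new IllegalArgumentException("linesPerTask must be positive");
        }
        return (lineCount + linesPerTask - 1) / linesPerTask;
    }

    // Assumption: files are divided independently, so the per-file counts add up.
    static long tasksForFiles(List<Long> lineCounts, int linesPerTask) {
        long total = 0;
        for (long count : lineCounts) {
            total += tasksForFile(count, linesPerTask);
        }
        return total;
    }

    public static void main(String[] args) {
        // Made-up line counts for three input files.
        List<Long> lineCounts = Arrays.asList(10L, 3L, 0L);
        System.out.println(tasksForFiles(lineCounts, 4));
    }
}
```

Under that assumption, files of 10, 3, and 0 lines with linesPerTask set to 4 give 3 + 1 + 0 = 4 tasks. A smaller linesPerTask spreads the work over more tasks at the cost of more per-task overhead.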
toString
public String toString()
Overrides: toString in class FileSourceImpl<T>
read
public Iterable<T> read(org.apache.hadoop.conf.Configuration conf) throws IOException
Description copied from interface: ReadableSource
Returns an Iterable that contains the contents of this source.
Specified by: read in interface ReadableSource<T>
Parameters:
    conf - The current Configuration instance
Returns: the contents of this Source as an Iterable instance
Throws: IOException
asReadable
public ReadableData<T> asReadable()
Specified by: asReadable in interface ReadableSource<T>
Returns: a ReadableData instance containing the data referenced by this ReadableSource.
Copyright © 2014 The Apache Software Foundation. All Rights Reserved.