org.apache.crunch
Interface Source<T>

All Known Subinterfaces:
ReadableSource<T>, ReadableSourceTarget<T>, SourceTarget<T>, TableSource<K,V>
All Known Implementing Classes:
AvroFileSource, AvroFileSourceTarget, FileSourceImpl, FileTableSourceImpl, HBaseSourceTarget, ReadableSourcePathTargetImpl, ReadableSourceTargetImpl, SeqFileSource, SeqFileSourceTarget, SeqFileTableSource, SeqFileTableSourceTarget, SourcePathTargetImpl, SourceTargetImpl, TableSourcePathTargetImpl, TableSourceTargetImpl, TextFileSource, TextFileSourceTarget

public interface Source<T>

A Source represents a data set that serves as an input to one or more MapReduce jobs.


Method Summary
 void configureSource(org.apache.hadoop.mapreduce.Job job, int inputId)
          Configures the given job to use this source as an input.
 long getSize(org.apache.hadoop.conf.Configuration configuration)
          Returns the number of bytes in this Source.
 PType<T> getType()
          Returns the PType for this source.
 

Method Detail

getType

PType<T> getType()
Returns the PType for this source.


configureSource

void configureSource(org.apache.hadoop.mapreduce.Job job,
                     int inputId)
                     throws IOException
Configures the given job to use this source as an input.

Parameters:
job - The job to configure
inputId - For a multi-input job, an identifier for this input to the job
Throws:
IOException

getSize

long getSize(org.apache.hadoop.conf.Configuration configuration)
Returns the number of bytes in this Source.
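
Example (illustrative sketch only): the code below shows how the three methods of this interface might fit together in an implementation. So that it compiles without Crunch or Hadoop on the classpath, it declares local stand-ins named Configuration, Job, PType and Source inside a hypothetical SourceSketch class; these are simplified placeholders, not the real library types. InMemoryLineSource, its label field, the "inputId:label" bookkeeping, and the choice to reject a negative inputId are likewise assumptions made for illustration and are not part of the Crunch API.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Self-contained sketch; every nested type here is a stand-in, not a real Crunch or Hadoop class. */
final class SourceSketch {

    /** Stand-in for org.apache.hadoop.conf.Configuration (hypothetical, intentionally empty). */
    static final class Configuration {}

    /** Stand-in for org.apache.hadoop.mapreduce.Job; it only records which inputs were registered. */
    static final class Job {
        final List<String> inputs = new ArrayList<>();
    }

    /** Stand-in for Crunch's PType<T>; it carries only a descriptive name. */
    static final class PType<T> {
        final String name;
        PType(String name) { this.name = name; }
    }

    /** Mirrors the three methods documented on this page. */
    interface Source<T> {
        PType<T> getType();
        void configureSource(Job job, int inputId) throws IOException;
        long getSize(Configuration configuration);
    }

    /** Hypothetical source backed by an in-memory list of text lines. */
    static final class InMemoryLineSource implements Source<String> {
        private final String label;
        private final List<String> lines;

        InMemoryLineSource(String label, List<String> lines) {
            this.label = label;
            this.lines = new ArrayList<>(lines);
        }

        @Override
        public PType<String> getType() {
            return new PType<>("string");
        }

        @Override
        public void configureSource(Job job, int inputId) throws IOException {
            // Assumption for this sketch: a negative identifier is treated as an error.
            if (inputId < 0) {
                throw new IOException("inputId must be non-negative: " + inputId);
            }
            // Register this source under its identifier so a multi-input job can tell inputs apart.
            job.inputs.add(inputId + ":" + label);
        }

        @Override
        public long getSize(Configuration configuration) {
            // Size in bytes: the UTF-8 length of each line plus one byte for its newline.
            long bytes = 0;
            for (String line : lines) {
                bytes += line.getBytes(StandardCharsets.UTF_8).length + 1;
            }
            return bytes;
        }
    }

    public static void main(String[] args) throws IOException {
        InMemoryLineSource source = new InMemoryLineSource("demo", Arrays.asList("alpha", "beta"));
        Job job = new Job();
        source.configureSource(job, 0);
        System.out.println("registered inputs: " + job.inputs);
        System.out.println("size in bytes: " + source.getSize(new Configuration()));
    }
}
```

In this sketch, getSize is computed from the data itself; an implementation backed by files would more plausibly consult the file system through the supplied Configuration, which is presumably why the real method takes that argument.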



Copyright © 2012 The Apache Software Foundation. All Rights Reserved.