This project has retired. For details please refer to its Attic page.
ReadableData (Apache Crunch 0.10.0 API)

org.apache.crunch
Interface ReadableData<T>

All Superinterfaces:
Serializable
All Known Implementing Classes:
DelegatingReadableData, UnionReadableData

public interface ReadableData<T>
extends Serializable

Represents the contents of a data source that can be read on the cluster from within one of the tasks running as part of a Crunch pipeline.


Method Summary
 void configure(org.apache.hadoop.conf.Configuration conf)
          Allows this instance to add any configuration settings required by the job in which it is launched.
 Set<SourceTarget<?>> getSourceTargets()
          Returns any SourceTarget instances that must exist before the data in this instance can be read.
 Iterable<T> read(org.apache.hadoop.mapreduce.TaskInputOutputContext<?,?,?,?> context)
          Read the data referenced by this instance within the given context.
 

Method Detail

getSourceTargets

Set<SourceTarget<?>> getSourceTargets()
Returns:
Any SourceTarget instances that must exist before the data in this instance can be read. The planner uses this set to sequence job processing.
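
The dependency contract can be illustrated with a small, self-contained sketch. It does not use the Crunch API: `DependencyAware`, `LeafData`, and `CompositeData` are hypothetical stand-ins, and plain strings take the place of SourceTarget instances. It assumes that an instance wrapping several others (as a union might) reports every dependency of its members. This page does not describe how UnionReadableData actually behaves.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for ReadableData that keeps only the dependency method.
// Dependencies are plain strings here instead of SourceTarget<?> instances.
interface DependencyAware {
    Set<String> getSourceTargets();
}

// A leaf that depends on a fixed set of named outputs.
final class LeafData implements DependencyAware {
    private final Set<String> targets;

    LeafData(String... targets) {
        this.targets = new LinkedHashSet<>(Arrays.asList(targets));
    }

    @Override
    public Set<String> getSourceTargets() {
        return targets;
    }
}

// A composite that reports the union of its children's dependencies,
// so a planner would wait for all of them before scheduling a reader.
final class CompositeData implements DependencyAware {
    private final List<DependencyAware> children;

    CompositeData(DependencyAware... children) {
        this.children = Arrays.asList(children);
    }

    @Override
    public Set<String> getSourceTargets() {
        Set<String> all = new LinkedHashSet<>();
        for (DependencyAware child : children) {
            all.addAll(child.getSourceTargets());
        }
        return all;
    }
}
```

Using a set means a dependency shared by two members is reported once, which is all a planner needs in order to decide what must be produced first.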

configure

void configure(org.apache.hadoop.conf.Configuration conf)
Allows this instance to add any configuration settings required by the job in which it is launched.

Parameters:
conf - The Configuration object for the job
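
As a rough illustration of what "additional configuration settings" could look like, the sketch below has an instance record the path it will later read from. It is self-contained and does not use Hadoop: a `Map<String, String>` stands in for the Configuration object, and `PathBackedData` and the property name `example.readable.paths` are hypothetical, not Crunch or Hadoop identifiers.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for ReadableData that keeps only configure().
// A Map<String, String> plays the role of Hadoop's Configuration.
final class PathBackedData {
    // Hypothetical property name, not a real Crunch or Hadoop key.
    static final String PATHS_KEY = "example.readable.paths";

    private final String path;

    PathBackedData(String path) {
        this.path = path;
    }

    // Appends this instance's path to a comma-separated list so that
    // settings added by other instances in the same job are preserved.
    void configure(Map<String, String> conf) {
        String existing = conf.get(PATHS_KEY);
        conf.put(PATHS_KEY, existing == null ? path : existing + "," + path);
    }
}
```

Appending rather than overwriting matters in this sketch because several instances may be configured against the same job settings.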

read

Iterable<T> read(org.apache.hadoop.mapreduce.TaskInputOutputContext<?,?,?,?> context)
                 throws IOException
Read the data referenced by this instance within the given context.

Parameters:
context - The context of the task that is reading the data
Returns:
An iterable reference to the data in this instance
Throws:
IOException - If the data cannot be read
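
The read contract, including its checked exception, can be sketched without Hadoop or Crunch on the classpath. In the example below, `LineFileData` and `ReadExample` are hypothetical stand-ins: the data is a local text file, and an `Object` parameter takes the place of the TaskInputOutputContext. The caller loads the data once and converts a read failure into an unchecked exception, which is one reasonable choice for a task that cannot proceed without the data.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for ReadableData<String>, reduced to read().
final class LineFileData {
    private final Path file;

    LineFileData(Path file) {
        this.file = file;
    }

    // The context parameter is an Object placeholder for TaskInputOutputContext.
    Iterable<String> read(Object context) throws IOException {
        return Files.readAllLines(file, StandardCharsets.UTF_8);
    }
}

final class ReadExample {
    // Mimics a task's setup step: load the data once, surfacing read failures.
    static List<String> loadAll(LineFileData data) {
        List<String> loaded = new ArrayList<>();
        try {
            for (String line : data.read(null)) {
                loaded.add(line);
            }
        } catch (IOException e) {
            throw new UncheckedIOException("Could not read the referenced data", e);
        }
        return loaded;
    }

    // Writes the given lines to a temporary file and reads them back.
    static List<String> roundTrip(List<String> lines) {
        try {
            Path tmp = Files.createTempFile("readable-data", ".txt");
            try {
                Files.write(tmp, lines, StandardCharsets.UTF_8);
                return loadAll(new LineFileData(tmp));
            } finally {
                Files.deleteIfExists(tmp);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

For example, `ReadExample.roundTrip(java.util.Arrays.asList("x", "y"))` returns a list containing the same two lines.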


Copyright © 2014 The Apache Software Foundation. All Rights Reserved.