PType (Apache Crunch 0.3.0-incubating API)

org.apache.crunch.types
Interface PType<T>

All Superinterfaces:
Serializable
All Known Subinterfaces:
PTableType<K,V>
All Known Implementing Classes:
AvroGroupedTableType, AvroTableType, AvroType, PGroupedTableType, WritableGroupedTableType, WritableType

public interface PType<T>
extends Serializable

A PType defines a mapping between a data type that is used in a Crunch pipeline and a serialization and storage format that is used to read/write data from/to HDFS. Every PCollection has an associated PType that tells Crunch how to read/write data from that PCollection.
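As a rough illustration of this idea, and not of the Crunch API itself, the sketch below models a type descriptor that pairs a Java class with functions converting values to and from a stored byte form. The names `SimpleType`, `write`, `read`, `strings`, and `ints` are hypothetical and exist only in this example; they are loosely analogous to the role a PType plays and make no claim about how Crunch implements it.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.function.Function;

public class SimpleTypeSketch {

    /** Hypothetical, simplified stand-in for a PType; not part of Crunch. */
    public static final class SimpleType<T> {
        private final Class<T> typeClass;
        private final Function<T, byte[]> toStored;
        private final Function<byte[], T> fromStored;

        public SimpleType(Class<T> typeClass,
                          Function<T, byte[]> toStored,
                          Function<byte[], T> fromStored) {
            this.typeClass = typeClass;
            this.toStored = toStored;
            this.fromStored = fromStored;
        }

        /** The in-memory Java type this descriptor handles. */
        public Class<T> getTypeClass() { return typeClass; }

        /** Converts an in-memory value to its stored form. */
        public byte[] write(T value) { return toStored.apply(value); }

        /** Converts a stored form back to an in-memory value. */
        public T read(byte[] stored) { return fromStored.apply(stored); }
    }

    /** Descriptor for strings, stored as UTF-8 bytes. */
    public static SimpleType<String> strings() {
        return new SimpleType<String>(
            String.class,
            s -> s.getBytes(StandardCharsets.UTF_8),
            b -> new String(b, StandardCharsets.UTF_8));
    }

    /** Descriptor for ints, stored as four big-endian bytes. */
    public static SimpleType<Integer> ints() {
        return new SimpleType<Integer>(
            Integer.class,
            i -> ByteBuffer.allocate(4).putInt(i).array(),
            b -> ByteBuffer.wrap(b).getInt());
    }

    public static void main(String[] args) {
        SimpleType<String> type = strings();
        byte[] stored = type.write("crunch");
        // Round-trip: the value read back equals the value written.
        System.out.println(type.getTypeClass().getSimpleName()
            + " round-trip: " + type.read(stored));
    }
}
```

The point of the sketch is only that one object carries both the Java type and the knowledge of how to serialize it, so code that holds the descriptor can read and write values without knowing the storage format.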


Method Summary
 Converter getConverter()
           
 SourceTarget<T> getDefaultFileSource(org.apache.hadoop.fs.Path path)
          Returns a SourceTarget that is able to read/write data using the serialization format specified by this PType.
 T getDetachedValue(T value)
          Returns a copy of a value (or the value itself) that can safely be retained.
 PTypeFamily getFamily()
          Returns the PTypeFamily that this PType belongs to.
 MapFn<Object,T> getInputMapFn()
           
 MapFn<T,Object> getOutputMapFn()
           
 List<PType> getSubTypes()
          Returns the sub-types that make up this PType if it is a composite instance, such as a tuple.
 Class<T> getTypeClass()
          Returns the Java type represented by this PType.
 

Method Detail

getTypeClass

Class<T> getTypeClass()
Returns the Java type represented by this PType.


getFamily

PTypeFamily getFamily()
Returns the PTypeFamily that this PType belongs to.


getInputMapFn

MapFn<Object,T> getInputMapFn()

getOutputMapFn

MapFn<T,Object> getOutputMapFn()

getConverter

Converter getConverter()

getDetachedValue

T getDetachedValue(T value)
Returns a copy of a value (or the value itself) that can safely be retained.

This is useful when values from an Iterable being processed in a DoFn (via a reducer) need to be retained beyond a single iteration, because a reducer (and therefore any DoFn that takes an Iterable as input) reuses deserialized value objects. More information on object reuse is available in the DoFn class documentation.

Parameters:
value - The value to be deep-copied
Returns:
A deep copy of the input value, or the value itself if it can safely be retained as-is
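The object-reuse pitfall that motivates this method can be reproduced without Crunch. The following is a minimal, self-contained sketch: `Record`, `reusingIterable`, and `retainIds` are hypothetical names invented for this example, and `Record.copy()` merely stands in for the kind of copy that getDetachedValue is described as returning.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

public class DetachedValueSketch {

    /** Hypothetical mutable record, standing in for a deserialized value. */
    public static final class Record {
        public int id;
        public Record(int id) { this.id = id; }
        public Record copy() { return new Record(id); }
    }

    /** Returns an Iterable that hands out ONE shared Record, mutated on each step. */
    public static Iterable<Record> reusingIterable(final int[] ids) {
        return new Iterable<Record>() {
            @Override
            public Iterator<Record> iterator() {
                final Record shared = new Record(0);
                return new Iterator<Record>() {
                    private int pos = 0;

                    @Override
                    public boolean hasNext() { return pos < ids.length; }

                    @Override
                    public Record next() {
                        if (!hasNext()) {
                            throw new NoSuchElementException();
                        }
                        shared.id = ids[pos++]; // overwrite the same instance
                        return shared;
                    }

                    @Override
                    public void remove() { throw new UnsupportedOperationException(); }
                };
            }
        };
    }

    /** Retains every value past its iteration, optionally copying it first. */
    public static List<Integer> retainIds(Iterable<Record> values, boolean detach) {
        List<Record> retained = new ArrayList<Record>();
        for (Record r : values) {
            retained.add(detach ? r.copy() : r);
        }
        List<Integer> ids = new ArrayList<Integer>();
        for (Record r : retained) {
            ids.add(r.id);
        }
        return ids;
    }

    public static void main(String[] args) {
        int[] ids = {1, 2, 3};
        // Without copying, every retained reference points at the shared instance.
        System.out.println(retainIds(reusingIterable(ids), false)); // prints [3, 3, 3]
        // Copying each value before retaining it preserves the distinct values.
        System.out.println(retainIds(reusingIterable(ids), true));  // prints [1, 2, 3]
    }
}
```

In a pipeline, the analogous step would be to pass each value through the PType's getDetachedValue before storing it in a collection that outlives the current iteration.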

getDefaultFileSource

SourceTarget<T> getDefaultFileSource(org.apache.hadoop.fs.Path path)
Returns a SourceTarget that is able to read/write data using the serialization format specified by this PType.


getSubTypes

List<PType> getSubTypes()
Returns the sub-types that make up this PType if it is a composite instance, such as a tuple.



Copyright © 2012 The Apache Software Foundation. All Rights Reserved.