This project has retired. For details please refer to its Attic page.
Extractor (Apache Crunch 0.11.0 API)

org.apache.crunch.contrib.text
Interface Extractor<T>

Type Parameters:
T - The data type to be extracted
All Superinterfaces:
Serializable
All Known Implementing Classes:
AbstractCompositeExtractor, AbstractSimpleExtractor

public interface Extractor<T>
extends Serializable

An interface for extracting a specific data type from a text string that is being processed by a Scanner object.


Method Summary
 boolean errorOnLastRecord()
          Returns true if the last call to extract on this instance threw an exception that was handled.
 T extract(String input)
          Extracts a value of type T from the given input string.
 T getDefaultValue()
          Returns the default value for this Extractor in case of an error.
 PType<T> getPType(PTypeFamily ptf)
          Returns the PType associated with this data type for the given PTypeFamily.
 ExtractorStats getStats()
          Returns statistics about the number of errors this Extractor instance encountered while parsing input data.
 void initialize()
          Performs any initialization this Extractor requires at the start of a map or reduce task.
 

Method Detail

extract

T extract(String input)
Extracts a value of type T from the given input string.
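
As a minimal sketch of what an extract implementation might look like, the following uses a hypothetical stand-in interface, TextExtractor, that declares only this one method, so that the example compiles without Crunch on the classpath. The names TextExtractor, IntTextExtractor, and ExtractSketch are assumptions made for illustration and are not part of the Crunch API.

```java
// Hypothetical stand-in that mirrors only extract(String); not the Crunch interface itself.
interface TextExtractor<T> {
    T extract(String input);
}

// Extracts an Integer from a text field, tolerating surrounding whitespace.
class IntTextExtractor implements TextExtractor<Integer> {
    @Override
    public Integer extract(String input) {
        return Integer.parseInt(input.trim());
    }
}

class ExtractSketch {
    public static void main(String[] args) {
        TextExtractor<Integer> ints = new IntTextExtractor();
        System.out.println(ints.extract(" 42 ") + 1); // prints 43
    }
}
```

This version lets a NumberFormatException escape to the caller. The errorOnLastRecord and getDefaultValue descriptions suggest that implementations of the real interface handle such failures themselves.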


getPType

PType<T> getPType(PTypeFamily ptf)
Returns the PType associated with this data type for the given PTypeFamily.


getDefaultValue

T getDefaultValue()
Returns the default value for this Extractor in case of an error.


initialize

void initialize()
Performs any initialization this Extractor requires at the start of a map or reduce task.
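
Because Extractor extends Serializable, an instance may be serialized before it reaches a task, and any transient state it holds is not carried along. The sketch below illustrates one plausible use of initialize under that assumption: rebuilding a non-serializable helper after deserialization. DigitsExtractor, InitializeSketch, and isInitialized are hypothetical names, and the round trip through Java serialization only simulates how an instance might be shipped to a task.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical extractor; it mirrors only initialize() and extract(String).
class DigitsExtractor implements Serializable {
    private static final long serialVersionUID = 1L;

    private final String regex;
    // Matcher is not Serializable, so it is rebuilt in initialize().
    private transient Matcher matcher;

    DigitsExtractor(String regex) {
        this.regex = regex;
    }

    void initialize() {
        matcher = Pattern.compile(regex).matcher("");
    }

    boolean isInitialized() {
        return matcher != null;
    }

    // Returns the first match in the input, or null if there is none.
    String extract(String input) {
        matcher.reset(input);
        return matcher.find() ? matcher.group() : null;
    }
}

class InitializeSketch {
    // Simulates shipping the extractor to a task via Java serialization.
    static DigitsExtractor roundTrip(DigitsExtractor original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(original);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return (DigitsExtractor) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        DigitsExtractor copy = roundTrip(new DigitsExtractor("\\d+"));
        System.out.println(copy.isInitialized()); // false: transient state was not serialized
        copy.initialize();
        System.out.println(copy.extract("Copyright 2014")); // 2014
    }
}
```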


errorOnLastRecord

boolean errorOnLastRecord()
Returns true if the last call to extract on this instance threw an exception that was handled.
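
The descriptions of getDefaultValue and errorOnLastRecord suggest a pattern in which extract catches a parsing failure, records it, and returns the default value instead of propagating the exception. The following sketch shows that pattern under this assumption. SafeIntExtractor and ErrorFlagSketch are hypothetical classes written for illustration, not Crunch implementations.

```java
// Hypothetical extractor that mirrors extract, getDefaultValue, and errorOnLastRecord.
class SafeIntExtractor {
    private final Integer defaultValue;
    private boolean errorOnLast;

    SafeIntExtractor(Integer defaultValue) {
        this.defaultValue = defaultValue;
    }

    Integer extract(String input) {
        errorOnLast = false; // the flag describes only the most recent call
        try {
            return Integer.parseInt(input.trim());
        } catch (RuntimeException e) {
            // Covers NumberFormatException and a null input.
            errorOnLast = true;
            return defaultValue;
        }
    }

    Integer getDefaultValue() {
        return defaultValue;
    }

    boolean errorOnLastRecord() {
        return errorOnLast;
    }
}

class ErrorFlagSketch {
    public static void main(String[] args) {
        SafeIntExtractor ints = new SafeIntExtractor(-1);
        for (String field : new String[] {"7", "seven"}) {
            Integer value = ints.extract(field);
            System.out.println(value + " error=" + ints.errorOnLastRecord());
        }
        // prints "7 error=false" then "-1 error=true"
    }
}
```

Checking the flag after each call lets a caller tell a legitimately parsed value apart from a default that was substituted after a failure.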


getStats

ExtractorStats getStats()
Returns statistics about the number of errors this Extractor instance encountered while parsing input data.
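
To illustrate how such statistics might be accumulated, the sketch below counts handled failures across a batch of inputs and exposes the total through a small snapshot object. ParseStats is a hypothetical stand-in for ExtractorStats, whose actual fields and accessors are not described on this page. CountingIntExtractor and StatsSketch are likewise assumptions.

```java
// Hypothetical stand-in for ExtractorStats; it carries only an error count.
final class ParseStats {
    private final int errorCount;

    ParseStats(int errorCount) {
        this.errorCount = errorCount;
    }

    int getErrorCount() {
        return errorCount;
    }
}

// Hypothetical extractor that counts every handled parsing failure.
class CountingIntExtractor {
    private int errors;

    Integer extract(String input) {
        try {
            return Integer.parseInt(input.trim());
        } catch (RuntimeException e) {
            errors++;
            return 0; // default value used by this sketch
        }
    }

    // Returns a snapshot of the errors seen so far.
    ParseStats getStats() {
        return new ParseStats(errors);
    }
}

class StatsSketch {
    public static void main(String[] args) {
        CountingIntExtractor ints = new CountingIntExtractor();
        for (String field : new String[] {"1", "x", "3", ""}) {
            ints.extract(field);
        }
        System.out.println(ints.getStats().getErrorCount()); // prints 2
    }
}
```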



Copyright © 2014 The Apache Software Foundation. All Rights Reserved.