This project has retired. For details please refer to its Attic page.
AbstractSimpleExtractor (Apache Crunch 0.9.0 API)

org.apache.crunch.contrib.text
Class AbstractSimpleExtractor<T>

java.lang.Object
  extended by org.apache.crunch.contrib.text.AbstractSimpleExtractor<T>
All Implemented Interfaces:
Serializable, Extractor<T>

public abstract class AbstractSimpleExtractor<T>
extends Object
implements Extractor<T>

Base class for the common case of Extractor instances that construct a single object from a block of text stored in a String, with support for error handling and reporting.

See Also:
Serialized Form

Constructor Summary
protected AbstractSimpleExtractor(T defaultValue)
           
protected AbstractSimpleExtractor(T defaultValue, TokenizerFactory scannerFactory)
           
 
Method Summary
protected abstract  T doExtract(Tokenizer tokenizer)
          Subclasses must override this method to return a new instance of the class that this Extractor instance is designed to parse.
 boolean errorOnLastRecord()
          Returns true if the last call to extract on this instance threw an exception that was handled.
 T extract(String input)
          Extract a value with the type of this instance.
 T getDefaultValue()
          Returns the default value for this Extractor in case of an error.
 ExtractorStats getStats()
          Return statistics about how many errors this Extractor instance encountered while parsing input data.
 void initialize()
          Perform any initialization required by this Extractor during the start of a map or reduce task.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 
Methods inherited from interface org.apache.crunch.contrib.text.Extractor
getPType
 

Constructor Detail

AbstractSimpleExtractor

protected AbstractSimpleExtractor(T defaultValue)

AbstractSimpleExtractor

protected AbstractSimpleExtractor(T defaultValue,
                                  TokenizerFactory scannerFactory)

Method Detail

initialize

public void initialize()
Description copied from interface: Extractor
Perform any initialization required by this Extractor during the start of a map or reduce task.

Specified by:
initialize in interface Extractor<T>

extract

public T extract(String input)
Description copied from interface: Extractor
Extract a value with the type of this instance.

Specified by:
extract in interface Extractor<T>

errorOnLastRecord

public boolean errorOnLastRecord()
Description copied from interface: Extractor
Returns true if the last call to extract on this instance threw an exception that was handled.

Specified by:
errorOnLastRecord in interface Extractor<T>
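
On the caller side, the flag matters because the default value can also be a legitimately parsed value, so comparing the result to getDefaultValue() cannot detect a failed record. The sketch below illustrates that calling pattern only. LenientIntParser and ErrorFlagDemo are hypothetical stand-ins written for this example; they mirror the extract, errorOnLastRecord, and getDefaultValue calls but are not Crunch classes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in that mirrors three Extractor calls; not a Crunch class.
class LenientIntParser {
  private boolean errorOnLast;

  public Integer getDefaultValue() {
    return 0;
  }

  public Integer extract(String input) {
    errorOnLast = false; // reset the flag for every record
    try {
      return Integer.parseInt(input.trim());
    } catch (NumberFormatException e) {
      errorOnLast = true; // the exception was handled, so record it
      return getDefaultValue();
    }
  }

  public boolean errorOnLastRecord() {
    return errorOnLast;
  }
}

class ErrorFlagDemo {
  // Keeps only the values that were parsed without a handled error.
  static List<Integer> parseValid(List<String> lines) {
    LenientIntParser parser = new LenientIntParser();
    List<Integer> valid = new ArrayList<>();
    for (String line : lines) {
      Integer value = parser.extract(line);
      if (!parser.errorOnLastRecord()) {
        valid.add(value);
      }
    }
    return valid;
  }

  public static void main(String[] args) {
    // "0" parses to the same value as the default but is still kept.
    System.out.println(parseValid(Arrays.asList("0", "abc", "12"))); // prints [0, 12]
  }
}
```

The flag is checked immediately after each extract call because it describes only the most recent record.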

getDefaultValue

public T getDefaultValue()
Description copied from interface: Extractor
Returns the default value for this Extractor in case of an error.

Specified by:
getDefaultValue in interface Extractor<T>

getStats

public ExtractorStats getStats()
Description copied from interface: Extractor
Return statistics about how many errors this Extractor instance encountered while parsing input data.

Specified by:
getStats in interface Extractor<T>

doExtract

protected abstract T doExtract(Tokenizer tokenizer)
Subclasses must override this method to return a new instance of the class that this Extractor instance is designed to parse.

Any runtime parsing exceptions from the given Tokenizer instance should be thrown so that they may be caught by the error handling logic inside of this class.

Parameters:
tokenizer - The Tokenizer instance for the current record
Returns:
A new instance of the type defined for this class
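
The contract described for doExtract is a template-method arrangement: the base class owns the error handling, and the subclass only parses one record and lets runtime exceptions propagate. The following is a minimal sketch of that arrangement, not the Crunch implementation. SketchExtractor, IntSumExtractor, getErrorCount, and the use of java.util.Scanner in place of Tokenizer are all assumptions made for illustration; the real class reports errors through ExtractorStats and may handle them differently.

```java
import java.util.Scanner;

// Hypothetical stand-in for AbstractSimpleExtractor; not the Crunch implementation.
abstract class SketchExtractor<T> {
  private final T defaultValue;
  private boolean errorOnLast;
  private int errorCount;

  protected SketchExtractor(T defaultValue) {
    this.defaultValue = defaultValue;
  }

  // Wraps doExtract so that a runtime parsing failure yields the default value.
  public T extract(String input) {
    errorOnLast = false;
    try (Scanner scanner = new Scanner(input)) {
      return doExtract(scanner);
    } catch (RuntimeException e) {
      errorOnLast = true;
      errorCount++;
      return defaultValue;
    }
  }

  public boolean errorOnLastRecord() {
    return errorOnLast;
  }

  public T getDefaultValue() {
    return defaultValue;
  }

  // Simplified substitute for the statistics object returned by getStats().
  public int getErrorCount() {
    return errorCount;
  }

  // Subclasses parse one record and must not swallow runtime exceptions.
  protected abstract T doExtract(Scanner scanner);
}

// Parses two whitespace-separated integers and returns their sum.
class IntSumExtractor extends SketchExtractor<Integer> {
  IntSumExtractor() {
    super(-1);
  }

  @Override
  protected Integer doExtract(Scanner scanner) {
    // nextInt throws a runtime exception on a missing or malformed token.
    return scanner.nextInt() + scanner.nextInt();
  }
}

class SketchExtractorDemo {
  public static void main(String[] args) {
    IntSumExtractor extractor = new IntSumExtractor();
    System.out.println(extractor.extract("3 4"));        // prints 7
    System.out.println(extractor.errorOnLastRecord());   // prints false
    System.out.println(extractor.extract("3 oops"));     // prints -1
    System.out.println(extractor.errorOnLastRecord());   // prints true
  }
}
```

Because the subclass does not catch the exception itself, the base class can record the failure and substitute the default value in one place.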


Copyright © 2014 The Apache Software Foundation. All Rights Reserved.