Package | Description |
---|---|
org.apache.crunch.contrib.text | |
Class | Description |
---|---|
Extractor | An interface for extracting a specific data type from a text string that is being processed by a Scanner object. |
ExtractorStats | Records the number and kinds of errors that an Extractor encountered when parsing input data. |
Tokenizer | Manages a Scanner instance and provides support for returning only a subset of the fields returned by the underlying Scanner. |
TokenizerFactory | Factory class that constructs Tokenizer instances for input strings that use a fixed set of delimiters, skip patterns, locales, and sets of indices to keep or drop. |
TokenizerFactory.Builder | A class for constructing new TokenizerFactory instances using the Builder pattern. |
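
The descriptions above center on two ideas: a Scanner-backed tokenizer that returns only a chosen subset of fields, and an extractor that records parse errors instead of failing. The following is a minimal sketch of those ideas using only `java.util.Scanner`. It is not the Crunch API: the class `KeepFieldsSketch`, the methods `keepFields` and `parseIntOrDefault`, and the `ParseStats` holder are hypothetical names chosen for illustration, and the real classes may behave differently.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Scanner;
import java.util.Set;

// Hypothetical illustration only; none of these names come from Crunch.
class KeepFieldsSketch {

    // Assumed stand-in for error bookkeeping: counts fields that failed to parse.
    static final class ParseStats {
        int errorCount;
    }

    // Splits `input` on the regex `delimiter` with a Scanner and returns only
    // the fields whose zero-based index is contained in `keep`.
    static List<String> keepFields(String input, String delimiter, Set<Integer> keep) {
        List<String> kept = new ArrayList<>();
        try (Scanner scanner = new Scanner(input).useDelimiter(delimiter)) {
            int index = 0;
            while (scanner.hasNext()) {
                String field = scanner.next();
                if (keep.contains(index)) {
                    kept.add(field);
                }
                index++;
            }
        }
        return kept;
    }

    // Parses `field` as an int. On failure, increments the error count in
    // `stats` and returns `defaultValue` instead of throwing.
    static int parseIntOrDefault(String field, int defaultValue, ParseStats stats) {
        try {
            return Integer.parseInt(field.trim());
        } catch (NumberFormatException e) {
            stats.errorCount++;
            return defaultValue;
        }
    }

    public static void main(String[] args) {
        Set<Integer> keep = new HashSet<>(Arrays.asList(0, 2));
        List<String> fields = keepFields("alice,30,NYC", ",", keep);
        System.out.println(fields); // prints [alice, NYC]

        ParseStats stats = new ParseStats();
        int age = parseIntOrDefault("30", -1, stats);
        int bad = parseIntOrDefault("n/a", -1, stats);
        System.out.println(age + " " + bad + " errors=" + stats.errorCount); // prints 30 -1 errors=1
    }
}
```

In this sketch the set of kept indices plays the role of the "indices to keep" configuration, and the default value plus error counter mirrors the idea of recording errors during extraction rather than aborting on the first malformed field.
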
Copyright © 2016 The Apache Software Foundation. All rights reserved.