This project has retired. For details please refer to its Attic page.
Uses of Package org.apache.crunch (Apache Crunch 0.7.0 API)


Packages that use org.apache.crunch
org.apache.crunch Client-facing API and core abstractions. 
org.apache.crunch.contrib.bloomfilter Support for creating Bloom Filters. 
org.apache.crunch.contrib.io.jdbc Support for reading data from an RDBMS using JDBC.
org.apache.crunch.contrib.text   
org.apache.crunch.examples Example applications demonstrating various aspects of Crunch. 
org.apache.crunch.fn Commonly used functions for manipulating collections. 
org.apache.crunch.impl.mem In-memory Pipeline implementation for rapid prototyping and testing. 
org.apache.crunch.impl.mr A Pipeline implementation that runs on Hadoop MapReduce. 
org.apache.crunch.io Data input and output for Pipelines. 
org.apache.crunch.lib Joining, sorting, aggregating, and other commonly used functionality. 
org.apache.crunch.lib.join Inner and outer joins on collections. 
org.apache.crunch.lib.sort   
org.apache.crunch.types Common functionality for business object serialization. 
org.apache.crunch.types.avro Business object serialization using Apache Avro. 
org.apache.crunch.types.writable Business object serialization using Hadoop's Writables framework. 
org.apache.crunch.util An assorted set of utilities. 
 

Classes in org.apache.crunch used by org.apache.crunch
Aggregator
          Aggregate a sequence of values into a possibly smaller sequence of the same type.
CombineFn
          A special DoFn implementation that converts an Iterable of values into a single value.
CombineFn.Aggregator
          Deprecated. Use Aggregator
CombineFn.AggregatorFactory
          Deprecated. Use PGroupedTable.combineValues(Aggregator) which doesn't require a factory.
CombineFn.SimpleAggregator
          Deprecated. Use Aggregators.SimpleAggregator
DoFn
          Base class for all data processing functions in Crunch.
Emitter
          Interface for writing outputs from a DoFn.
FilterFn
          A DoFn for the common case of filtering the members of a PCollection based on a boolean condition.
GroupingOptions
          Options that can be passed to a groupByKey operation in order to exercise finer control over how the partitioning, grouping, and sorting of keys is performed.
GroupingOptions.Builder
          Builder class for creating GroupingOptions instances.
MapFn
          A DoFn for the common case of emitting exactly one value for each input record.
Pair
          A convenience class for two-element Tuples.
ParallelDoOptions
          Container class that includes optional information about a parallelDo operation applied to a PCollection.
ParallelDoOptions.Builder
           
PCollection
          A representation of an immutable, distributed collection of elements that is the fundamental target of computations in Crunch.
PGroupedTable
          The Crunch representation of a grouped PTable, which corresponds to the output of the shuffle phase of a MapReduce job.
Pipeline
          Manages the state of a pipeline execution.
PipelineExecution
          A handle to allow clients to control a Crunch pipeline as it runs.
PipelineExecution.Status
           
PipelineResult
          Container for the results of a call to run or done on the Pipeline interface that includes details and statistics about the component stages of the data pipeline.
PipelineResult.StageResult
           
PObject
          A PObject represents a singleton object value that results from a distributed computation.
PTable
          A sub-interface of PCollection that represents an immutable, distributed multi-map of keys and values.
Source
          A Source represents an input data set that is an input to one or more MapReduce jobs.
SourceTarget
          An interface for classes that implement both the Source and the Target interfaces.
TableSource
          The interface for Source implementations that return a PTable.
Target
          A Target represents the output destination of a Crunch PCollection in the context of a Crunch job.
Target.WriteMode
          An enum to represent different options the client may specify for handling the case where the output path, table, etc. already exists.
Tuple
          A fixed-size collection of Objects, used in Crunch for representing joins between PCollections.
Tuple3
          A convenience class for three-element Tuples.
Tuple3.Collect
           
Tuple4
          A convenience class for four-element Tuples.
Tuple4.Collect
           
TupleN
          A Tuple instance for an arbitrary number of values.
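
The one-line summaries of DoFn, Emitter, MapFn, and FilterFn above describe a small hierarchy: a general function that writes any number of outputs through an emitter, and two special cases of it. The sketch below illustrates that relationship in plain Java. It assumes nothing about the Crunch jar: SketchEmitter, SketchDoFn, SketchMapFn, SketchFilterFn, and the apply helper are hypothetical stand-ins written for this example, not the Crunch classes, and the real signatures should be taken from each class's own page.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Simplified stand-ins for illustration only; these are NOT the Crunch classes.
final class DoFnSketch {

  /** Stand-in for an emitter: receives the outputs of a function. */
  interface SketchEmitter<T> {
    void emit(T value);
  }

  /** Stand-in for a general function: may emit zero, one, or many outputs per input. */
  abstract static class SketchDoFn<S, T> {
    abstract void process(S input, SketchEmitter<T> emitter);
  }

  /** Stand-in for a map function: emits exactly one output per input. */
  abstract static class SketchMapFn<S, T> extends SketchDoFn<S, T> {
    abstract T map(S input);

    @Override
    void process(S input, SketchEmitter<T> emitter) {
      emitter.emit(map(input));
    }
  }

  /** Stand-in for a filter function: passes an input through only if it is accepted. */
  abstract static class SketchFilterFn<T> extends SketchDoFn<T, T> {
    abstract boolean accept(T input);

    @Override
    void process(T input, SketchEmitter<T> emitter) {
      if (accept(input)) {
        emitter.emit(input);
      }
    }
  }

  /** Hypothetical helper: applies a function to every element of an in-memory list. */
  static <S, T> List<T> apply(List<S> inputs, SketchDoFn<S, T> fn) {
    List<T> out = new ArrayList<>();
    for (S input : inputs) {
      fn.process(input, out::add);
    }
    return out;
  }

  /** Drops empty strings, then maps each remaining string to its length. */
  static List<Integer> lengthsOfNonEmpty(List<String> words) {
    List<String> nonEmpty = apply(words, new SketchFilterFn<String>() {
      @Override
      boolean accept(String s) {
        return !s.isEmpty();
      }
    });
    return apply(nonEmpty, new SketchMapFn<String, Integer>() {
      @Override
      Integer map(String s) {
        return s.length();
      }
    });
  }

  public static void main(String[] args) {
    System.out.println(lengthsOfNonEmpty(Arrays.asList("crunch", "", "pipeline"))); // prints [6, 8]
  }
}
```

The filter and map cases differ only in how they implement process, which is why both can be passed anywhere the general function type is expected.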
 

Classes in org.apache.crunch used by org.apache.crunch.contrib.bloomfilter
DoFn
          Base class for all data processing functions in Crunch.
Emitter
          Interface for writing outputs from a DoFn.
Pair
          A convenience class for two-element Tuples.
PCollection
          A representation of an immutable, distributed collection of elements that is the fundamental target of computations in Crunch.
PObject
          A PObject represents a singleton object value that results from a distributed computation.
 

Classes in org.apache.crunch used by org.apache.crunch.contrib.io.jdbc
Source
          A Source represents an input data set that is an input to one or more MapReduce jobs.
 

Classes in org.apache.crunch used by org.apache.crunch.contrib.text
Pair
          A convenience class for two-element Tuples.
PCollection
          A representation of an immutable, distributed collection of elements that is the fundamental target of computations in Crunch.
PTable
          A sub-interface of PCollection that represents an immutable, distributed multi-map of keys and values.
Tuple
          A fixed-size collection of Objects, used in Crunch for representing joins between PCollections.
Tuple3
          A convenience class for three-element Tuples.
Tuple4
          A convenience class for four-element Tuples.
TupleN
          A Tuple instance for an arbitrary number of values.
 

Classes in org.apache.crunch used by org.apache.crunch.examples
PCollection
          A representation of an immutable, distributed collection of elements that is the fundamental target of computations in Crunch.
PTable
          A sub-interface of PCollection that represents an immutable, distributed multi-map of keys and values.
 

Classes in org.apache.crunch used by org.apache.crunch.fn
Aggregator
          Aggregate a sequence of values into a possibly smaller sequence of the same type.
CombineFn
          A special DoFn implementation that converts an Iterable of values into a single value.
DoFn
          Base class for all data processing functions in Crunch.
Emitter
          Interface for writing outputs from a DoFn.
FilterFn
          A DoFn for the common case of filtering the members of a PCollection based on a boolean condition.
MapFn
          A DoFn for the common case of emitting exactly one value for each input record.
Pair
          A convenience class for two-element Tuples.
Tuple3
          A convenience class for three-element Tuples.
Tuple4
          A convenience class for four-element Tuples.
TupleN
          A Tuple instance for an arbitrary number of values.
 

Classes in org.apache.crunch used by org.apache.crunch.impl.mem
Pair
          A convenience class for two-element Tuples.
PCollection
          A representation of an immutable, distributed collection of elements that is the fundamental target of computations in Crunch.
Pipeline
          Manages the state of a pipeline execution.
PipelineExecution
          A handle to allow clients to control a Crunch pipeline as it runs.
PipelineResult
          Container for the results of a call to run or done on the Pipeline interface that includes details and statistics about the component stages of the data pipeline.
PTable
          A sub-interface of PCollection that represents an immutable, distributed multi-map of keys and values.
Source
          A Source represents an input data set that is an input to one or more MapReduce jobs.
TableSource
          The interface for Source implementations that return a PTable.
Target
          A Target represents the output destination of a Crunch PCollection in the context of a Crunch job.
Target.WriteMode
          An enum to represent different options the client may specify for handling the case where the output path, table, etc. already exists.
 

Classes in org.apache.crunch used by org.apache.crunch.impl.mr
PCollection
          A representation of an immutable, distributed collection of elements that is the fundamental target of computations in Crunch.
Pipeline
          Manages the state of a pipeline execution.
PipelineExecution
          A handle to allow clients to control a Crunch pipeline as it runs.
PipelineResult
          Container for the results of a call to run or done on the Pipeline interface that includes details and statistics about the component stages of the data pipeline.
PTable
          A sub-interface of PCollection that represents an immutable, distributed multi-map of keys and values.
Source
          A Source represents an input data set that is an input to one or more MapReduce jobs.
SourceTarget
          An interface for classes that implement both the Source and the Target interfaces.
TableSource
          The interface for Source implementations that return a PTable.
Target
          A Target represents the output destination of a Crunch PCollection in the context of a Crunch job.
Target.WriteMode
          An enum to represent different options the client may specify for handling the case where the output path, table, etc. already exists.
 

Classes in org.apache.crunch used by org.apache.crunch.io
Source
          A Source represents an input data set that is an input to one or more MapReduce jobs.
SourceTarget
          An interface for classes that implement both the Source and the Target interfaces.
TableSource
          The interface for Source implementations that return a PTable.
TableSourceTarget
          An interface for classes that implement both the TableSource and the Target interfaces.
Target
          A Target represents the output destination of a Crunch PCollection in the context of a Crunch job.
 

Classes in org.apache.crunch used by org.apache.crunch.lib
CombineFn
          A special DoFn implementation that converts an Iterable of values into a single value.
DoFn
          Base class for all data processing functions in Crunch.
Emitter
          Interface for writing outputs from a DoFn.
MapFn
          A DoFn for the common case of emitting exactly one value for each input record.
Pair
          A convenience class for two-element Tuples.
PCollection
          A representation of an immutable, distributed collection of elements that is the fundamental target of computations in Crunch.
PGroupedTable
          The Crunch representation of a grouped PTable, which corresponds to the output of the shuffle phase of a MapReduce job.
PObject
          A PObject represents a singleton object value that results from a distributed computation.
PTable
          A sub-interface of PCollection that represents an immutable, distributed multi-map of keys and values.
Tuple
          A fixed-size collection of Objects, used in Crunch for representing joins between PCollections.
Tuple3
          A convenience class for three-element Tuples.
Tuple3.Collect
           
Tuple4
          A convenience class for four-element Tuples.
Tuple4.Collect
           
TupleN
          A Tuple instance for an arbitrary number of values.
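
The summaries above describe PTable as a multi-map of keys and values, PGroupedTable as the grouped form produced by the shuffle, and CombineFn as a function that reduces an Iterable of values to a single value. As a rough analogy only, the following plain-Java sketch groups key-value entries by key and then sums each group. It uses java.util collections rather than any Crunch type, and the class and method names (GroupAndCombineSketch, groupByKey, sumValues) are assumptions made for this example.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Plain-Java analogy only; no Crunch classes are used here.
final class GroupAndCombineSketch {

  /** Groups values by key, similar in spirit to the grouped view a shuffle produces. */
  static Map<String, List<Integer>> groupByKey(List<Map.Entry<String, Integer>> table) {
    Map<String, List<Integer>> grouped = new TreeMap<>();
    for (Map.Entry<String, Integer> entry : table) {
      grouped.computeIfAbsent(entry.getKey(), k -> new ArrayList<>()).add(entry.getValue());
    }
    return grouped;
  }

  /** Collapses each key's values into a single sum. */
  static Map<String, Integer> sumValues(Map<String, List<Integer>> grouped) {
    Map<String, Integer> sums = new TreeMap<>();
    for (Map.Entry<String, List<Integer>> entry : grouped.entrySet()) {
      int total = 0;
      for (int value : entry.getValue()) {
        total += value;
      }
      sums.put(entry.getKey(), total);
    }
    return sums;
  }

  public static void main(String[] args) {
    // A multi-map: the key "a" appears twice.
    List<Map.Entry<String, Integer>> table = new ArrayList<>();
    table.add(new SimpleEntry<>("a", 1));
    table.add(new SimpleEntry<>("b", 2));
    table.add(new SimpleEntry<>("a", 3));

    Map<String, List<Integer>> grouped = groupByKey(table);
    System.out.println(grouped);            // prints {a=[1, 3], b=[2]}
    System.out.println(sumValues(grouped)); // prints {a=4, b=2}
  }
}
```

In a distributed setting the grouping step is what the shuffle performs across machines; the in-memory map here only mirrors the shape of the data before and after combining.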
 

Classes in org.apache.crunch used by org.apache.crunch.lib.join
DoFn
          Base class for all data processing functions in Crunch.
Emitter
          Interface for writing outputs from a DoFn.
Pair
          A convenience class for two-element Tuples.
PCollection
          A representation of an immutable, distributed collection of elements that is the fundamental target of computations in Crunch.
PTable
          A sub-interface of PCollection that represents an immutable, distributed multi-map of keys and values.
 

Classes in org.apache.crunch used by org.apache.crunch.lib.sort
DoFn
          Base class for all data processing functions in Crunch.
MapFn
          A DoFn for the common case of emitting exactly one value for each input record.
Tuple
          A fixed-size collection of Objects, used in Crunch for representing joins between PCollections.
 

Classes in org.apache.crunch used by org.apache.crunch.types
DoFn
          Base class for all data processing functions in Crunch.
GroupingOptions
          Options that can be passed to a groupByKey operation in order to exercise finer control over how the partitioning, grouping, and sorting of keys is performed.
MapFn
          A DoFn for the common case of emitting exactly one value for each input record.
Pair
          A convenience class for two-element Tuples.
Tuple
          A fixed-size collection of Objects, used in Crunch for representing joins between PCollections.
Tuple3
          A convenience class for three-element Tuples.
Tuple4
          A convenience class for four-element Tuples.
TupleN
          A Tuple instance for an arbitrary number of values.
 

Classes in org.apache.crunch used by org.apache.crunch.types.avro
MapFn
          A DoFn for the common case of emitting exactly one value for each input record.
Pair
          A convenience class for two-element Tuples.
Tuple
          A fixed-size collection of Objects, used in Crunch for representing joins between PCollections.
Tuple3
          A convenience class for three-element Tuples.
Tuple4
          A convenience class for four-element Tuples.
TupleN
          A Tuple instance for an arbitrary number of values.
 

Classes in org.apache.crunch used by org.apache.crunch.types.writable
MapFn
          A DoFn for the common case of emitting exactly one value for each input record.
Pair
          A convenience class for two-element Tuples.
Tuple
          A fixed-size collection of Objects, used in Crunch for representing joins between PCollections.
Tuple3
          A convenience class for three-element Tuples.
Tuple4
          A convenience class for four-element Tuples.
TupleN
          A Tuple instance for an arbitrary number of values.
 

Classes in org.apache.crunch used by org.apache.crunch.util
Pair
          A convenience class for two-element Tuples.
PCollection
          A representation of an immutable, distributed collection of elements that is the fundamental target of computations in Crunch.
PipelineExecution
          A handle to allow clients to control a Crunch pipeline as it runs.
PipelineResult
          Container for the results of a call to run or done on the Pipeline interface that includes details and statistics about the component stages of the data pipeline.
PTable
          A sub-interface of PCollection that represents an immutable, distributed multi-map of keys and values.
Source
          A Source represents an input data set that is an input to one or more MapReduce jobs.
TableSource
          The interface for Source implementations that return a PTable.
Target
          A Target represents the output destination of a Crunch PCollection in the context of a Crunch job.
Tuple3
          A convenience class for three-element Tuples.
Tuple4
          A convenience class for four-element Tuples.
TupleN
          A Tuple instance for an arbitrary number of values.
 



Copyright © 2013 The Apache Software Foundation. All Rights Reserved.