| Package | Description |
|---|---|
| org.apache.crunch | Client-facing API and core abstractions. |
| org.apache.crunch.contrib.bloomfilter | Support for creating Bloom Filters. |
| org.apache.crunch.contrib.io.jdbc | Support for reading data from an RDBMS using JDBC. |
| org.apache.crunch.contrib.text | |
| org.apache.crunch.examples | Example applications demonstrating various aspects of Crunch. |
| org.apache.crunch.fn | Commonly used functions for manipulating collections. |
| org.apache.crunch.impl.mem | In-memory Pipeline implementation for rapid prototyping and testing. |
| org.apache.crunch.impl.mr | A Pipeline implementation that runs on Hadoop MapReduce. |
| org.apache.crunch.io | Data input and output for Pipelines. |
| org.apache.crunch.lib | Joining, sorting, aggregating, and other commonly used functionality. |
| org.apache.crunch.lib.join | Inner and outer joins on collections. |
| org.apache.crunch.types | Common functionality for business object serialization. |
| org.apache.crunch.types.avro | Business object serialization using Apache Avro. |
| org.apache.crunch.types.writable | Business object serialization using Hadoop's Writables framework. |
| org.apache.crunch.util | An assorted set of utilities. |
| Class | Description |
|---|---|
| Aggregator | Aggregate a sequence of values into a possibly smaller sequence of the same type. |
| CombineFn | |
| CombineFn.Aggregator | Deprecated. Use Aggregator. |
| CombineFn.AggregatorFactory | Deprecated. Use PGroupedTable.combineValues(Aggregator), which doesn't require a factory. |
| CombineFn.SimpleAggregator | Deprecated. |
| DoFn | Base class for all data processing functions in Crunch. |
| Emitter | Interface for writing outputs from a DoFn. |
| FilterFn | A DoFn for the common case of filtering the members of a PCollection based on a boolean condition. |
| GroupingOptions | Options that can be passed to a groupByKey operation in order to exercise finer control over how the partitioning, grouping, and sorting of keys is performed. |
| GroupingOptions.Builder | Builder class for creating GroupingOptions instances. |
| MapFn | A DoFn for the common case of emitting exactly one value for each input record. |
| Pair | A convenience class for two-element Tuples. |
| ParallelDoOptions | Container class that includes optional information about a parallelDo operation applied to a PCollection. |
| ParallelDoOptions.Builder | |
| PCollection | A representation of an immutable, distributed collection of elements that is the fundamental target of computations in Crunch. |
| PGroupedTable | The Crunch representation of a grouped PTable. |
| Pipeline | Manages the state of a pipeline execution. |
| PipelineResult | Container for the results of a call to run or done on the Pipeline interface that includes details and statistics about the component stages of the data pipeline. |
| PipelineResult.StageResult | |
| PObject | A PObject represents a singleton object value that results from a distributed computation. |
| PTable | A sub-interface of PCollection that represents an immutable, distributed multi-map of keys and values. |
| Source | A Source represents an input data set that is an input to one or more MapReduce jobs. |
| SourceTarget | An interface for classes that implement both the Source and the Target interfaces. |
| TableSource | The interface for Source implementations that return a PTable. |
| Target | A Target represents the output destination of a Crunch PCollection in the context of a Crunch job. |
| Target.WriteMode | An enum to represent different options the client may specify for handling the case where the output path, table, etc. already exists. |
| Tuple | A fixed-size collection of Objects, used in Crunch for representing joins between PCollections. |
| Tuple3 | A convenience class for three-element Tuples. |
| Tuple4 | A convenience class for four-element Tuples. |
| TupleN | A Tuple instance for an arbitrary number of values. |
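
The relationship between DoFn, Emitter, MapFn, and FilterFn described in the table above can be illustrated with a small, standard-library-only model. This is a sketch under stated assumptions, not the Crunch API: the types `SketchEmitter`, `SketchDoFn`, `SketchMapFn`, and `SketchFilterFn`, and the helper `apply`, are hypothetical stand-ins that run over an in-memory list rather than a PCollection.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Toy model only: none of these types are the real org.apache.crunch classes.
final class DoFnSketch {

    /** Receives the outputs of a function (stand-in for Emitter). */
    interface SketchEmitter<T> {
        void emit(T value);
    }

    /** General case: zero or more outputs per input (stand-in for DoFn). */
    static abstract class SketchDoFn<S, T> {
        abstract void process(S input, SketchEmitter<T> emitter);
    }

    /** Exactly one output per input record (stand-in for MapFn). */
    static abstract class SketchMapFn<S, T> extends SketchDoFn<S, T> {
        abstract T map(S input);

        @Override
        final void process(S input, SketchEmitter<T> emitter) {
            emitter.emit(map(input));
        }
    }

    /** Passes an input through only when a boolean condition holds (stand-in for FilterFn). */
    static abstract class SketchFilterFn<T> extends SketchDoFn<T, T> {
        abstract boolean accept(T input);

        @Override
        final void process(T input, SketchEmitter<T> emitter) {
            if (accept(input)) {
                emitter.emit(input);
            }
        }
    }

    /** Applies fn to every element of an in-memory list and collects what it emits. */
    static <S, T> List<T> apply(List<S> inputs, SketchDoFn<S, T> fn) {
        List<T> out = new ArrayList<>();
        SketchEmitter<T> collector = out::add;
        for (S input : inputs) {
            fn.process(input, collector);
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("crunch", "pipeline", "avro");

        // One output per input: the length of each word.
        List<Integer> lengths = apply(words, new SketchMapFn<String, Integer>() {
            @Override
            Integer map(String input) {
                return input.length();
            }
        });

        // Keep only lengths greater than 4.
        List<Integer> longOnes = apply(lengths, new SketchFilterFn<Integer>() {
            @Override
            boolean accept(Integer input) {
                return input > 4;
            }
        });

        System.out.println(lengths);
        System.out.println(longOnes);
    }
}
```

In this model the map and filter cases are both expressed through the general `process` method, which is the design the table's descriptions suggest: the specialized functions are DoFns that constrain how many values reach the emitter.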
| Class | Description |
|---|---|
| DoFn | Base class for all data processing functions in Crunch. |
| Emitter | Interface for writing outputs from a DoFn. |
| Pair | A convenience class for two-element Tuples. |
| PCollection | A representation of an immutable, distributed collection of elements that is the fundamental target of computations in Crunch. |
| PObject | A PObject represents a singleton object value that results from a distributed computation. |
| Class | Description |
|---|---|
| Source | A Source represents an input data set that is an input to one or more MapReduce jobs. |
| Class | Description |
|---|---|
| Pair | A convenience class for two-element Tuples. |
| PCollection | A representation of an immutable, distributed collection of elements that is the fundamental target of computations in Crunch. |
| PTable | A sub-interface of PCollection that represents an immutable, distributed multi-map of keys and values. |
| Tuple | A fixed-size collection of Objects, used in Crunch for representing joins between PCollections. |
| Tuple3 | A convenience class for three-element Tuples. |
| Tuple4 | A convenience class for four-element Tuples. |
| TupleN | A Tuple instance for an arbitrary number of values. |
| Class | Description |
|---|---|
| PCollection | A representation of an immutable, distributed collection of elements that is the fundamental target of computations in Crunch. |
| PTable | A sub-interface of PCollection that represents an immutable, distributed multi-map of keys and values. |
| Class | Description |
|---|---|
| Aggregator | Aggregate a sequence of values into a possibly smaller sequence of the same type. |
| CombineFn | |
| DoFn | Base class for all data processing functions in Crunch. |
| Emitter | Interface for writing outputs from a DoFn. |
| FilterFn | A DoFn for the common case of filtering the members of a PCollection based on a boolean condition. |
| MapFn | A DoFn for the common case of emitting exactly one value for each input record. |
| Pair | A convenience class for two-element Tuples. |
| Tuple3 | A convenience class for three-element Tuples. |
| Tuple4 | A convenience class for four-element Tuples. |
| TupleN | A Tuple instance for an arbitrary number of values. |
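
The Aggregator description above ("a sequence of values into a possibly smaller sequence of the same type") can be made concrete with a toy, standard-library-only model. This is a sketch under stated assumptions rather than the Crunch interface itself: `SketchAggregator`, `SumLongs`, `FirstN`, and `aggregate` are hypothetical names introduced for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Toy model only: these types are not the real org.apache.crunch classes.
final class AggregatorSketch {

    /** Values go in one at a time; a possibly smaller sequence of the same type comes out. */
    interface SketchAggregator<T> {
        void reset();
        void update(T value);
        Iterable<T> results();
    }

    /** Collapses any number of longs into a single sum. */
    static final class SumLongs implements SketchAggregator<Long> {
        private long sum;

        @Override public void reset() { sum = 0L; }
        @Override public void update(Long value) { sum += value; }
        @Override public Iterable<Long> results() { return Collections.singletonList(sum); }
    }

    /** Keeps at most the first n values seen, so the output may hold several elements. */
    static final class FirstN<T> implements SketchAggregator<T> {
        private final int n;
        private final List<T> kept = new ArrayList<>();

        FirstN(int n) { this.n = n; }

        @Override public void reset() { kept.clear(); }
        @Override public void update(T value) { if (kept.size() < n) kept.add(value); }
        @Override public Iterable<T> results() { return new ArrayList<>(kept); }
    }

    /** Feeds every value to the aggregator and copies its results into a list. */
    static <T> List<T> aggregate(Iterable<T> values, SketchAggregator<T> agg) {
        agg.reset();
        for (T value : values) {
            agg.update(value);
        }
        List<T> out = new ArrayList<>();
        for (T result : agg.results()) {
            out.add(result);
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(aggregate(Arrays.asList(1L, 2L, 3L), new SumLongs()));
        System.out.println(aggregate(Arrays.asList("a", "b", "c"), new FirstN<String>(2)));
    }
}
```

Because the output type matches the input type, an aggregator of this shape can be applied repeatedly to partial results, which is what makes it suitable for combining values per key.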
| Class | Description |
|---|---|
| Pair | A convenience class for two-element Tuples. |
| PCollection | A representation of an immutable, distributed collection of elements that is the fundamental target of computations in Crunch. |
| Pipeline | Manages the state of a pipeline execution. |
| PipelineResult | Container for the results of a call to run or done on the Pipeline interface that includes details and statistics about the component stages of the data pipeline. |
| PTable | A sub-interface of PCollection that represents an immutable, distributed multi-map of keys and values. |
| Source | A Source represents an input data set that is an input to one or more MapReduce jobs. |
| TableSource | The interface for Source implementations that return a PTable. |
| Target | A Target represents the output destination of a Crunch PCollection in the context of a Crunch job. |
| Target.WriteMode | An enum to represent different options the client may specify for handling the case where the output path, table, etc. already exists. |
| Class | Description |
|---|---|
| PCollection | A representation of an immutable, distributed collection of elements that is the fundamental target of computations in Crunch. |
| Pipeline | Manages the state of a pipeline execution. |
| PipelineResult | Container for the results of a call to run or done on the Pipeline interface that includes details and statistics about the component stages of the data pipeline. |
| PTable | A sub-interface of PCollection that represents an immutable, distributed multi-map of keys and values. |
| Source | A Source represents an input data set that is an input to one or more MapReduce jobs. |
| SourceTarget | An interface for classes that implement both the Source and the Target interfaces. |
| TableSource | The interface for Source implementations that return a PTable. |
| Target | A Target represents the output destination of a Crunch PCollection in the context of a Crunch job. |
| Target.WriteMode | An enum to represent different options the client may specify for handling the case where the output path, table, etc. already exists. |
| Class | Description |
|---|---|
| Source | A Source represents an input data set that is an input to one or more MapReduce jobs. |
| SourceTarget | An interface for classes that implement both the Source and the Target interfaces. |
| TableSource | The interface for Source implementations that return a PTable. |
| TableSourceTarget | An interface for classes that implement both the TableSource and the Target interfaces. |
| Target | A Target represents the output destination of a Crunch PCollection in the context of a Crunch job. |
| Class | Description |
|---|---|
| CombineFn | |
| DoFn | Base class for all data processing functions in Crunch. |
| Emitter | Interface for writing outputs from a DoFn. |
| Pair | A convenience class for two-element Tuples. |
| PCollection | A representation of an immutable, distributed collection of elements that is the fundamental target of computations in Crunch. |
| PObject | A PObject represents a singleton object value that results from a distributed computation. |
| PTable | A sub-interface of PCollection that represents an immutable, distributed multi-map of keys and values. |
| Tuple3 | A convenience class for three-element Tuples. |
| Tuple4 | A convenience class for four-element Tuples. |
| TupleN | A Tuple instance for an arbitrary number of values. |
| Class | Description |
|---|---|
| DoFn | Base class for all data processing functions in Crunch. |
| Emitter | Interface for writing outputs from a DoFn. |
| Pair | A convenience class for two-element Tuples. |
| PTable | A sub-interface of PCollection that represents an immutable, distributed multi-map of keys and values. |
| Class | Description |
|---|---|
| DoFn | Base class for all data processing functions in Crunch. |
| GroupingOptions | Options that can be passed to a groupByKey operation in order to exercise finer control over how the partitioning, grouping, and sorting of keys is performed. |
| MapFn | A DoFn for the common case of emitting exactly one value for each input record. |
| Pair | A convenience class for two-element Tuples. |
| SourceTarget | An interface for classes that implement both the Source and the Target interfaces. |
| Tuple | A fixed-size collection of Objects, used in Crunch for representing joins between PCollections. |
| Tuple3 | A convenience class for three-element Tuples. |
| Tuple4 | A convenience class for four-element Tuples. |
| TupleN | A Tuple instance for an arbitrary number of values. |
| Class | Description |
|---|---|
| MapFn | A DoFn for the common case of emitting exactly one value for each input record. |
| Pair | A convenience class for two-element Tuples. |
| SourceTarget | An interface for classes that implement both the Source and the Target interfaces. |
| Tuple | A fixed-size collection of Objects, used in Crunch for representing joins between PCollections. |
| Tuple3 | A convenience class for three-element Tuples. |
| Tuple4 | A convenience class for four-element Tuples. |
| TupleN | A Tuple instance for an arbitrary number of values. |
| Class | Description |
|---|---|
| MapFn | A DoFn for the common case of emitting exactly one value for each input record. |
| Pair | A convenience class for two-element Tuples. |
| SourceTarget | An interface for classes that implement both the Source and the Target interfaces. |
| Tuple | A fixed-size collection of Objects, used in Crunch for representing joins between PCollections. |
| Tuple3 | A convenience class for three-element Tuples. |
| Tuple4 | A convenience class for four-element Tuples. |
| TupleN | A Tuple instance for an arbitrary number of values. |
| Class | Description |
|---|---|
| Pair | A convenience class for two-element Tuples. |
| PCollection | A representation of an immutable, distributed collection of elements that is the fundamental target of computations in Crunch. |
| PTable | A sub-interface of PCollection that represents an immutable, distributed multi-map of keys and values. |
| Source | A Source represents an input data set that is an input to one or more MapReduce jobs. |
| TableSource | The interface for Source implementations that return a PTable. |
| Target | A Target represents the output destination of a Crunch PCollection in the context of a Crunch job. |
| Tuple3 | A convenience class for three-element Tuples. |
| Tuple4 | A convenience class for four-element Tuples. |
| TupleN | A Tuple instance for an arbitrary number of values. |
Copyright © 2013 The Apache Software Foundation. All Rights Reserved.