Packages that use org.apache.crunch | |
---|---|
org.apache.crunch | Client-facing API and core abstractions. |
org.apache.crunch.contrib.bloomfilter | Support for creating Bloom Filters. |
org.apache.crunch.contrib.io.jdbc | Support for reading data from an RDBMS using JDBC. |
org.apache.crunch.contrib.text | |
org.apache.crunch.examples | Example applications demonstrating various aspects of Crunch. |
org.apache.crunch.fn | Commonly used functions for manipulating collections. |
org.apache.crunch.impl.dist | |
org.apache.crunch.impl.dist.collect | |
org.apache.crunch.impl.mem | In-memory Pipeline implementation for rapid prototyping and testing. |
org.apache.crunch.impl.mr | A Pipeline implementation that runs on Hadoop MapReduce. |
org.apache.crunch.impl.spark | |
org.apache.crunch.impl.spark.collect | |
org.apache.crunch.impl.spark.fn | |
org.apache.crunch.io | Data input and output for Pipelines. |
org.apache.crunch.io.impl | |
org.apache.crunch.lib | Joining, sorting, aggregating, and other commonly used functionality. |
org.apache.crunch.lib.join | Inner and outer joins on collections. |
org.apache.crunch.lib.sort | |
org.apache.crunch.types | Common functionality for business object serialization. |
org.apache.crunch.types.avro | Business object serialization using Apache Avro. |
org.apache.crunch.types.writable | Business object serialization using Hadoop's Writables framework. |
org.apache.crunch.util | An assorted set of utilities. |
Classes in org.apache.crunch used by org.apache.crunch | |
---|---|
Aggregator | Aggregate a sequence of values into a possibly smaller sequence of the same type. |
CachingOptions | Options for controlling how a PCollection<T> is cached for subsequent processing. |
CachingOptions.Builder | A Builder class to use for setting the CachingOptions for a PCollection. |
CombineFn | A special DoFn implementation that converts an Iterable of values into a single value. |
DoFn | Base class for all data processing functions in Crunch. |
Emitter | Interface for writing outputs from a DoFn. |
FilterFn | A DoFn for the common case of filtering the members of a PCollection based on a boolean condition. |
GroupingOptions | Options that can be passed to a groupByKey operation in order to exercise finer control over how the partitioning, grouping, and sorting of keys is performed. |
GroupingOptions.Builder | Builder class for creating GroupingOptions instances. |
MapFn | A DoFn for the common case of emitting exactly one value for each input record. |
Pair | A convenience class for two-element Tuples. |
ParallelDoOptions | Container class that includes optional information about a parallelDo operation applied to a PCollection. |
ParallelDoOptions.Builder | |
PCollection | A representation of an immutable, distributed collection of elements that is the fundamental target of computations in Crunch. |
PGroupedTable | The Crunch representation of a grouped PTable, which corresponds to the output of the shuffle phase of a MapReduce job. |
Pipeline | Manages the state of a pipeline execution. |
PipelineExecution | A handle to allow clients to control a Crunch pipeline as it runs. |
PipelineExecution.Status | |
PipelineResult | Container for the results of a call to run or done on the Pipeline interface that includes details and statistics about the component stages of the data pipeline. |
PipelineResult.StageResult | |
PObject | A PObject represents a singleton object value that results from a distributed computation. |
PTable | A sub-interface of PCollection that represents an immutable, distributed multi-map of keys and values. |
ReadableData | Represents the contents of a data source that can be read on the cluster from within one of the tasks running as part of a Crunch pipeline. |
Source | A Source represents an input data set that is an input to one or more MapReduce jobs. |
SourceTarget | An interface for classes that implement both the Source and the Target interfaces. |
TableSource | The interface for Source implementations that return a PTable. |
Target | A Target represents the output destination of a Crunch PCollection in the context of a Crunch job. |
Target.WriteMode | An enum to represent different options the client may specify for handling the case where the output path, table, etc. already exists. |
Tuple | A fixed-size collection of Objects, used in Crunch for representing joins between PCollections. |
Tuple3 | A convenience class for three-element Tuples. |
Tuple3.Collect | |
Tuple4 | A convenience class for four-element Tuples. |
Tuple4.Collect | |
TupleN | A Tuple instance for an arbitrary number of values. |
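The relationship between DoFn, Emitter, MapFn, and FilterFn described in the table above can be sketched in plain Java. The types below (`FnSketch`, `SketchEmitter`, `SketchDoFn`, `SketchMapFn`, `SketchFilterFn`) are hypothetical stand-ins written for illustration and are not the Crunch classes; they only mirror the one-line descriptions: a DoFn writes zero or more outputs to an Emitter, a MapFn emits exactly one value per input record, and a FilterFn keeps the inputs that satisfy a boolean condition.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins for illustration only; these are NOT the Crunch classes.
final class FnSketch {

    // Receives the outputs written by a processing function.
    interface SketchEmitter<T> {
        void emit(T value);
    }

    // General case: zero or more outputs per input.
    interface SketchDoFn<S, T> {
        void process(S input, SketchEmitter<T> emitter);
    }

    // Special case: exactly one output per input.
    abstract static class SketchMapFn<S, T> implements SketchDoFn<S, T> {
        abstract T map(S input);

        @Override
        public void process(S input, SketchEmitter<T> emitter) {
            emitter.emit(map(input));
        }
    }

    // Special case: pass the input through only when accept() is true.
    abstract static class SketchFilterFn<T> implements SketchDoFn<T, T> {
        abstract boolean accept(T input);

        @Override
        public void process(T input, SketchEmitter<T> emitter) {
            if (accept(input)) {
                emitter.emit(input);
            }
        }
    }

    // Applies fn to every element, collecting emitted values in order.
    static <S, T> List<T> apply(List<S> inputs, SketchDoFn<S, T> fn) {
        List<T> out = new ArrayList<>();
        for (S input : inputs) {
            fn.process(input, out::add);
        }
        return out;
    }

    // Map example: one length per word.
    static List<Integer> lengths(List<String> words) {
        return apply(words, new SketchMapFn<String, Integer>() {
            @Override
            Integer map(String input) {
                return input.length();
            }
        });
    }

    // Filter example: drop empty strings.
    static List<String> nonEmpty(List<String> words) {
        return apply(words, new SketchFilterFn<String>() {
            @Override
            boolean accept(String input) {
                return !input.isEmpty();
            }
        });
    }
}
```

Modelling the map and filter cases as subclasses of the general function keeps a single `process` entry point, which is the design the table's wording ("A DoFn for the common case of ...") suggests.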
Classes in org.apache.crunch used by org.apache.crunch.contrib.bloomfilter | |
---|---|
DoFn | Base class for all data processing functions in Crunch. |
Emitter | Interface for writing outputs from a DoFn. |
Pair | A convenience class for two-element Tuples. |
PCollection | A representation of an immutable, distributed collection of elements that is the fundamental target of computations in Crunch. |
PObject | A PObject represents a singleton object value that results from a distributed computation. |
Classes in org.apache.crunch used by org.apache.crunch.contrib.io.jdbc | |
---|---|
Source | A Source represents an input data set that is an input to one or more MapReduce jobs. |
Classes in org.apache.crunch used by org.apache.crunch.contrib.text | |
---|---|
Pair | A convenience class for two-element Tuples. |
PCollection | A representation of an immutable, distributed collection of elements that is the fundamental target of computations in Crunch. |
PTable | A sub-interface of PCollection that represents an immutable, distributed multi-map of keys and values. |
Tuple | A fixed-size collection of Objects, used in Crunch for representing joins between PCollections. |
Tuple3 | A convenience class for three-element Tuples. |
Tuple4 | A convenience class for four-element Tuples. |
TupleN | A Tuple instance for an arbitrary number of values. |
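To make the Tuple family in the table above more concrete, here is a minimal plain-Java sketch of a fixed-size collection of Objects with a two-element convenience form and an arbitrary-arity form. The names `TupleSketch`, `SketchTuple`, `SketchPair`, and `SketchTupleN` are hypothetical and chosen for illustration; the accessor names are assumptions and should not be read as the Crunch API.

```java
import java.util.Arrays;

// Hypothetical sketch of a fixed-size tuple; NOT the Crunch Tuple/Pair/TupleN types.
final class TupleSketch {

    // A fixed-size, read-only collection of Objects.
    interface SketchTuple {
        Object get(int index);
        int size();
    }

    // Arbitrary-arity tuple backed by a defensive copy of the values.
    static final class SketchTupleN implements SketchTuple {
        private final Object[] values;

        SketchTupleN(Object... values) {
            this.values = Arrays.copyOf(values, values.length);
        }

        @Override public Object get(int index) { return values[index]; }
        @Override public int size() { return values.length; }
    }

    // Two-element convenience form with typed accessors.
    static final class SketchPair<A, B> implements SketchTuple {
        private final A first;
        private final B second;

        SketchPair(A first, B second) {
            this.first = first;
            this.second = second;
        }

        A first() { return first; }
        B second() { return second; }

        @Override public Object get(int index) {
            switch (index) {
                case 0: return first;
                case 1: return second;
                default: throw new IndexOutOfBoundsException("index " + index);
            }
        }

        @Override public int size() { return 2; }
    }
}
```

The typed accessors on the pair avoid casts in the common two-element case, while the shared interface still allows generic code to treat every arity uniformly.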
Classes in org.apache.crunch used by org.apache.crunch.examples | |
---|---|
PCollection | A representation of an immutable, distributed collection of elements that is the fundamental target of computations in Crunch. |
PTable | A sub-interface of PCollection that represents an immutable, distributed multi-map of keys and values. |
Classes in org.apache.crunch used by org.apache.crunch.fn | |
---|---|
Aggregator | Aggregate a sequence of values into a possibly smaller sequence of the same type. |
CombineFn | A special DoFn implementation that converts an Iterable of values into a single value. |
DoFn | Base class for all data processing functions in Crunch. |
Emitter | Interface for writing outputs from a DoFn. |
FilterFn | A DoFn for the common case of filtering the members of a PCollection based on a boolean condition. |
MapFn | A DoFn for the common case of emitting exactly one value for each input record. |
Pair | A convenience class for two-element Tuples. |
Tuple3 | A convenience class for three-element Tuples. |
Tuple4 | A convenience class for four-element Tuples. |
TupleN | A Tuple instance for an arbitrary number of values. |
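The Aggregator and CombineFn descriptions in the table above amount to folding many values into fewer values of the same type. The sketch below illustrates that idea with ordinary Java collections. `AggregateSketch`, `SketchAggregator`, `SumLongs`, and `combine` are hypothetical names, and the three-method interface is an assumption made for the example rather than the Crunch Aggregator contract.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; the interface below is NOT the Crunch Aggregator API.
final class AggregateSketch {

    // Folds a sequence of values into a (possibly smaller) sequence of the same type.
    interface SketchAggregator<T> {
        void reset();
        void update(T value);
        Iterable<T> results();
    }

    // Sums longs, so any number of inputs collapses to a single result.
    static final class SumLongs implements SketchAggregator<Long> {
        private long sum;

        @Override public void reset() { sum = 0L; }
        @Override public void update(Long value) { sum += value; }
        @Override public Iterable<Long> results() {
            return Collections.<Long>singletonList(sum);
        }
    }

    // Combine step: reduce each key's list of values to one value.
    static <K> Map<K, Long> combine(Map<K, List<Long>> grouped) {
        SumLongs agg = new SumLongs();
        Map<K, Long> out = new LinkedHashMap<>();
        for (Map.Entry<K, List<Long>> e : grouped.entrySet()) {
            agg.reset(); // reuse one aggregator across keys
            for (Long v : e.getValue()) {
                agg.update(v);
            }
            out.put(e.getKey(), agg.results().iterator().next());
        }
        return out;
    }
}
```

Resetting and reusing a single aggregator per key, rather than allocating one per key, is a common choice when the same fold runs over many groups.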
Classes in org.apache.crunch used by org.apache.crunch.impl.dist | |
---|---|
PCollection | A representation of an immutable, distributed collection of elements that is the fundamental target of computations in Crunch. |
Pipeline | Manages the state of a pipeline execution. |
PipelineResult | Container for the results of a call to run or done on the Pipeline interface that includes details and statistics about the component stages of the data pipeline. |
PTable | A sub-interface of PCollection that represents an immutable, distributed multi-map of keys and values. |
Source | A Source represents an input data set that is an input to one or more MapReduce jobs. |
SourceTarget | An interface for classes that implement both the Source and the Target interfaces. |
TableSource | The interface for Source implementations that return a PTable. |
Target | A Target represents the output destination of a Crunch PCollection in the context of a Crunch job. |
Target.WriteMode | An enum to represent different options the client may specify for handling the case where the output path, table, etc. already exists. |
Classes in org.apache.crunch used by org.apache.crunch.impl.dist.collect | |
---|---|
Aggregator | Aggregate a sequence of values into a possibly smaller sequence of the same type. |
CachingOptions | Options for controlling how a PCollection<T> is cached for subsequent processing. |
CombineFn | A special DoFn implementation that converts an Iterable of values into a single value. |
DoFn | Base class for all data processing functions in Crunch. |
FilterFn | A DoFn for the common case of filtering the members of a PCollection based on a boolean condition. |
GroupingOptions | Options that can be passed to a groupByKey operation in order to exercise finer control over how the partitioning, grouping, and sorting of keys is performed. |
MapFn | A DoFn for the common case of emitting exactly one value for each input record. |
Pair | A convenience class for two-element Tuples. |
ParallelDoOptions | Container class that includes optional information about a parallelDo operation applied to a PCollection. |
PCollection | A representation of an immutable, distributed collection of elements that is the fundamental target of computations in Crunch. |
PGroupedTable | The Crunch representation of a grouped PTable, which corresponds to the output of the shuffle phase of a MapReduce job. |
PObject | A PObject represents a singleton object value that results from a distributed computation. |
PTable | A sub-interface of PCollection that represents an immutable, distributed multi-map of keys and values. |
ReadableData | Represents the contents of a data source that can be read on the cluster from within one of the tasks running as part of a Crunch pipeline. |
Source | A Source represents an input data set that is an input to one or more MapReduce jobs. |
SourceTarget | An interface for classes that implement both the Source and the Target interfaces. |
TableSource | The interface for Source implementations that return a PTable. |
Target | A Target represents the output destination of a Crunch PCollection in the context of a Crunch job. |
Target.WriteMode | An enum to represent different options the client may specify for handling the case where the output path, table, etc. already exists. |
Classes in org.apache.crunch used by org.apache.crunch.impl.mem | |
---|---|
CachingOptions | Options for controlling how a PCollection<T> is cached for subsequent processing. |
Pair | A convenience class for two-element Tuples. |
PCollection | A representation of an immutable, distributed collection of elements that is the fundamental target of computations in Crunch. |
Pipeline | Manages the state of a pipeline execution. |
PipelineExecution | A handle to allow clients to control a Crunch pipeline as it runs. |
PipelineResult | Container for the results of a call to run or done on the Pipeline interface that includes details and statistics about the component stages of the data pipeline. |
PTable | A sub-interface of PCollection that represents an immutable, distributed multi-map of keys and values. |
Source | A Source represents an input data set that is an input to one or more MapReduce jobs. |
TableSource | The interface for Source implementations that return a PTable. |
Target | A Target represents the output destination of a Crunch PCollection in the context of a Crunch job. |
Target.WriteMode | An enum to represent different options the client may specify for handling the case where the output path, table, etc. already exists. |
Classes in org.apache.crunch used by org.apache.crunch.impl.mr | |
---|---|
CachingOptions | Options for controlling how a PCollection<T> is cached for subsequent processing. |
PCollection | A representation of an immutable, distributed collection of elements that is the fundamental target of computations in Crunch. |
Pipeline | Manages the state of a pipeline execution. |
PipelineExecution | A handle to allow clients to control a Crunch pipeline as it runs. |
PipelineResult | Container for the results of a call to run or done on the Pipeline interface that includes details and statistics about the component stages of the data pipeline. |
Classes in org.apache.crunch used by org.apache.crunch.impl.spark | |
---|---|
CachingOptions | Options for controlling how a PCollection<T> is cached for subsequent processing. |
CombineFn | A special DoFn implementation that converts an Iterable of values into a single value. |
DoFn | Base class for all data processing functions in Crunch. |
GroupingOptions | Options that can be passed to a groupByKey operation in order to exercise finer control over how the partitioning, grouping, and sorting of keys is performed. |
Pair | A convenience class for two-element Tuples. |
PCollection | A representation of an immutable, distributed collection of elements that is the fundamental target of computations in Crunch. |
Pipeline | Manages the state of a pipeline execution. |
PipelineExecution | A handle to allow clients to control a Crunch pipeline as it runs. |
PipelineExecution.Status | |
PipelineResult | Container for the results of a call to run or done on the Pipeline interface that includes details and statistics about the component stages of the data pipeline. |
PTable | A sub-interface of PCollection that represents an immutable, distributed multi-map of keys and values. |
Target | A Target represents the output destination of a Crunch PCollection in the context of a Crunch job. |
Classes in org.apache.crunch used by org.apache.crunch.impl.spark.collect | |
---|---|
CombineFn | A special DoFn implementation that converts an Iterable of values into a single value. |
DoFn | Base class for all data processing functions in Crunch. |
GroupingOptions | Options that can be passed to a groupByKey operation in order to exercise finer control over how the partitioning, grouping, and sorting of keys is performed. |
Pair | A convenience class for two-element Tuples. |
ParallelDoOptions | Container class that includes optional information about a parallelDo operation applied to a PCollection. |
PCollection | A representation of an immutable, distributed collection of elements that is the fundamental target of computations in Crunch. |
PGroupedTable | The Crunch representation of a grouped PTable, which corresponds to the output of the shuffle phase of a MapReduce job. |
PTable | A sub-interface of PCollection that represents an immutable, distributed multi-map of keys and values. |
Source | A Source represents an input data set that is an input to one or more MapReduce jobs. |
TableSource | The interface for Source implementations that return a PTable. |
Classes in org.apache.crunch used by org.apache.crunch.impl.spark.fn | |
---|---|
CombineFn | A special DoFn implementation that converts an Iterable of values into a single value. |
DoFn | Base class for all data processing functions in Crunch. |
GroupingOptions | Options that can be passed to a groupByKey operation in order to exercise finer control over how the partitioning, grouping, and sorting of keys is performed. |
MapFn | A DoFn for the common case of emitting exactly one value for each input record. |
Pair | A convenience class for two-element Tuples. |
Classes in org.apache.crunch used by org.apache.crunch.io | |
---|---|
ReadableData | Represents the contents of a data source that can be read on the cluster from within one of the tasks running as part of a Crunch pipeline. |
Source | A Source represents an input data set that is an input to one or more MapReduce jobs. |
SourceTarget | An interface for classes that implement both the Source and the Target interfaces. |
TableSource | The interface for Source implementations that return a PTable. |
TableSourceTarget | An interface for classes that implement both the TableSource and the Target interfaces. |
Target | A Target represents the output destination of a Crunch PCollection in the context of a Crunch job. |
Classes in org.apache.crunch used by org.apache.crunch.io.impl | |
---|---|
Source | A Source represents an input data set that is an input to one or more MapReduce jobs. |
Classes in org.apache.crunch used by org.apache.crunch.lib | |
---|---|
Aggregator | Aggregate a sequence of values into a possibly smaller sequence of the same type. |
CombineFn | A special DoFn implementation that converts an Iterable of values into a single value. |
DoFn | Base class for all data processing functions in Crunch. |
Emitter | Interface for writing outputs from a DoFn. |
MapFn | A DoFn for the common case of emitting exactly one value for each input record. |
Pair | A convenience class for two-element Tuples. |
PCollection | A representation of an immutable, distributed collection of elements that is the fundamental target of computations in Crunch. |
PGroupedTable | The Crunch representation of a grouped PTable, which corresponds to the output of the shuffle phase of a MapReduce job. |
PObject | A PObject represents a singleton object value that results from a distributed computation. |
PTable | A sub-interface of PCollection that represents an immutable, distributed multi-map of keys and values. |
Tuple | A fixed-size collection of Objects, used in Crunch for representing joins between PCollections. |
Tuple3 | A convenience class for three-element Tuples. |
Tuple3.Collect | |
Tuple4 | A convenience class for four-element Tuples. |
Tuple4.Collect | |
TupleN | A Tuple instance for an arbitrary number of values. |
Classes in org.apache.crunch used by org.apache.crunch.lib.join | |
---|---|
DoFn | Base class for all data processing functions in Crunch. |
Emitter | Interface for writing outputs from a DoFn. |
Pair | A convenience class for two-element Tuples. |
PCollection | A representation of an immutable, distributed collection of elements that is the fundamental target of computations in Crunch. |
PTable | A sub-interface of PCollection that represents an immutable, distributed multi-map of keys and values. |
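As a rough illustration of what an inner join over two multi-maps of keys and values produces, the sketch below joins two in-memory lists of key/value entries. It uses only `java.util` collections and does not call the Crunch join classes; `JoinSketch` and `innerJoin` are hypothetical names, and representing a table as a `List<Map.Entry<K, V>>` is an assumption made to keep the example self-contained.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Plain-Java illustration of an inner join; it does NOT use the Crunch join API.
final class JoinSketch {

    // Inner join of two key/value lists; a key may repeat, as in a multi-map.
    static <K, U, V> List<Map.Entry<K, Map.Entry<U, V>>> innerJoin(
            List<Map.Entry<K, U>> left, List<Map.Entry<K, V>> right) {
        // Group the right side by key so each left record can look up its matches.
        Map<K, List<V>> rightByKey = new HashMap<>();
        for (Map.Entry<K, V> r : right) {
            rightByKey.computeIfAbsent(r.getKey(), k -> new ArrayList<>()).add(r.getValue());
        }
        List<Map.Entry<K, Map.Entry<U, V>>> out = new ArrayList<>();
        for (Map.Entry<K, U> l : left) {
            List<V> matches = rightByKey.get(l.getKey());
            if (matches == null) {
                continue; // inner join: unmatched left keys are dropped
            }
            // Emit one output record per matching (left, right) value combination.
            for (V v : matches) {
                out.add(new SimpleEntry<K, Map.Entry<U, V>>(
                        l.getKey(), new SimpleEntry<U, V>(l.getValue(), v)));
            }
        }
        return out;
    }
}
```

An outer join would differ only in the `matches == null` branch, where the unmatched record would be emitted with a null on the missing side instead of being skipped.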
Classes in org.apache.crunch used by org.apache.crunch.lib.sort | |
---|---|
DoFn | Base class for all data processing functions in Crunch. |
MapFn | A DoFn for the common case of emitting exactly one value for each input record. |
Tuple | A fixed-size collection of Objects, used in Crunch for representing joins between PCollections. |
Classes in org.apache.crunch used by org.apache.crunch.types | |
---|---|
DoFn | Base class for all data processing functions in Crunch. |
GroupingOptions | Options that can be passed to a groupByKey operation in order to exercise finer control over how the partitioning, grouping, and sorting of keys is performed. |
MapFn | A DoFn for the common case of emitting exactly one value for each input record. |
Pair | A convenience class for two-element Tuples. |
Tuple | A fixed-size collection of Objects, used in Crunch for representing joins between PCollections. |
Tuple3 | A convenience class for three-element Tuples. |
Tuple4 | A convenience class for four-element Tuples. |
TupleN | A Tuple instance for an arbitrary number of values. |
Union | Allows us to represent the combination of multiple data sources that may contain different types of data as a single type with an index to indicate which of the original sources the current record was from. |
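The Union description above, a single type carrying an index that says which source a record came from, can be illustrated with a small tagged-value sketch. `UnionSketch`, `Tagged`, and `merge` are hypothetical names invented for this example; the field layout is an assumption and is not taken from the Crunch Union class.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical tagged-value sketch; this is NOT the Crunch Union class.
final class UnionSketch {

    // Pairs a record with the index of the source it came from.
    static final class Tagged {
        final int index;
        final Object value;

        Tagged(int index, Object value) {
            this.index = index;
            this.value = value;
        }
    }

    // Merges sources of different element types into one list of tagged records.
    static List<Tagged> merge(List<? extends List<?>> sources) {
        List<Tagged> out = new ArrayList<>();
        for (int i = 0; i < sources.size(); i++) {
            for (Object record : sources.get(i)) {
                out.add(new Tagged(i, record)); // i identifies the originating source
            }
        }
        return out;
    }
}
```

A consumer can switch on `index` to recover the original element type before casting `value`, which is what makes a heterogeneous merge safe to process downstream.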
Classes in org.apache.crunch used by org.apache.crunch.types.avro | |
---|---|
MapFn | A DoFn for the common case of emitting exactly one value for each input record. |
Pair | A convenience class for two-element Tuples. |
Tuple | A fixed-size collection of Objects, used in Crunch for representing joins between PCollections. |
Tuple3 | A convenience class for three-element Tuples. |
Tuple4 | A convenience class for four-element Tuples. |
TupleN | A Tuple instance for an arbitrary number of values. |
Union | Allows us to represent the combination of multiple data sources that may contain different types of data as a single type with an index to indicate which of the original sources the current record was from. |
Classes in org.apache.crunch used by org.apache.crunch.types.writable | |
---|---|
MapFn | A DoFn for the common case of emitting exactly one value for each input record. |
Pair | A convenience class for two-element Tuples. |
Tuple | A fixed-size collection of Objects, used in Crunch for representing joins between PCollections. |
Tuple3 | A convenience class for three-element Tuples. |
Tuple4 | A convenience class for four-element Tuples. |
TupleN | A Tuple instance for an arbitrary number of values. |
Union | Allows us to represent the combination of multiple data sources that may contain different types of data as a single type with an index to indicate which of the original sources the current record was from. |
Classes in org.apache.crunch used by org.apache.crunch.util | |
---|---|
DoFn | Base class for all data processing functions in Crunch. |
Pair | A convenience class for two-element Tuples. |
PCollection | A representation of an immutable, distributed collection of elements that is the fundamental target of computations in Crunch. |
PipelineExecution | A handle to allow clients to control a Crunch pipeline as it runs. |
PipelineResult | Container for the results of a call to run or done on the Pipeline interface that includes details and statistics about the component stages of the data pipeline. |
PTable | A sub-interface of PCollection that represents an immutable, distributed multi-map of keys and values. |
ReadableData | Represents the contents of a data source that can be read on the cluster from within one of the tasks running as part of a Crunch pipeline. |
Source | A Source represents an input data set that is an input to one or more MapReduce jobs. |
SourceTarget | An interface for classes that implement both the Source and the Target interfaces. |
TableSource | The interface for Source implementations that return a PTable. |
Target | A Target represents the output destination of a Crunch PCollection in the context of a Crunch job. |
Tuple3 | A convenience class for three-element Tuples. |
Tuple4 | A convenience class for four-element Tuples. |
TupleN | A Tuple instance for an arbitrary number of values. |