This project has retired. For details please refer to its Attic page.
org.apache.crunch (Apache Crunch 0.6.0 API)

Package org.apache.crunch

Client-facing API and core abstractions.

Interface Summary
Aggregator<T> Aggregate a sequence of values into a possibly smaller sequence of the same type.
CombineFn.Aggregator<T> Deprecated. Use Aggregator
CombineFn.AggregatorFactory<T> Deprecated. Use PGroupedTable.combineValues(Aggregator) which doesn't require a factory.
Emitter<T> Interface for writing outputs from a DoFn.
PCollection<S> A representation of an immutable, distributed collection of elements that is the fundamental target of computations in Crunch.
PGroupedTable<K,V> The Crunch representation of a grouped PTable.
Pipeline Manages the state of a pipeline execution.
PipelineExecution A handle to allow clients to control a Crunch pipeline as it runs.
PObject<T> A PObject represents a singleton object value that results from a distributed computation.
PTable<K,V> A sub-interface of PCollection that represents an immutable, distributed multi-map of keys and values.
Source<T> A Source represents an input data set that is an input to one or more MapReduce jobs.
SourceTarget<T> An interface for classes that implement both the Source and the Target interfaces.
TableSource<K,V> The interface for Source implementations that return a PTable.
TableSourceTarget<K,V> An interface for classes that implement both the TableSource and the Target interfaces.
Target A Target represents the output destination of a Crunch PCollection in the context of a Crunch job.
Tuple A fixed-size collection of Objects, used in Crunch for representing joins between PCollections.
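
The Aggregator entry above describes folding a sequence of values into a possibly smaller sequence of the same type. The following is a minimal, in-memory sketch of that contract. SketchAggregator, SketchSumLongs, and AggregatorSketch are hypothetical stand-ins written for this illustration, not Crunch classes; the real org.apache.crunch.Aggregator is assumed to follow a similar reset/update/results cycle and to take a Hadoop Configuration in an additional initialization step, which is left out here so the example runs without Hadoop on the classpath.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical, simplified stand-in for org.apache.crunch.Aggregator<T>.
// Assumption: the real interface has a comparable reset/update/results cycle.
interface SketchAggregator<T> {
    void reset();              // clear state before a new group of values
    void update(T value);      // fold one value into the running state
    Iterable<T> results();     // emit the (possibly smaller) aggregated sequence
}

// Hypothetical aggregator that reduces a sequence of longs to a single sum.
final class SketchSumLongs implements SketchAggregator<Long> {
    private long sum;

    @Override
    public void reset() {
        sum = 0L;
    }

    @Override
    public void update(Long value) {
        sum += value;
    }

    @Override
    public Iterable<Long> results() {
        return Collections.singletonList(sum);
    }
}

final class AggregatorSketch {
    // Drives an aggregator over one group of values and collects its results.
    static <T> List<T> aggregate(SketchAggregator<T> agg, Iterable<T> values) {
        agg.reset();
        for (T v : values) {
            agg.update(v);
        }
        List<T> out = new ArrayList<>();
        for (T r : agg.results()) {
            out.add(r);
        }
        return out;
    }
}
```

In this sketch, AggregatorSketch.aggregate(new SketchSumLongs(), values) plays the role that PGroupedTable.combineValues(Aggregator) plays for each key's values in a real pipeline; the per-key grouping and distribution are omitted.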
 

Class Summary
CombineFn<S,T> A special DoFn implementation that converts an Iterable of values into a single value.
CombineFn.AggregatorCombineFn<K,V> Deprecated. Use the Aggregators.toCombineFn(org.apache.crunch.Aggregator) adapter
CombineFn.FirstNAggregator<V> Deprecated. Use Aggregators.FIRST_N(int)
CombineFn.LastNAggregator<V> Deprecated. Use Aggregators.LAST_N(int)
CombineFn.MaxBigInts Deprecated. Use Aggregators.MAX_BIGINTS()
CombineFn.MaxDoubles Deprecated. Use Aggregators.MAX_DOUBLES()
CombineFn.MaxFloats Deprecated. Use Aggregators.MAX_FLOATS()
CombineFn.MaxInts Deprecated. Use Aggregators.MAX_INTS()
CombineFn.MaxLongs Deprecated. Use Aggregators.MAX_LONGS()
CombineFn.MaxNAggregator<V extends Comparable<V>> Deprecated. Use Aggregators.MAX_N(int, Class)
CombineFn.MinBigInts Deprecated. Use Aggregators.MIN_BIGINTS()
CombineFn.MinDoubles Deprecated. Use Aggregators.MIN_DOUBLES()
CombineFn.MinFloats Deprecated. Use Aggregators.MIN_FLOATS()
CombineFn.MinInts Deprecated. Use Aggregators.MIN_INTS()
CombineFn.MinLongs Deprecated. Use Aggregators.MIN_LONGS()
CombineFn.MinNAggregator<V extends Comparable<V>> Deprecated. Use Aggregators.MIN_N(int, Class)
CombineFn.PairAggregator<V1,V2> Deprecated. Use Aggregators.pairAggregator(Aggregator, Aggregator)
CombineFn.QuadAggregator<A,B,C,D> Deprecated. Use Aggregators.quadAggregator(Aggregator, Aggregator, Aggregator, Aggregator)
CombineFn.SimpleAggregator<T> Deprecated. Use Aggregators.SimpleAggregator
CombineFn.StringConcatAggregator Deprecated. Use Aggregators.STRING_CONCAT(String, boolean, long, long)
CombineFn.SumBigInts Deprecated. Use Aggregators.SUM_BIGINTS()
CombineFn.SumDoubles Deprecated. Use Aggregators.SUM_DOUBLES()
CombineFn.SumFloats Deprecated. Use Aggregators.SUM_FLOATS()
CombineFn.SumInts Deprecated. Use Aggregators.SUM_INTS()
CombineFn.SumLongs Deprecated. Use Aggregators.SUM_LONGS()
CombineFn.TripAggregator<A,B,C> Deprecated. Use Aggregators.tripAggregator(Aggregator, Aggregator, Aggregator)
CombineFn.TupleNAggregator Deprecated. Use Aggregators.tupleAggregator(Aggregator...)
DoFn<S,T> Base class for all data processing functions in Crunch.
FilterFn<T> A DoFn for the common case of filtering the members of a PCollection based on a boolean condition.
FilterFn.AndFn<S> Deprecated. Use FilterFns.and(FilterFn...)
FilterFn.NotFn<S> Deprecated. Use FilterFns.not(FilterFn)
FilterFn.OrFn<S> Deprecated. Use FilterFns.or(FilterFn...)
GroupingOptions Options that can be passed to a groupByKey operation in order to exercise finer control over how the partitioning, grouping, and sorting of keys is performed.
GroupingOptions.Builder Builder class for creating GroupingOptions instances.
MapFn<S,T> A DoFn for the common case of emitting exactly one value for each input record.
Pair<K,V> A convenience class for two-element Tuples.
ParallelDoOptions Container class that includes optional information about a parallelDo operation applied to a PCollection.
ParallelDoOptions.Builder  
PipelineResult Container for the results of a call to run or done on the Pipeline interface that includes details and statistics about the component stages of the data pipeline.
PipelineResult.StageResult  
Tuple3<V1,V2,V3> A convenience class for three-element Tuples.
Tuple4<V1,V2,V3,V4> A convenience class for four-element Tuples.
TupleN A Tuple instance for an arbitrary number of values.
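
The summaries above describe DoFn as the base class for processing functions, MapFn as the case of exactly one output per input, and FilterFn as the case of keeping inputs that satisfy a boolean condition. The sketch below models that relationship with plain Java. Every type here (SketchEmitter, SketchDoFn, SketchMapFn, SketchFilterFn, DoFnSketch) is a hypothetical stand-in for illustration, and DoFnSketch.apply is only a local, single-threaded analogue of a parallelDo operation, not the Crunch implementation.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for Emitter<T>: the sink a function writes outputs to.
interface SketchEmitter<T> {
    void emit(T value);
}

// Hypothetical stand-in for DoFn<S,T>: zero or more outputs per input.
abstract class SketchDoFn<S, T> {
    abstract void process(S input, SketchEmitter<T> emitter);
}

// Hypothetical stand-in for MapFn<S,T>: exactly one output per input.
abstract class SketchMapFn<S, T> extends SketchDoFn<S, T> {
    abstract T map(S input);

    @Override
    final void process(S input, SketchEmitter<T> emitter) {
        emitter.emit(map(input));
    }
}

// Hypothetical stand-in for FilterFn<T>: pass the input through only if accepted.
abstract class SketchFilterFn<T> extends SketchDoFn<T, T> {
    abstract boolean accept(T input);

    @Override
    final void process(T input, SketchEmitter<T> emitter) {
        if (accept(input)) {
            emitter.emit(input);
        }
    }
}

final class DoFnSketch {
    // In-memory analogue of applying a function to every element of a collection.
    static <S, T> List<T> apply(List<S> input, SketchDoFn<S, T> fn) {
        List<T> out = new ArrayList<>();
        SketchEmitter<T> emitter = out::add;
        for (S s : input) {
            fn.process(s, emitter);
        }
        return out;
    }

    // Drops empty strings, then maps each remaining string to its length.
    static List<Integer> lengthsOfNonEmpty(List<String> words) {
        List<String> kept = apply(words, new SketchFilterFn<String>() {
            @Override
            boolean accept(String w) {
                return !w.isEmpty();
            }
        });
        return apply(kept, new SketchMapFn<String, Integer>() {
            @Override
            Integer map(String w) {
                return w.length();
            }
        });
    }
}
```

The point of the layering is that the specialized classes only narrow what process may do: the map variant always emits once, and the filter variant emits the unchanged input at most once.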
 

Enum Summary
PipelineExecution.Status  
Target.WriteMode An enum to represent different options the client may specify for handling the case where the output path, table, etc. already exists.
 

Exception Summary
CrunchRuntimeException A RuntimeException implementation that includes some additional options for the Crunch execution engine to track reporting status.
 

Package org.apache.crunch Description

Client-facing API and core abstractions.

See Also:
Introduction to Apache Crunch


Copyright © 2013 The Apache Software Foundation. All Rights Reserved.