Package org.apache.crunch

Client-facing API and core abstractions.


Interface Summary
CombineFn.Aggregator<T>  
CombineFn.AggregatorFactory<T> Interface for constructing new aggregator instances.
Emitter<T> Interface for writing outputs from a DoFn.
PCollection<S> A representation of an immutable, distributed collection of elements that is the fundamental target of computations in Crunch.
PGroupedTable<K,V> The Crunch representation of a grouped PTable.
Pipeline Manages the state of a pipeline execution.
PObject<T> A PObject represents a singleton object value that results from a distributed computation.
PTable<K,V> A sub-interface of PCollection that represents an immutable, distributed multi-map of keys and values.
Source<T> A Source represents a data set that is an input to one or more MapReduce jobs.
SourceTarget<T> An interface for classes that implement both the Source and the Target interfaces.
TableSource<K,V> The interface for Source implementations that return a PTable.
Target A Target represents the output destination of a Crunch job.
Tuple A fixed-size collection of Objects, used in Crunch for representing joins between PCollections.
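
The relationship between PTable (a multi-map of keys and values), PGroupedTable (a grouped PTable), and a per-key combine step can be pictured with ordinary in-memory collections. The sketch below is only an analogy in plain Java: GroupingSketch, groupByKey, sumValues, and wordCounts are hypothetical names written for this illustration, they are not Crunch types or methods, and nothing here is distributed.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// In-memory analogy only; none of these names are org.apache.crunch types.
class GroupingSketch {

  /** Groups a multi-map (a list of key/value pairs, duplicate keys allowed) by key. */
  static <K extends Comparable<K>, V> Map<K, List<V>> groupByKey(List<Map.Entry<K, V>> table) {
    Map<K, List<V>> grouped = new TreeMap<>();
    for (Map.Entry<K, V> e : table) {
      grouped.computeIfAbsent(e.getKey(), k -> new ArrayList<>()).add(e.getValue());
    }
    return grouped;
  }

  /** Collapses each key's values into a single value, here by summing them. */
  static <K extends Comparable<K>> Map<K, Long> sumValues(Map<K, List<Long>> grouped) {
    Map<K, Long> sums = new TreeMap<>();
    for (Map.Entry<K, List<Long>> e : grouped.entrySet()) {
      long total = 0L;
      for (long v : e.getValue()) {
        total += v;
      }
      sums.put(e.getKey(), total);
    }
    return sums;
  }

  /** Word count: pair every word with 1, group by word, then sum per word. */
  static Map<String, Long> wordCounts(List<String> words) {
    List<Map.Entry<String, Long>> table = new ArrayList<>();
    for (String w : words) {
      table.add(new SimpleEntry<>(w, 1L));
    }
    return sumValues(groupByKey(table));
  }

  public static void main(String[] args) {
    System.out.println(wordCounts(Arrays.asList("b", "a", "b"))); // prints {a=1, b=2}
  }
}
```

The grouped intermediate result plays the role that PGroupedTable plays in the summary above, and the summing step is the kind of work a CombineFn is described as doing (converting an Iterable of values into a single value).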
 

Class Summary
CombineFn<S,T> A special DoFn implementation that converts an Iterable of values into a single value.
CombineFn.AggregatorCombineFn<K,V> A CombineFn that delegates all of the actual work to an Aggregator instance.
CombineFn.FirstNAggregator<V>  
CombineFn.LastNAggregator<V>  
CombineFn.MaxBigInts  
CombineFn.MaxDoubles  
CombineFn.MaxFloats  
CombineFn.MaxInts  
CombineFn.MaxLongs  
CombineFn.MaxNAggregator<V extends Comparable<V>>  
CombineFn.MinBigInts  
CombineFn.MinDoubles  
CombineFn.MinFloats  
CombineFn.MinInts  
CombineFn.MinLongs  
CombineFn.MinNAggregator<V extends Comparable<V>>  
CombineFn.PairAggregator<V1,V2>  
CombineFn.QuadAggregator<A,B,C,D>  
CombineFn.SimpleAggregator<T> Base class for aggregators that do not require any initialization.
CombineFn.StringConcatAggregator  
CombineFn.SumBigInts  
CombineFn.SumDoubles  
CombineFn.SumFloats  
CombineFn.SumInts  
CombineFn.SumLongs  
CombineFn.TripAggregator<A,B,C>  
CombineFn.TupleNAggregator  
DoFn<S,T> Base class for all data processing functions in Crunch.
FilterFn<T> A DoFn for the common case of filtering the members of a PCollection based on a boolean condition.
FilterFn.AndFn<S>  
FilterFn.NotFn<S>  
FilterFn.OrFn<S>  
GroupingOptions Options that can be passed to a groupByKey operation in order to exercise finer control over how the partitioning, grouping, and sorting of keys are performed.
GroupingOptions.Builder Builder class for creating GroupingOptions instances.
MapFn<S,T> A DoFn for the common case of emitting exactly one value for each input record.
Pair<K,V> A convenience class for two-element Tuples.
PipelineResult Container for the results of a call to run or done on the Pipeline interface, including details and statistics about the component stages of the data pipeline.
PipelineResult.StageResult  
Tuple3<V1,V2,V3> A convenience class for three-element Tuples.
Tuple4<V1,V2,V3,V4> A convenience class for four-element Tuples.
TupleN A Tuple instance for an arbitrary number of values.
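
The way DoFn, MapFn, FilterFn, and Emitter relate, as described in the summaries above, can be sketched with simplified stand-in types. The nested types below only mimic those descriptions: they are not the org.apache.crunch classes, their method names (process, map, accept, emit) are assumptions made for this illustration, and apply and longWordLengths are hypothetical local helpers.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustrative stand-ins only; these are NOT the org.apache.crunch classes.
class DoFnSketch {

  /** Stand-in for Emitter<T>: receives the outputs written by a function. */
  interface Emitter<T> {
    void emit(T value);
  }

  /** Stand-in for DoFn<S,T>: may emit zero, one, or many outputs per input. */
  abstract static class DoFn<S, T> {
    abstract void process(S input, Emitter<T> emitter);
  }

  /** Stand-in for MapFn<S,T>: emits exactly one value for each input. */
  abstract static class MapFn<S, T> extends DoFn<S, T> {
    abstract T map(S input);

    @Override
    void process(S input, Emitter<T> emitter) {
      emitter.emit(map(input));
    }
  }

  /** Stand-in for FilterFn<T>: emits the input only when accept() returns true. */
  abstract static class FilterFn<T> extends DoFn<T, T> {
    abstract boolean accept(T input);

    @Override
    void process(T input, Emitter<T> emitter) {
      if (accept(input)) {
        emitter.emit(input);
      }
    }
  }

  /** Hypothetical local driver: runs fn over each element in memory. */
  static <S, T> List<T> apply(List<S> inputs, DoFn<S, T> fn) {
    List<T> out = new ArrayList<>();
    Emitter<T> emitter = out::add;
    for (S input : inputs) {
      fn.process(input, emitter);
    }
    return out;
  }

  /** Keeps words of at least minLength characters, then maps each to its length. */
  static List<Integer> longWordLengths(List<String> words, int minLength) {
    FilterFn<String> longEnough = new FilterFn<String>() {
      @Override
      boolean accept(String w) {
        return w.length() >= minLength;
      }
    };
    MapFn<String, Integer> toLength = new MapFn<String, Integer>() {
      @Override
      Integer map(String w) {
        return w.length();
      }
    };
    return apply(apply(words, longEnough), toLength);
  }

  public static void main(String[] args) {
    List<String> words = Arrays.asList("a", "crunch", "pipeline", "do");
    System.out.println(longWordLengths(words, 3)); // prints [6, 8]
  }
}
```

In this sketch both specializations are expressed through the general process method, which is one plausible reading of why the summaries call MapFn and FilterFn DoFns for "the common case".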
 


See Also:
Introduction to Apache Crunch


Copyright © 2012 The Apache Software Foundation. All Rights Reserved.