This project has been retired. For details, please refer to its Attic page.
Uses of Interface org.apache.crunch.PCollection (Apache Crunch 0.3.0-incubating API)

Uses of Interface
org.apache.crunch.PCollection

Packages that use PCollection
org.apache.crunch   
org.apache.crunch.examples   
org.apache.crunch.impl.mem   
org.apache.crunch.impl.mem.collect   
org.apache.crunch.impl.mr   
org.apache.crunch.impl.mr.collect   
org.apache.crunch.lib   
org.apache.crunch.tool   
 

Uses of PCollection in org.apache.crunch
 

Subinterfaces of PCollection in org.apache.crunch
 interface PGroupedTable<K,V>
          The Crunch representation of a grouped PTable.
 interface PTable<K,V>
          A sub-interface of PCollection that represents an immutable, distributed multi-map of keys and values.
 

Methods in org.apache.crunch that return PCollection
 PCollection<S> PCollection.filter(FilterFn<S> filterFn)
          Apply the given filter function to this instance and return the resulting PCollection.
 PCollection<S> PCollection.filter(String name, FilterFn<S> filterFn)
          Apply the given filter function to this instance and return the resulting PCollection.
 PCollection<K> PTable.keys()
          Returns a PCollection made up of the keys in this PTable.
 PCollection<S> PCollection.max()
          Returns a PCollection made up of only the maximum element of this instance.
 PCollection<S> PCollection.min()
          Returns a PCollection made up of only the minimum element of this instance.
<T> PCollection<T>
PCollection.parallelDo(DoFn<S,T> doFn, PType<T> type)
          Applies the given doFn to the elements of this PCollection and returns a new PCollection that is the output of this processing.
<T> PCollection<T>
PCollection.parallelDo(String name, DoFn<S,T> doFn, PType<T> type)
          Applies the given doFn to the elements of this PCollection and returns a new PCollection that is the output of this processing.
<T> PCollection<T>
Pipeline.read(Source<T> source)
          Converts the given Source into a PCollection that is available to jobs run using this Pipeline instance.
 PCollection<String> Pipeline.readTextFile(String pathName)
          A convenience method for reading a text file.
 PCollection<S> PCollection.sample(double acceptanceProbability)
          Randomly sample items from this PCollection instance with the given probability of an item being accepted.
 PCollection<S> PCollection.sample(double acceptanceProbability, long seed)
          Randomly sample items from this PCollection instance with the given probability of an item being accepted and using the given seed.
 PCollection<S> PCollection.sort(boolean ascending)
          Returns a PCollection instance that contains all of the elements of this instance in sorted order.
 PCollection<S> PCollection.union(PCollection<S>... collections)
          Returns a PCollection instance that acts as the union of this PCollection and the input PCollections.
 PCollection<V> PTable.values()
          Returns a PCollection made up of the values in this PTable.
 PCollection<S> PCollection.write(Target target)
          Write the contents of this PCollection to the given Target, using the storage format specified by the target.
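The two sample(...) entries above describe accepting each item with a given probability, optionally with a seed. The following is a minimal plain-Java sketch of that idea over an in-memory List. It does not use the Crunch API; the class SampleSketch and its sample helper are hypothetical names for illustration, and it assumes each item is accepted independently, which the listing does not confirm for Crunch's implementation.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

public class SampleSketch {
    /**
     * Keeps each item independently with the given acceptance probability.
     * Hypothetical helper: mirrors the described behaviour, not Crunch's code.
     */
    public static <S> List<S> sample(List<S> items, double acceptanceProbability, long seed) {
        Random random = new Random(seed); // a fixed seed makes the selection repeatable
        List<S> accepted = new ArrayList<>();
        for (S item : items) {
            // nextDouble() is uniform in [0.0, 1.0), so 1.0 accepts all and 0.0 accepts none
            if (random.nextDouble() < acceptanceProbability) {
                accepted.add(item);
            }
        }
        return accepted;
    }

    public static void main(String[] args) {
        List<Integer> data = new ArrayList<>();
        for (int i = 0; i < 1000; i++) {
            data.add(i);
        }
        List<Integer> first = sample(data, 0.1, 42L);
        List<Integer> second = sample(data, 0.1, 42L);
        System.out.println(first.equals(second)); // prints true: same seed, same sample
        System.out.println(first.size());         // roughly 10% of the input
    }
}
```

Passing the same seed twice yields the same subset, which is the practical reason to prefer the seeded overload in tests.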
 

Methods in org.apache.crunch with parameters of type PCollection
<T> Iterable<T>
Pipeline.materialize(PCollection<T> pcollection)
          Materialize the given PCollection and read the data it contains into the returned Iterable for client use.
 PCollection<S> PCollection.union(PCollection<S>... collections)
          Returns a PCollection instance that acts as the union of this PCollection and the input PCollections.
 void Pipeline.write(PCollection<?> collection, Target target)
          Write the given collection to the given target on the next pipeline run.
<T> void
Pipeline.writeTextFile(PCollection<T> collection, String pathName)
          A convenience method for writing a text file.
 

Uses of PCollection in org.apache.crunch.examples
 

Methods in org.apache.crunch.examples that return PCollection
 PCollection<org.apache.hadoop.hbase.client.Put> WordAggregationHBase.createPut(PTable<String,String> extractedText)
          Creates Put objects for insertion into HBase.
 

Uses of PCollection in org.apache.crunch.impl.mem
 

Methods in org.apache.crunch.impl.mem that return PCollection
static
<T> PCollection<T>
MemPipeline.collectionOf(Iterable<T> collect)
           
static
<T> PCollection<T>
MemPipeline.collectionOf(T... ts)
           
<T> PCollection<T>
MemPipeline.read(Source<T> source)
           
 PCollection<String> MemPipeline.readTextFile(String pathName)
           
static
<T> PCollection<T>
MemPipeline.typedCollectionOf(PType<T> ptype, Iterable<T> collect)
           
static
<T> PCollection<T>
MemPipeline.typedCollectionOf(PType<T> ptype, T... ts)
           
 

Methods in org.apache.crunch.impl.mem with parameters of type PCollection
<T> Iterable<T>
MemPipeline.materialize(PCollection<T> pcollection)
           
 void MemPipeline.write(PCollection<?> collection, Target target)
           
<T> void
MemPipeline.writeTextFile(PCollection<T> collection, String pathName)
           
 

Uses of PCollection in org.apache.crunch.impl.mem.collect
 

Classes in org.apache.crunch.impl.mem.collect that implement PCollection
 class MemCollection<S>
           
 class MemTable<K,V>
           
 

Methods in org.apache.crunch.impl.mem.collect that return PCollection
 PCollection<S> MemCollection.filter(FilterFn<S> filterFn)
           
 PCollection<S> MemCollection.filter(String name, FilterFn<S> filterFn)
           
 PCollection<K> MemTable.keys()
           
 PCollection<S> MemCollection.max()
           
 PCollection<S> MemCollection.min()
           
<T> PCollection<T>
MemCollection.parallelDo(DoFn<S,T> doFn, PType<T> type)
           
<T> PCollection<T>
MemCollection.parallelDo(String name, DoFn<S,T> doFn, PType<T> type)
           
 PCollection<S> MemCollection.sample(double acceptanceProbability)
           
 PCollection<S> MemCollection.sample(double acceptanceProbability, long seed)
           
 PCollection<S> MemCollection.sort(boolean ascending)
           
 PCollection<S> MemCollection.union(PCollection<S>... collections)
           
 PCollection<V> MemTable.values()
           
 PCollection<S> MemCollection.write(Target target)
           
 

Methods in org.apache.crunch.impl.mem.collect with parameters of type PCollection
 PCollection<S> MemCollection.union(PCollection<S>... collections)
           
 

Uses of PCollection in org.apache.crunch.impl.mr
 

Methods in org.apache.crunch.impl.mr that return PCollection
<S> PCollection<S>
MRPipeline.read(Source<S> source)
           
 PCollection<String> MRPipeline.readTextFile(String pathName)
           
 

Methods in org.apache.crunch.impl.mr with parameters of type PCollection
<T> ReadableSourceTarget<T>
MRPipeline.getMaterializeSourceTarget(PCollection<T> pcollection)
          Retrieve a ReadableSourceTarget that provides access to the contents of a PCollection.
<T> Iterable<T>
MRPipeline.materialize(PCollection<T> pcollection)
           
 void MRPipeline.write(PCollection<?> pcollection, Target target)
           
<T> void
MRPipeline.writeTextFile(PCollection<T> pcollection, String pathName)
           
 

Uses of PCollection in org.apache.crunch.impl.mr.collect
 

Classes in org.apache.crunch.impl.mr.collect that implement PCollection
 class DoCollectionImpl<S>
           
 class DoTableImpl<K,V>
           
 class InputCollection<S>
           
 class InputTable<K,V>
           
 class PCollectionImpl<S>
           
 class PGroupedTableImpl<K,V>
           
 class PTableBase<K,V>
           
 class UnionCollection<S>
           
 class UnionTable<K,V>
           
 

Methods in org.apache.crunch.impl.mr.collect that return PCollection
 PCollection<S> PCollectionImpl.filter(FilterFn<S> filterFn)
           
 PCollection<S> PCollectionImpl.filter(String name, FilterFn<S> filterFn)
           
 PCollection<K> PTableBase.keys()
           
 PCollection<S> PCollectionImpl.max()
           
 PCollection<S> PCollectionImpl.min()
           
<T> PCollection<T>
PCollectionImpl.parallelDo(DoFn<S,T> fn, PType<T> type)
           
<T> PCollection<T>
PCollectionImpl.parallelDo(String name, DoFn<S,T> fn, PType<T> type)
           
 PCollection<S> PCollectionImpl.sample(double acceptanceProbability)
           
 PCollection<S> PCollectionImpl.sample(double acceptanceProbability, long seed)
           
 PCollection<S> PCollectionImpl.sort(boolean ascending)
           
 PCollection<S> PCollectionImpl.union(PCollection<S>... collections)
           
 PCollection<V> PTableBase.values()
           
 PCollection<S> PCollectionImpl.write(Target target)
           
 

Methods in org.apache.crunch.impl.mr.collect with parameters of type PCollection
 PCollection<S> PCollectionImpl.union(PCollection<S>... collections)
           
 

Uses of PCollection in org.apache.crunch.lib
 

Methods in org.apache.crunch.lib that return PCollection
static
<T> PCollection<Tuple3<T,T,T>>
Set.comm(PCollection<T> coll1, PCollection<T> coll2)
          Find the elements that are common to two sets, like the Unix comm utility.
static
<U,V> PCollection<Pair<U,V>>
Cartesian.cross(PCollection<U> left, PCollection<V> right)
          Performs a full cross join on the specified PCollections (using the same strategy as Pig's CROSS operator).
static
<U,V> PCollection<Pair<U,V>>
Cartesian.cross(PCollection<U> left, PCollection<V> right, int parallelism)
          Performs a full cross join on the specified PCollections (using the same strategy as Pig's CROSS operator).
static
<T> PCollection<T>
Set.difference(PCollection<T> coll1, PCollection<T> coll2)
          Compute the set difference between two sets of elements.
static
<T> PCollection<T>
Set.intersection(PCollection<T> coll1, PCollection<T> coll2)
          Compute the intersection of two sets of elements.
static
<K,V> PCollection<K>
PTables.keys(PTable<K,V> ptable)
           
static
<S> PCollection<S>
Aggregate.max(PCollection<S> collect)
          Returns the largest numerical element from the input collection.
static
<S> PCollection<S>
Aggregate.min(PCollection<S> collect)
          Returns the smallest numerical element from the input collection.
static
<S> PCollection<S>
Sample.sample(PCollection<S> input, double probability)
           
static
<S> PCollection<S>
Sample.sample(PCollection<S> input, long seed, double probability)
           
static
<T> PCollection<T>
Sort.sort(PCollection<T> collection)
          Sorts the PCollection using the natural ordering of its elements.
static
<T> PCollection<T>
Sort.sort(PCollection<T> collection, Sort.Order order)
          Sorts the PCollection using the natural ordering of its elements in the order specified.
static
<U,V> PCollection<Pair<U,V>>
Sort.sortPairs(PCollection<Pair<U,V>> collection, Sort.ColumnOrder... columnOrders)
          Sorts the PCollection of Pairs using the specified column ordering.
static
<V1,V2,V3,V4>
PCollection<Tuple4<V1,V2,V3,V4>>
Sort.sortQuads(PCollection<Tuple4<V1,V2,V3,V4>> collection, Sort.ColumnOrder... columnOrders)
          Sorts the PCollection of Tuple4s using the specified column ordering.
static
<V1,V2,V3> PCollection<Tuple3<V1,V2,V3>>
Sort.sortTriples(PCollection<Tuple3<V1,V2,V3>> collection, Sort.ColumnOrder... columnOrders)
          Sorts the PCollection of Tuple3s using the specified column ordering.
static PCollection<TupleN> Sort.sortTuples(PCollection<TupleN> collection, Sort.ColumnOrder... columnOrders)
          Sorts the PCollection of TupleNs using the specified column ordering.
static
<K,V> PCollection<V>
PTables.values(PTable<K,V> ptable)
           
 

Methods in org.apache.crunch.lib with parameters of type PCollection
static
<T> PCollection<Tuple3<T,T,T>>
Set.comm(PCollection<T> coll1, PCollection<T> coll2)
          Find the elements that are common to two sets, like the Unix comm utility.
static
<S> PTable<S,Long>
Aggregate.count(PCollection<S> collect)
          Returns a PTable that contains the unique elements of this collection mapped to a count of their occurrences.
static
<U,V> PCollection<Pair<U,V>>
Cartesian.cross(PCollection<U> left, PCollection<V> right)
          Performs a full cross join on the specified PCollections (using the same strategy as Pig's CROSS operator).
static
<U,V> PCollection<Pair<U,V>>
Cartesian.cross(PCollection<U> left, PCollection<V> right, int parallelism)
          Performs a full cross join on the specified PCollections (using the same strategy as Pig's CROSS operator).
static
<T> PCollection<T>
Set.difference(PCollection<T> coll1, PCollection<T> coll2)
          Compute the set difference between two sets of elements.
static
<T> PCollection<T>
Set.intersection(PCollection<T> coll1, PCollection<T> coll2)
          Compute the intersection of two sets of elements.
static
<S> PCollection<S>
Aggregate.max(PCollection<S> collect)
          Returns the largest numerical element from the input collection.
static
<S> PCollection<S>
Aggregate.min(PCollection<S> collect)
          Returns the smallest numerical element from the input collection.
static
<S> PCollection<S>
Sample.sample(PCollection<S> input, double probability)
           
static
<S> PCollection<S>
Sample.sample(PCollection<S> input, long seed, double probability)
           
static
<T> PCollection<T>
Sort.sort(PCollection<T> collection)
          Sorts the PCollection using the natural ordering of its elements.
static
<T> PCollection<T>
Sort.sort(PCollection<T> collection, Sort.Order order)
          Sorts the PCollection using the natural ordering of its elements in the order specified.
static
<U,V> PCollection<Pair<U,V>>
Sort.sortPairs(PCollection<Pair<U,V>> collection, Sort.ColumnOrder... columnOrders)
          Sorts the PCollection of Pairs using the specified column ordering.
static
<V1,V2,V3,V4>
PCollection<Tuple4<V1,V2,V3,V4>>
Sort.sortQuads(PCollection<Tuple4<V1,V2,V3,V4>> collection, Sort.ColumnOrder... columnOrders)
          Sorts the PCollection of Tuple4s using the specified column ordering.
static
<V1,V2,V3> PCollection<Tuple3<V1,V2,V3>>
Sort.sortTriples(PCollection<Tuple3<V1,V2,V3>> collection, Sort.ColumnOrder... columnOrders)
          Sorts the PCollection of Tuple3s using the specified column ordering.
static PCollection<TupleN> Sort.sortTuples(PCollection<TupleN> collection, Sort.ColumnOrder... columnOrders)
          Sorts the PCollection of TupleNs using the specified column ordering.
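Aggregate.count in the table above is described as mapping each unique element to its number of occurrences. A minimal plain-Java sketch of that result shape follows, with a Map<S, Long> standing in for PTable<S,Long>. It does not use Crunch; CountSketch and its count helper are hypothetical names for illustration only.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class CountSketch {
    /** Maps each distinct element to how many times it occurs in the input. */
    public static <S> Map<S, Long> count(Iterable<S> collect) {
        Map<S, Long> counts = new LinkedHashMap<>(); // first-seen order, for readable output
        for (S item : collect) {
            counts.merge(item, 1L, Long::sum); // start at 1, then add 1 per repeat
        }
        return counts;
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("a", "b", "a", "c", "a");
        System.out.println(count(words)); // prints {a=3, b=1, c=1}
    }
}
```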
 

Uses of PCollection in org.apache.crunch.tool
 

Methods in org.apache.crunch.tool that return PCollection
<T> PCollection<T>
CrunchTool.read(Source<T> source)
           
 PCollection<String> CrunchTool.readTextFile(String pathName)
           
 

Methods in org.apache.crunch.tool with parameters of type PCollection
 void CrunchTool.write(PCollection<?> pcollection, Target target)
           
 void CrunchTool.writeTextFile(PCollection<?> pcollection, String pathName)
           
 



Copyright © 2012 The Apache Software Foundation. All Rights Reserved.