| Package | Description |
|---|---|
| org.apache.crunch | Client-facing API and core abstractions. |
| org.apache.crunch.impl.dist | |
| org.apache.crunch.impl.dist.collect | |
| org.apache.crunch.impl.mem | In-memory Pipeline implementation for rapid prototyping and testing. |
| org.apache.crunch.impl.spark | |
| org.apache.crunch.io | Data input and output for Pipelines. |
| org.apache.crunch.util | An assorted set of utilities. |
| Modifier and Type | Interface and Description |
|---|---|
| interface | SourceTarget<T>: An interface for classes that implement both the Source and the Target interfaces. |
| interface | TableSourceTarget<K,V>: An interface for classes that implement both the TableSource and the Target interfaces. |
| Modifier and Type | Method and Description |
|---|---|
| Target | Target.outputConf(String key, String value): Adds the given key-value pair to the Configuration instance that is used to write this Target. |
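
Because outputConf returns a Target, calls can be chained to set several properties on one target. The following is a minimal sketch of that fluent pattern in plain Java. SketchTarget is a hypothetical stand-in written for this illustration, not the Crunch interface, and the property names are example values only.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class OutputConfSketch {

    /** Hypothetical stand-in for a Target; NOT the Crunch interface. */
    static final class SketchTarget {
        private final String path;
        // Per-target settings, kept in insertion order.
        private final Map<String, String> conf = new LinkedHashMap<>();

        SketchTarget(String path) {
            this.path = path;
        }

        /** Records the key-value pair and returns this target so calls can be chained. */
        SketchTarget outputConf(String key, String value) {
            conf.put(key, value);
            return this;
        }

        String path() {
            return path;
        }

        Map<String, String> conf() {
            return conf;
        }
    }

    public static void main(String[] args) {
        // Example keys only; any String pair can be recorded this way.
        SketchTarget target = new SketchTarget("/tmp/out")
            .outputConf("mapred.output.compress", "true")
            .outputConf("example.custom.key", "42");
        System.out.println(target.path() + " -> " + target.conf());
    }
}
```

In this model a later call with the same key replaces the earlier value, which is how a plain key-value configuration usually behaves.
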
| Modifier and Type | Method and Description |
|---|---|
| Map<String,Target> | PipelineCallable.getAllTargets(): Returns the mapping of labels to Target dependencies for this instance. |
| Set<Target> | ParallelDoOptions.getTargets() |
| Modifier and Type | Method and Description |
|---|---|
| PipelineCallable<Output> | PipelineCallable.dependsOn(String label, Target t): Requires that the given Target exists before this instance may be executed. |
| ParallelDoOptions.Builder | ParallelDoOptions.Builder.targets(Target... targets) |
| void | Pipeline.write(PCollection<?> collection, Target target): Write the given collection to the given target on the next pipeline run. |
| void | Pipeline.write(PCollection<?> collection, Target target, Target.WriteMode writeMode): Write the contents of the PCollection to the given Target, using the storage format specified by the target and the given WriteMode for cases where the referenced Target already exists. |
| PTable<K,V> | PTable.write(Target target): Writes this PTable to the given Target. |
| PCollection<S> | PCollection.write(Target target): Write the contents of this PCollection to the given Target, using the storage format specified by the target. |
| PTable<K,V> | PTable.write(Target target, Target.WriteMode writeMode): Writes this PTable to the given Target, using the given Target.WriteMode to handle existing targets. |
| PCollection<S> | PCollection.write(Target target, Target.WriteMode writeMode): Write the contents of this PCollection to the given Target, using the given Target.WriteMode to handle existing targets. |
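
The write overloads that take a Target.WriteMode differ only in what happens when the target already exists. The sketch below models that decision against an in-memory map instead of a real file system. The Mode names mirror three of the Target.WriteMode values (CHECKPOINT is left out), but the class, its methods, and the exact behaviour are assumptions made for illustration, not Crunch code.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class WriteModeSketch {

    /** Simplified model of a write mode; names mirror Target.WriteMode, CHECKPOINT omitted. */
    enum Mode { DEFAULT, OVERWRITE, APPEND }

    /** In-memory stand-in for a file system: path -> records stored there. */
    private final Map<String, List<String>> store = new HashMap<>();

    /** Writes records to path, consulting mode only when the path already exists. */
    void write(List<String> records, String path, Mode mode) {
        List<String> existing = store.get(path);
        if (existing == null) {
            // Nothing there yet, so every mode behaves the same.
            store.put(path, new ArrayList<>(records));
            return;
        }
        switch (mode) {
            case DEFAULT:
                // Assumed behaviour: refuse to touch existing output.
                throw new IllegalStateException("Target already exists: " + path);
            case OVERWRITE:
                store.put(path, new ArrayList<>(records));
                break;
            case APPEND:
                existing.addAll(records);
                break;
        }
    }

    List<String> read(String path) {
        return store.getOrDefault(path, Collections.<String>emptyList());
    }

    public static void main(String[] args) {
        WriteModeSketch fs = new WriteModeSketch();
        fs.write(Collections.singletonList("first"), "/out", Mode.DEFAULT);
        fs.write(Collections.singletonList("second"), "/out", Mode.APPEND);
        System.out.println(fs.read("/out"));
        fs.write(Collections.singletonList("fresh"), "/out", Mode.OVERWRITE);
        System.out.println(fs.read("/out"));
    }
}
```

Passing the mode explicitly at each write call keeps the choice visible at the call site, which matches the shape of the two-argument and three-argument write methods listed above.
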
| Modifier and Type | Method and Description |
|---|---|
| ParallelDoOptions.Builder | ParallelDoOptions.Builder.targets(Collection<Target> targets) |
| Modifier and Type | Method and Description |
|---|---|
| void | DistributedPipeline.write(PCollection<?> pcollection, Target target) |
| void | DistributedPipeline.write(PCollection<?> pcollection, Target target, Target.WriteMode writeMode) |
| Modifier and Type | Method and Description |
|---|---|
| Set<Target> | PCollectionImpl.getTargetDependencies() |
| Set<Target> | BaseGroupedTable.getTargetDependencies() |
| Modifier and Type | Method and Description |
|---|---|
| PTable<K,V> | PTableBase.write(Target target) |
| PCollection<S> | PCollectionImpl.write(Target target) |
| PTable<K,V> | PTableBase.write(Target target, Target.WriteMode writeMode) |
| PCollection<S> | PCollectionImpl.write(Target target, Target.WriteMode writeMode) |
| Modifier and Type | Method and Description |
|---|---|
| void | MemPipeline.write(PCollection<?> collection, Target target) |
| void | MemPipeline.write(PCollection<?> collection, Target target, Target.WriteMode writeMode) |
| Constructor and Description |
|---|
| SparkRuntime(SparkPipeline pipeline, org.apache.spark.api.java.JavaSparkContext sparkContext, org.apache.hadoop.conf.Configuration conf, Map<PCollectionImpl<?>,Set<Target>> outputTargets, Map<PCollectionImpl<?>,org.apache.crunch.materialize.MaterializableIterable> toMaterialize, Map<PCollection<?>,org.apache.spark.storage.StorageLevel> toCache, Map<PipelineCallable<?>,Set<Target>> allPipelineCallables) |
| Modifier and Type | Interface and Description |
|---|---|
| interface | MapReduceTarget |
| interface | PathTarget: A target whose output goes to a given path on a file system. |
| interface | ReadableSourceTarget<T>: An interface that indicates that a SourceTarget instance can be read into the local client. |
| Modifier and Type | Method and Description |
|---|---|
| static <T extends Target> T | Compress.compress(T target, Class<? extends org.apache.hadoop.io.compress.CompressionCodec> codecClass): Configure the given output target to be compressed using the given codec. |
| static <T extends Target> T | Compress.gzip(T target): Configure the given output target to be compressed using Gzip. |
| static <T extends Target> T | Compress.snappy(T target): Configure the given output target to be compressed using Snappy. |
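
The Compress helpers are generic in the target type, so a caller passes in a specific kind of target and gets the same kind back with compression settings applied. Below is a minimal plain-Java sketch of that pattern. SketchTarget and SketchPathTarget are hypothetical stand-ins, the codec is passed as a class name string rather than a Class object to avoid a Hadoop dependency, and the two property names are the classic Hadoop output-compression keys, assumed here rather than taken from this page.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class CompressSketch {

    /** Hypothetical stand-in for the Target interface; NOT the Crunch type. */
    interface SketchTarget {
        SketchTarget outputConf(String key, String value);
    }

    /** Hypothetical path-based target that remembers the settings applied to it. */
    static final class SketchPathTarget implements SketchTarget {
        final String path;
        final Map<String, String> conf = new LinkedHashMap<>();

        SketchPathTarget(String path) {
            this.path = path;
        }

        @Override
        public SketchPathTarget outputConf(String key, String value) {
            conf.put(key, value);
            return this;
        }
    }

    /** Applies compression settings and returns the target with its original static type. */
    static <T extends SketchTarget> T compress(T target, String codecClassName) {
        // Property names are assumptions (classic Hadoop keys).
        target.outputConf("mapred.output.compress", "true");
        target.outputConf("mapred.output.compression.codec", codecClassName);
        return target;
    }

    /** Convenience wrapper that fixes the codec to Hadoop's Gzip codec class name. */
    static <T extends SketchTarget> T gzip(T target) {
        return compress(target, "org.apache.hadoop.io.compress.GzipCodec");
    }

    public static void main(String[] args) {
        // No cast is needed: the helper returns the same type it was given.
        SketchPathTarget target = gzip(new SketchPathTarget("/tmp/out"));
        System.out.println(target.path + " -> " + target.conf);
    }
}
```

Returning T instead of the base type lets the result be used wherever the more specific target type is required, for example when a method only accepts path-based targets.
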
| Modifier and Type | Method and Description |
|---|---|
| static Target | To.avroFile(org.apache.hadoop.fs.Path path): Creates a Target at the given Path that writes data to Avro files. |
| static Target | To.avroFile(String pathName): Creates a Target at the given path name that writes data to Avro files. |
| static <K extends org.apache.hadoop.io.Writable,V extends org.apache.hadoop.io.Writable> Target | To.formattedFile(org.apache.hadoop.fs.Path path, Class<? extends org.apache.hadoop.mapreduce.lib.output.FileOutputFormat<K,V>> formatClass): Creates a Target at the given Path that writes data to a custom FileOutputFormat. |
| static <K extends org.apache.hadoop.io.Writable,V extends org.apache.hadoop.io.Writable> Target | To.formattedFile(String pathName, Class<? extends org.apache.hadoop.mapreduce.lib.output.FileOutputFormat<K,V>> formatClass): Creates a Target at the given path name that writes data to a custom FileOutputFormat. |
| static Target | To.sequenceFile(org.apache.hadoop.fs.Path path): Creates a Target at the given Path that writes data to SequenceFiles. |
| static Target | To.sequenceFile(String pathName): Creates a Target at the given path name that writes data to SequenceFiles. |
| static Target | To.textFile(org.apache.hadoop.fs.Path path): Creates a Target at the given Path that writes data to text files. |
| static Target | To.textFile(String pathName): Creates a Target at the given path name that writes data to text files. |
| Modifier and Type | Method and Description |
|---|---|
| boolean | OutputHandler.configure(Target target, PType<?> ptype) |
| Modifier and Type | Method and Description |
|---|---|
| void | CrunchTool.write(PCollection<?> pcollection, Target target) |
Copyright © 2015 The Apache Software Foundation. All Rights Reserved.