Package | Description |
---|---|
org.apache.crunch | Client-facing API and core abstractions. |
org.apache.crunch.impl.dist | |
org.apache.crunch.impl.dist.collect | |
org.apache.crunch.impl.mem | In-memory Pipeline implementation for rapid prototyping and testing. |
org.apache.crunch.impl.spark | |
org.apache.crunch.io | Data input and output for Pipelines. |
org.apache.crunch.lambda | Alternative Crunch API using Java 8 features to allow construction of pipelines using lambda functions and method references. |
org.apache.crunch.util | An assorted set of utilities. |
Modifier and Type | Interface and Description |
---|---|
interface |
SourceTarget<T>
An interface for classes that implement both the
Source and the
Target interfaces. |
interface |
TableSourceTarget<K,V>
An interface for classes that implement both the
TableSource and the
Target interfaces. |
Modifier and Type | Method and Description |
---|---|
Target |
Target.outputConf(String key,
String value)
Adds the given key-value pair to the
Configuration instance that is used to write
this Target. |
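
As a minimal sketch of how `outputConf` might be used: the method returns a `Target`, so the call can be chained onto the expression that creates the target. The class name, helper name, and the key and value passed in are hypothetical and not part of the listing above. Whether a given key has any effect depends on the output format behind the target.

```java
import org.apache.crunch.Target;
import org.apache.crunch.io.To;

public class OutputConfExample {
    // Hypothetical helper: builds a text-file Target and attaches one
    // key-value pair to the Configuration used when writing that Target.
    public static Target textTargetWithConf(String pathName, String key, String value) {
        return To.textFile(pathName).outputConf(key, value);
    }
}
```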
Modifier and Type | Method and Description |
---|---|
Map<String,Target> |
PipelineCallable.getAllTargets()
Returns the mapping of labels to Target dependencies for this instance.
|
Set<Target> |
ParallelDoOptions.getTargets() |
Modifier and Type | Method and Description |
---|---|
PipelineCallable<Output> |
PipelineCallable.dependsOn(String label,
Target t)
Requires that the given
Target exists before this instance may be
executed. |
ParallelDoOptions.Builder |
ParallelDoOptions.Builder.targets(Target... targets) |
void |
Pipeline.write(PCollection<?> collection,
Target target)
Write the given collection to the given target on the next pipeline run.
|
void |
Pipeline.write(PCollection<?> collection,
Target target,
Target.WriteMode writeMode)
Write the contents of the
PCollection to the given Target,
using the storage format specified by the target and the given
WriteMode for cases where the referenced Target
already exists. |
PTable<K,V> |
PTable.write(Target target)
Writes this
PTable to the given Target. |
PCollection<S> |
PCollection.write(Target target)
Write the contents of this
PCollection to the given Target,
using the storage format specified by the target. |
PTable<K,V> |
PTable.write(Target target,
Target.WriteMode writeMode)
Writes this
PTable to the given Target, using the
given Target.WriteMode to handle existing targets. |
PCollection<S> |
PCollection.write(Target target,
Target.WriteMode writeMode)
Write the contents of this
PCollection to the given Target,
using the given Target.WriteMode to handle existing
targets. |
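
The two-argument `write` overloads listed above can be sketched as follows. This is an illustrative helper, not code from the Crunch sources: the class and method names are hypothetical, and `OVERWRITE` is just one of the `Target.WriteMode` constants a caller might choose. Per the `Pipeline.write` description, the write is carried out on the next pipeline run.

```java
import org.apache.crunch.PCollection;
import org.apache.crunch.Target;
import org.apache.crunch.io.To;

public class WriteModeExample {
    // Hypothetical helper: writes the collection as text files under
    // outputDir and asks Crunch to replace any output that already exists.
    public static PCollection<String> writeOverwriting(PCollection<String> lines, String outputDir) {
        Target target = To.textFile(outputDir);
        // write(...) returns a PCollection<S>, so further calls can be chained.
        return lines.write(target, Target.WriteMode.OVERWRITE);
    }
}
```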
Modifier and Type | Method and Description |
---|---|
ParallelDoOptions.Builder |
ParallelDoOptions.Builder.targets(Collection<Target> targets) |
Modifier and Type | Method and Description |
---|---|
void |
DistributedPipeline.write(PCollection<?> pcollection,
Target target) |
void |
DistributedPipeline.write(PCollection<?> pcollection,
Target target,
Target.WriteMode writeMode) |
Modifier and Type | Method and Description |
---|---|
Set<Target> |
PCollectionImpl.getTargetDependencies() |
Set<Target> |
BaseGroupedTable.getTargetDependencies() |
Modifier and Type | Method and Description |
---|---|
PTable<K,V> |
PTableBase.write(Target target) |
PCollection<S> |
PCollectionImpl.write(Target target) |
PTable<K,V> |
PTableBase.write(Target target,
Target.WriteMode writeMode) |
PCollection<S> |
PCollectionImpl.write(Target target,
Target.WriteMode writeMode) |
Modifier and Type | Method and Description |
---|---|
void |
MemPipeline.write(PCollection<?> collection,
Target target) |
void |
MemPipeline.write(PCollection<?> collection,
Target target,
Target.WriteMode writeMode) |
Constructor and Description |
---|
SparkRuntime(SparkPipeline pipeline,
org.apache.spark.api.java.JavaSparkContext sparkContext,
org.apache.hadoop.conf.Configuration conf,
Map<PCollectionImpl<?>,Set<Target>> outputTargets,
Map<PCollectionImpl<?>,org.apache.crunch.materialize.MaterializableIterable> toMaterialize,
Map<PCollection<?>,org.apache.spark.storage.StorageLevel> toCache,
Map<PipelineCallable<?>,Set<Target>> allPipelineCallables) |
Modifier and Type | Interface and Description |
---|---|
interface |
MapReduceTarget |
interface |
PathTarget
A target whose output goes to a given path on a file system.
|
interface |
ReadableSourceTarget<T>
An interface that indicates that a
SourceTarget instance can be read
into the local client. |
Modifier and Type | Method and Description |
---|---|
static <T extends Target> T |
Compress.compress(T target,
Class<? extends org.apache.hadoop.io.compress.CompressionCodec> codecClass)
Configure the given output target to be compressed using the given codec.
|
static <T extends Target> T |
Compress.gzip(T target)
Configure the given output target to be compressed using Gzip.
|
static <T extends Target> T |
Compress.snappy(T target)
Configure the given output target to be compressed using Snappy.
|
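
A short sketch of how the `Compress` helpers might be combined with a target factory. The class and method names are hypothetical, and `BZip2Codec` is only one example of a Hadoop `CompressionCodec` that could be passed to `Compress.compress`. Each helper takes a target and returns one of the same type, so the result can be passed straight to a `write` call.

```java
import org.apache.crunch.Target;
import org.apache.crunch.io.Compress;
import org.apache.crunch.io.To;
import org.apache.hadoop.io.compress.BZip2Codec;

public class CompressedTargets {
    // Hypothetical helper: a text-file target configured for Gzip output.
    public static Target gzippedText(String pathName) {
        return Compress.gzip(To.textFile(pathName));
    }

    // Hypothetical helper: the same target type with an explicitly chosen codec.
    public static Target bzip2Text(String pathName) {
        return Compress.compress(To.textFile(pathName), BZip2Codec.class);
    }
}
```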
Modifier and Type | Method and Description |
---|---|
static Target |
To.avroFile(org.apache.hadoop.fs.Path path)
Creates a
Target at the given Path that writes data to
Avro files. |
static Target |
To.avroFile(String pathName)
Creates a
Target at the given path name that writes data to
Avro files. |
static <K extends org.apache.hadoop.io.Writable,V extends org.apache.hadoop.io.Writable> Target |
To.formattedFile(org.apache.hadoop.fs.Path path,
Class<? extends org.apache.hadoop.mapreduce.lib.output.FileOutputFormat<K,V>> formatClass)
Creates a
Target at the given Path that writes data to
a custom FileOutputFormat. |
static <K extends org.apache.hadoop.io.Writable,V extends org.apache.hadoop.io.Writable> Target |
To.formattedFile(String pathName,
Class<? extends org.apache.hadoop.mapreduce.lib.output.FileOutputFormat<K,V>> formatClass)
Creates a
Target at the given path name that writes data to
a custom FileOutputFormat. |
static Target |
To.sequenceFile(org.apache.hadoop.fs.Path path)
Creates a
Target at the given Path that writes data to
SequenceFiles. |
static Target |
To.sequenceFile(String pathName)
Creates a
Target at the given path name that writes data to
SequenceFiles. |
static Target |
To.textFile(org.apache.hadoop.fs.Path path)
Creates a
Target at the given Path that writes data to
text files. |
static Target |
To.textFile(String pathName)
Creates a
Target at the given path name that writes data to
text files. |
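
The `To` factory methods above all return a `Target`, so a caller can pick the storage format at runtime. The sketch below assumes a hypothetical `TargetFactory` class and hypothetical format labels ("avro", "sequence", "text"); only the `To` calls themselves come from the listing.

```java
import org.apache.crunch.Target;
import org.apache.crunch.io.To;

public class TargetFactory {
    // Hypothetical helper: maps a format label to the matching To factory call.
    public static Target forFormat(String format, String pathName) {
        switch (format) {
            case "avro":
                return To.avroFile(pathName);
            case "sequence":
                return To.sequenceFile(pathName);
            case "text":
                return To.textFile(pathName);
            default:
                throw new IllegalArgumentException("Unknown format: " + format);
        }
    }
}
```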
Modifier and Type | Method and Description |
---|---|
boolean |
OutputHandler.configure(Target target,
PType<?> ptype) |
Modifier and Type | Method and Description |
---|---|
default LTable<K,V> |
LTable.write(Target target)
Write this table to the
Target supplied. |
default LCollection<S> |
LCollection.write(Target target)
Write this collection to the specified
Target. |
default LTable<K,V> |
LTable.write(Target target,
Target.WriteMode writeMode)
Write this table to the
Target supplied, using the given Target.WriteMode to handle existing targets. |
default LCollection<S> |
LCollection.write(Target target,
Target.WriteMode writeMode)
Write this collection to the specified
Target with the given Target.WriteMode. |
Modifier and Type | Method and Description |
---|---|
void |
CrunchTool.write(PCollection<?> pcollection,
Target target) |
Copyright © 2016 The Apache Software Foundation. All rights reserved.