Package | Description |
---|---|
org.apache.crunch | Client-facing API and core abstractions. |
org.apache.crunch.impl.dist | |
org.apache.crunch.impl.dist.collect | |
org.apache.crunch.impl.mem | In-memory Pipeline implementation for rapid prototyping and testing. |
org.apache.crunch.impl.spark | |

Modifier and Type | Method and Description |
---|---|
PipelineCallable<Output> | PipelineCallable.dependsOn(String label, PCollection<?> pcollect) Requires that the given PCollection be materialized to disk before this instance may be executed. |
PipelineCallable<Output> | PipelineCallable.dependsOn(String label, Target t) Requires that the given Target exists before this instance may be executed. |
PipelineCallable<Output> | PipelineCallable.named(String name) Use the given name to identify this instance in the logs. |
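
Each of the three methods above returns PipelineCallable<Output>, which suggests they are meant to be chained when configuring an instance. The sketch below only illustrates that chaining style. SketchCallable is a hypothetical stand-in, not the Crunch class: its fields, its String-valued dependencies (the real methods take a Target or a PCollection), and the describe() helper are assumptions made for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in, NOT org.apache.crunch.PipelineCallable. It mimics
// only the chaining style of named(...) and dependsOn(...).
class SketchCallable<Output> {
    private String name = "unnamed";

    // label -> dependency. A plain String stands in for a Target or
    // PCollection here (assumption for the sake of a self-contained example).
    private final Map<String, String> dependencies = new LinkedHashMap<>();

    SketchCallable<Output> named(String name) {
        this.name = name;
        return this; // returning this is what makes chaining possible
    }

    SketchCallable<Output> dependsOn(String label, String dependency) {
        dependencies.put(label, dependency);
        return this;
    }

    // Illustrative helper: summarizes the configured name and labels.
    String describe() {
        return name + " depends on " + dependencies.keySet();
    }
}

class ChainingDemo {
    public static void main(String[] args) {
        SketchCallable<Void> callable = new SketchCallable<Void>()
            .named("report")
            .dependsOn("counts", "/tmp/counts")
            .dependsOn("users", "/tmp/users");
        System.out.println(callable.describe());
    }
}
```

The labels act as keys for the registered dependencies, so in this sketch each label should be unique per instance; whether the Crunch class enforces that is not stated in the table above.
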
Modifier and Type | Method and Description |
---|---|
<Output> Output | Pipeline.sequentialDo(PipelineCallable<Output> pipelineCallable) Executes the given PipelineCallable on the client after the Targets that the PipelineCallable depends on (if any) have been created by other pipeline processing steps. |
<Output> Output | PCollection.sequentialDo(String label, PipelineCallable<Output> pipelineCallable) Adds the materialized data in this PCollection as a dependency to the given PipelineCallable and registers it with the Pipeline associated with this instance. |
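
The ordering guarantee described for sequentialDo (client-side work runs only once every Target it depends on has been created) can be sketched with a small model. Everything here is hypothetical and independent of Crunch: OrderingSketch, markCreated, and the use of plain strings as targets are assumptions for illustration, and the sketch says nothing about how Crunch schedules this internally.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Iterator;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in, NOT a Crunch Pipeline: targets are plain strings and
// "creating" a target is just a call to markCreated(...).
class OrderingSketch {
    private static final class Pending {
        final Set<String> required;
        final Runnable body;

        Pending(Set<String> required, Runnable body) {
            this.required = required;
            this.body = body;
        }
    }

    private final Set<String> created = new HashSet<>();
    private final List<Pending> pending = new ArrayList<>();

    // Registers client-side work that must wait for the given targets.
    void sequentialDo(Set<String> required, Runnable body) {
        pending.add(new Pending(new HashSet<>(required), body));
        runReady();
    }

    // Called when some other processing step has produced a target.
    void markCreated(String target) {
        created.add(target);
        runReady();
    }

    // Runs, in registration order, every pending body whose targets all exist.
    private void runReady() {
        List<Pending> ready = new ArrayList<>();
        Iterator<Pending> it = pending.iterator();
        while (it.hasNext()) {
            Pending p = it.next();
            if (created.containsAll(p.required)) {
                it.remove();
                ready.add(p);
            }
        }
        for (Pending p : ready) {
            p.body.run();
        }
    }
}

class OrderingDemo {
    static List<String> run() {
        List<String> log = new ArrayList<>();
        OrderingSketch pipeline = new OrderingSketch();
        pipeline.sequentialDo(
            new HashSet<>(Arrays.asList("counts", "users")),
            () -> log.add("callable ran"));
        log.add("created counts");
        pipeline.markCreated("counts");
        log.add("created users");
        pipeline.markCreated("users");
        return log;
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

In this model a callable with no required targets runs as soon as it is registered, which matches the "(if any)" wording above only loosely; the real method also returns an Output value, which the sketch omits.
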
Modifier and Type | Method and Description |
---|---|
<Output> Output | DistributedPipeline.sequentialDo(PipelineCallable<Output> pipelineCallable) |
Modifier and Type | Method and Description |
---|---|
<Output> Output | PCollectionImpl.sequentialDo(String label, PipelineCallable<Output> pipelineCallable) |
Modifier and Type | Method and Description |
---|---|
<Output> Output | MemPipeline.sequentialDo(PipelineCallable<Output> callable) |
Constructor and Description |
---|
SparkRuntime(SparkPipeline pipeline, org.apache.spark.api.java.JavaSparkContext sparkContext, org.apache.hadoop.conf.Configuration conf, Map<PCollectionImpl<?>,Set<Target>> outputTargets, Map<PCollectionImpl<?>,org.apache.crunch.materialize.MaterializableIterable> toMaterialize, Map<PCollection<?>,org.apache.spark.storage.StorageLevel> toCache, Map<PipelineCallable<?>,Set<Target>> allPipelineCallables) |

Copyright © 2016 The Apache Software Foundation. All rights reserved.