Package | Description |
---|---|
org.apache.crunch | Client-facing API and core abstractions. |
org.apache.crunch.contrib.io.jdbc | Support for reading data from RDBMS using JDBC |
org.apache.crunch.impl.dist | |
org.apache.crunch.impl.dist.collect | |
org.apache.crunch.impl.mem | In-memory Pipeline implementation for rapid prototyping and testing. |
org.apache.crunch.impl.spark.collect | |
org.apache.crunch.io | Data input and output for Pipelines. |
org.apache.crunch.io.impl | |
org.apache.crunch.util | An assorted set of utilities. |

Modifier and Type | Interface and Description |
---|---|
interface | SourceTarget<T>: An interface for classes that implement both the Source and the Target interfaces. |
interface | TableSource<K,V>: The interface for Source implementations that return a PTable. |
interface | TableSourceTarget<K,V>: An interface for classes that implement both the TableSource and the Target interfaces. |

Modifier and Type | Method and Description |
---|---|
Source<T> | Source.inputConf(String key, String value): Adds the given key-value pair to the Configuration instance that is used to read this Source<T>. |
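
Because inputConf returns a Source<T>, calls can be chained in a fluent style. The sketch below illustrates only that calling pattern. It does not use Crunch or Hadoop: SketchSource, SketchTextSource, confEntries, and the example keys are hypothetical stand-ins invented for this illustration, and a plain map takes the place of the Hadoop Configuration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class InputConfSketch {
  // Hypothetical stand-in for Source<T>; this is NOT the real Crunch interface.
  interface SketchSource<T> {
    // Same shape as Source.inputConf: returns a source so calls can chain.
    SketchSource<T> inputConf(String key, String value);

    // Hypothetical accessor, added only so the example can show what was recorded.
    Map<String, String> confEntries();
  }

  // Hypothetical implementation that stores the pairs in insertion order.
  static class SketchTextSource implements SketchSource<String> {
    private final Map<String, String> conf = new LinkedHashMap<>();

    @Override
    public SketchSource<String> inputConf(String key, String value) {
      conf.put(key, value);
      return this;
    }

    @Override
    public Map<String, String> confEntries() {
      return conf;
    }
  }

  // Chains two inputConf calls and returns the recorded pairs.
  static Map<String, String> demo() {
    SketchSource<String> source = new SketchTextSource()
        .inputConf("example.key.one", "a")
        .inputConf("example.key.two", "b");
    return source.confEntries();
  }

  public static void main(String[] args) {
    System.out.println(demo());
  }
}
```

Whether a given Crunch Source returns itself from inputConf is an implementation detail not stated in the table above; the sketch assumes it does.
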

Modifier and Type | Method and Description |
---|---|
<T> PCollection<T> | Pipeline.read(Source<T> source): Converts the given Source into a PCollection that is available to jobs run using this Pipeline instance. |
<T> PCollection<T> | Pipeline.read(Source<T> source, String named): Converts the given Source into a PCollection that is available to jobs run using this Pipeline instance. |
ParallelDoOptions.Builder | ParallelDoOptions.Builder.sources(Source<?>... sources) |

Modifier and Type | Method and Description |
---|---|
ParallelDoOptions.Builder | ParallelDoOptions.Builder.sources(Collection<Source<?>> sources) |

Modifier and Type | Class and Description |
---|---|
class | DataBaseSource<T extends org.apache.hadoop.mapreduce.lib.db.DBWritable & org.apache.hadoop.io.Writable>: Source for reading from a database via a JDBC connection. |
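
The type parameter of DataBaseSource carries two bounds joined by `&`, which in Java means the record type must implement both interfaces. The sketch below shows what such a bound permits, using hypothetical stand-ins (SketchDbRow, SketchWritable, UserRow, SketchDbSource) rather than the real Hadoop and Crunch types, so it runs without those libraries.

```java
import java.nio.charset.StandardCharsets;

class BoundedTypeSketch {
  // Hypothetical stand-ins for DBWritable and Writable; NOT the Hadoop interfaces.
  interface SketchDbRow { String columnNames(); }
  interface SketchWritable { byte[] toBytes(); }

  // A record type that satisfies both bounds.
  static class UserRow implements SketchDbRow, SketchWritable {
    private final String name;

    UserRow(String name) { this.name = name; }

    @Override public String columnNames() { return "name"; }
    @Override public byte[] toBytes() { return name.getBytes(StandardCharsets.UTF_8); }
  }

  // Same shape of bound as DataBaseSource<T extends DBWritable & Writable>.
  static class SketchDbSource<T extends SketchDbRow & SketchWritable> {
    // Methods from both interfaces are callable on T without a cast.
    String summarize(T row) {
      return row.columnNames() + ":" + row.toBytes().length;
    }
  }

  static String demo() {
    // SketchDbSource<String> would not compile: String implements neither interface.
    return new SketchDbSource<UserRow>().summarize(new UserRow("ada"));
  }

  public static void main(String[] args) {
    System.out.println(demo());
  }
}
```
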

Modifier and Type | Method and Description |
---|---|
<S> PCollection<S> | DistributedPipeline.read(Source<S> source) |
<S> PCollection<S> | DistributedPipeline.read(Source<S> source, String named) |

Modifier and Type | Method and Description |
---|---|
Source<S> | BaseInputCollection.getSource() |

Modifier and Type | Method and Description |
---|---|
<S> BaseInputCollection<S> | PCollectionFactory.createInputCollection(Source<S> source, String named, DistributedPipeline distributedPipeline, ParallelDoOptions doOpts) |

Constructor and Description |
---|
BaseInputCollection(Source<S> source, DistributedPipeline pipeline) |
BaseInputCollection(Source<S> source, String name, DistributedPipeline pipeline, ParallelDoOptions doOpts) |

Modifier and Type | Method and Description |
---|---|
<T> PCollection<T> | MemPipeline.read(Source<T> source) |
<T> PCollection<T> | MemPipeline.read(Source<T> source, String named) |

Modifier and Type | Method and Description |
---|---|
<S> BaseInputCollection<S> | SparkCollectFactory.createInputCollection(Source<S> source, String named, DistributedPipeline pipeline, ParallelDoOptions doOpts) |

Modifier and Type | Interface and Description |
---|---|
interface | ReadableSource<T>: An extension of the Source interface that indicates that a Source instance may be read as a series of records by the client code. |
interface | ReadableSourceTarget<T>: An interface that indicates that a SourceTarget instance can be read into the local client. |

Modifier and Type | Method and Description |
---|---|
static Source<org.apache.avro.generic.GenericData.Record> | From.avroFile(List<org.apache.hadoop.fs.Path> paths): Creates a Source<GenericData.Record> by reading the schema of the Avro file at the given paths. |
static <T extends org.apache.avro.specific.SpecificRecord> Source<T> | From.avroFile(List<org.apache.hadoop.fs.Path> paths, Class<T> avroClass): Creates a Source<T> instance from the Avro file(s) at the given Paths. |
static Source<org.apache.avro.generic.GenericData.Record> | From.avroFile(List<org.apache.hadoop.fs.Path> paths, org.apache.hadoop.conf.Configuration conf): Creates a Source<GenericData.Record> by reading the schema of the Avro file at the given paths using the FileSystem information contained in the given Configuration instance. |
static <T> Source<T> | From.avroFile(List<org.apache.hadoop.fs.Path> paths, PType<T> ptype): Creates a Source<T> instance from the Avro file(s) at the given Paths. |
static Source<org.apache.avro.generic.GenericData.Record> | From.avroFile(org.apache.hadoop.fs.Path path): Creates a Source<GenericData.Record> by reading the schema of the Avro file at the given path. |
static <T extends org.apache.avro.specific.SpecificRecord> Source<T> | From.avroFile(org.apache.hadoop.fs.Path path, Class<T> avroClass): Creates a Source<T> instance from the Avro file(s) at the given Path. |
static Source<org.apache.avro.generic.GenericData.Record> | From.avroFile(org.apache.hadoop.fs.Path path, org.apache.hadoop.conf.Configuration conf): Creates a Source<GenericData.Record> by reading the schema of the Avro file at the given path using the FileSystem information contained in the given Configuration instance. |
static <T> Source<T> | From.avroFile(org.apache.hadoop.fs.Path path, PType<T> ptype): Creates a Source<T> instance from the Avro file(s) at the given Path. |
static Source<org.apache.avro.generic.GenericData.Record> | From.avroFile(String pathName): Creates a Source<GenericData.Record> by reading the schema of the Avro file at the given path. |
static <T extends org.apache.avro.specific.SpecificRecord> Source<T> | From.avroFile(String pathName, Class<T> avroClass): Creates a Source<T> instance from the Avro file(s) at the given path name. |
static <T> Source<T> | From.avroFile(String pathName, PType<T> ptype): Creates a Source<T> instance from the Avro file(s) at the given path name. |
static <T extends org.apache.hadoop.io.Writable> Source<T> | From.sequenceFile(List<org.apache.hadoop.fs.Path> paths, Class<T> valueClass): Creates a Source<T> instance from the value field of each key-value pair in the SequenceFile(s) at the given Paths. |
static <T> Source<T> | From.sequenceFile(List<org.apache.hadoop.fs.Path> paths, PType<T> ptype): Creates a Source<T> instance from the value field of each key-value pair in the SequenceFile(s) at the given Paths. |
static <T extends org.apache.hadoop.io.Writable> Source<T> | From.sequenceFile(org.apache.hadoop.fs.Path path, Class<T> valueClass): Creates a Source<T> instance from the value field of each key-value pair in the SequenceFile(s) at the given Path. |
static <T> Source<T> | From.sequenceFile(org.apache.hadoop.fs.Path path, PType<T> ptype): Creates a Source<T> instance from the value field of each key-value pair in the SequenceFile(s) at the given Path. |
static <T extends org.apache.hadoop.io.Writable> Source<T> | From.sequenceFile(String pathName, Class<T> valueClass): Creates a Source<T> instance from the value field of each key-value pair in the SequenceFile(s) at the given path name. |
static <T> Source<T> | From.sequenceFile(String pathName, PType<T> ptype): Creates a Source<T> instance from the value field of each key-value pair in the SequenceFile(s) at the given path name. |
static Source<String> | From.textFile(List<org.apache.hadoop.fs.Path> paths): Creates a Source<String> instance for the text file(s) at the given Paths. |
static <T> Source<T> | From.textFile(List<org.apache.hadoop.fs.Path> paths, PType<T> ptype): Creates a Source<T> instance for the text file(s) at the given Paths using the provided PType<T> to convert the input text. |
static Source<String> | From.textFile(org.apache.hadoop.fs.Path path): Creates a Source<String> instance for the text file(s) at the given Path. |
static <T> Source<T> | From.textFile(org.apache.hadoop.fs.Path path, PType<T> ptype): Creates a Source<T> instance for the text file(s) at the given Path using the provided PType<T> to convert the input text. |
static Source<String> | From.textFile(String pathName): Creates a Source<String> instance for the text file(s) at the given path name. |
static <T> Source<T> | From.textFile(String pathName, PType<T> ptype): Creates a Source<T> instance for the text file(s) at the given path name using the provided PType<T> to convert the input text. |
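
The textFile overloads come in pairs: one yields each line as a String, the other uses a PType<T> to convert the input text into a T. As a rough illustration of that pairing only, the plain-Java sketch below reads a file and maps each line through a converter. It makes several assumptions that are not part of Crunch: TextConversionSketch and readText are hypothetical names, a java.util.function.Function stands in for PType<T>, Path here is java.nio.file.Path rather than org.apache.hadoop.fs.Path, and the file is read eagerly on the local machine instead of being handed to a Pipeline.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

class TextConversionSketch {
  // Stand-in for the String-only overloads: each line is returned unchanged.
  static List<String> readText(Path path) throws IOException {
    return Files.readAllLines(path);
  }

  // Stand-in for the PType<T> overloads: 'convert' plays the role of the PType.
  static <T> List<T> readText(Path path, Function<String, T> convert) throws IOException {
    List<T> records = new ArrayList<>();
    for (String line : readText(path)) {
      records.add(convert.apply(line));
    }
    return records;
  }

  // Writes three numeric lines to a temporary file and reads them back as integers.
  static List<Integer> demo() {
    try {
      Path file = Files.createTempFile("text-conversion-sketch", ".txt");
      try {
        Files.write(file, Arrays.asList("1", "2", "3"));
        return readText(file, Integer::parseInt);
      } finally {
        Files.deleteIfExists(file);
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    System.out.println(demo());
  }
}
```
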

Modifier and Type | Class and Description |
---|---|
class | org.apache.crunch.io.impl.FileSourceImpl<T> |

Modifier and Type | Method and Description |
---|---|
<T> PCollection<T> | CrunchTool.read(Source<T> source) |

Copyright © 2016 The Apache Software Foundation. All rights reserved.