| Package | Description | 
|---|---|
| org.apache.crunch | Client-facing API and core abstractions. | 
| org.apache.crunch.contrib.io.jdbc | Support for reading data from an RDBMS using JDBC. | 
| org.apache.crunch.impl.dist | |
| org.apache.crunch.impl.dist.collect | |
| org.apache.crunch.impl.mem | In-memory Pipeline implementation for rapid prototyping and testing. | 
| org.apache.crunch.impl.spark.collect | |
| org.apache.crunch.io | Data input and output for Pipelines. | 
| org.apache.crunch.io.impl | |
| org.apache.crunch.kafka | |
| org.apache.crunch.util | An assorted set of utilities. | 
| Modifier and Type | Interface and Description | 
|---|---|
| interface | SourceTarget<T>: An interface for classes that implement both the Source and the Target interfaces. | 
| interface | TableSource<K,V>: The interface for Source implementations that return a PTable. | 
| interface | TableSourceTarget<K,V>: An interface for classes that implement both the TableSource and the Target interfaces. | 
| Modifier and Type | Method and Description | 
|---|---|
| Source<T> | Source.inputConf(String key, String value): Adds the given key-value pair to the Configuration instance that is used to read this Source<T>. | 
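
The fluent shape of `inputConf` (it returns the `Source` itself, so calls can be chained) can be illustrated with a minimal sketch. The types below are hypothetical stand-ins written only for this illustration, not the Crunch interfaces, and the property names are made up; a plain `Map` takes the place of the Hadoop `Configuration`.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class InputConfSketch {
    /** Hypothetical stand-in for a source; NOT org.apache.crunch.Source. */
    interface SketchSource<T> {
        // Records a key-value pair and returns the same source so calls can be chained.
        SketchSource<T> inputConf(String key, String value);

        // Exposes the collected settings (assumption: a Map stands in for Configuration).
        Map<String, String> conf();
    }

    /** Hypothetical text source that only stores its configuration. */
    static final class TextSketchSource implements SketchSource<String> {
        private final Map<String, String> conf = new LinkedHashMap<>();

        @Override
        public SketchSource<String> inputConf(String key, String value) {
            conf.put(key, value);
            return this;
        }

        @Override
        public Map<String, String> conf() {
            return conf;
        }
    }

    public static void main(String[] args) {
        // The property names are invented for the example.
        SketchSource<String> source = new TextSketchSource()
                .inputConf("example.split.size", "1048576")
                .inputConf("example.encoding", "UTF-8");
        // prints {example.split.size=1048576, example.encoding=UTF-8}
        System.out.println(source.conf());
    }
}
```

Returning the source from each call is what allows a source to be configured inline at the point where it is handed to a pipeline.
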
| Modifier and Type | Method and Description | 
|---|---|
| <T> PCollection<T> | Pipeline.read(Source<T> source): Converts the given Source into a PCollection that is available to jobs run using this Pipeline instance. | 
| <T> PCollection<T> | Pipeline.read(Source<T> source, String named): Converts the given Source into a PCollection that is available to jobs run using this Pipeline instance. | 
| ParallelDoOptions.Builder | ParallelDoOptions.Builder.sources(Source<?>... sources) | 
| Modifier and Type | Method and Description | 
|---|---|
| ParallelDoOptions.Builder | ParallelDoOptions.Builder.sources(Collection<Source<?>> sources) | 
| Modifier and Type | Class and Description | 
|---|---|
| class | DataBaseSource<T extends org.apache.hadoop.mapreduce.lib.db.DBWritable & org.apache.hadoop.io.Writable>: Source for reading from a database via a JDBC connection. | 
| Modifier and Type | Method and Description | 
|---|---|
| <S> PCollection<S> | DistributedPipeline.read(Source<S> source) | 
| <S> PCollection<S> | DistributedPipeline.read(Source<S> source, String named) | 
| Modifier and Type | Method and Description | 
|---|---|
| Source<S> | BaseInputCollection.getSource() | 
| Modifier and Type | Method and Description | 
|---|---|
| <S> BaseInputCollection<S> | PCollectionFactory.createInputCollection(Source<S> source, String named, DistributedPipeline distributedPipeline, ParallelDoOptions doOpts) | 
| Constructor and Description | 
|---|
| BaseInputCollection(Source<S> source, DistributedPipeline pipeline) | 
| BaseInputCollection(Source<S> source, String name, DistributedPipeline pipeline, ParallelDoOptions doOpts) | 
| Modifier and Type | Method and Description | 
|---|---|
| <T> PCollection<T> | MemPipeline.read(Source<T> source) | 
| <T> PCollection<T> | MemPipeline.read(Source<T> source, String named) | 
| Modifier and Type | Method and Description | 
|---|---|
| <S> BaseInputCollection<S> | SparkCollectFactory.createInputCollection(Source<S> source, String named, DistributedPipeline pipeline, ParallelDoOptions doOpts) | 
| Modifier and Type | Interface and Description | 
|---|---|
| interface | ReadableSource<T>: An extension of the Source interface that indicates that a Source instance may be read as a series of records by the client code. | 
| interface | ReadableSourceTarget<T>: An interface that indicates that a SourceTarget instance can be read into the local client. | 
| Modifier and Type | Method and Description | 
|---|---|
| static Source<org.apache.avro.generic.GenericData.Record> | From.avroFile(List<org.apache.hadoop.fs.Path> paths): Creates a Source<GenericData.Record> by reading the schema of the Avro file at the given paths. | 
| static <T extends org.apache.avro.specific.SpecificRecord> Source<T> | From.avroFile(List<org.apache.hadoop.fs.Path> paths, Class<T> avroClass): Creates a Source<T> instance from the Avro file(s) at the given Paths. | 
| static Source<org.apache.avro.generic.GenericData.Record> | From.avroFile(List<org.apache.hadoop.fs.Path> paths, org.apache.hadoop.conf.Configuration conf): Creates a Source<GenericData.Record> by reading the schema of the Avro file at the given paths using the FileSystem information contained in the given Configuration instance. | 
| static <T> Source<T> | From.avroFile(List<org.apache.hadoop.fs.Path> paths, PType<T> ptype): Creates a Source<T> instance from the Avro file(s) at the given Paths. | 
| static Source<org.apache.avro.generic.GenericData.Record> | From.avroFile(org.apache.hadoop.fs.Path path): Creates a Source<GenericData.Record> by reading the schema of the Avro file at the given path. | 
| static <T extends org.apache.avro.specific.SpecificRecord> Source<T> | From.avroFile(org.apache.hadoop.fs.Path path, Class<T> avroClass): Creates a Source<T> instance from the Avro file(s) at the given Path. | 
| static Source<org.apache.avro.generic.GenericData.Record> | From.avroFile(org.apache.hadoop.fs.Path path, org.apache.hadoop.conf.Configuration conf): Creates a Source<GenericData.Record> by reading the schema of the Avro file at the given path using the FileSystem information contained in the given Configuration instance. | 
| static <T> Source<T> | From.avroFile(org.apache.hadoop.fs.Path path, PType<T> ptype): Creates a Source<T> instance from the Avro file(s) at the given Path. | 
| static Source<org.apache.avro.generic.GenericData.Record> | From.avroFile(String pathName): Creates a Source<GenericData.Record> by reading the schema of the Avro file at the given path. | 
| static <T extends org.apache.avro.specific.SpecificRecord> Source<T> | From.avroFile(String pathName, Class<T> avroClass): Creates a Source<T> instance from the Avro file(s) at the given path name. | 
| static <T> Source<T> | From.avroFile(String pathName, PType<T> ptype): Creates a Source<T> instance from the Avro file(s) at the given path name. | 
| static <T extends org.apache.hadoop.io.Writable> Source<T> | From.sequenceFile(List<org.apache.hadoop.fs.Path> paths, Class<T> valueClass): Creates a Source<T> instance from the SequenceFile(s) at the given Paths from the value field of each key-value pair in the SequenceFile(s). | 
| static <T> Source<T> | From.sequenceFile(List<org.apache.hadoop.fs.Path> paths, PType<T> ptype): Creates a Source<T> instance from the SequenceFile(s) at the given Paths from the value field of each key-value pair in the SequenceFile(s). | 
| static <T extends org.apache.hadoop.io.Writable> Source<T> | From.sequenceFile(org.apache.hadoop.fs.Path path, Class<T> valueClass): Creates a Source<T> instance from the SequenceFile(s) at the given Path from the value field of each key-value pair in the SequenceFile(s). | 
| static <T> Source<T> | From.sequenceFile(org.apache.hadoop.fs.Path path, PType<T> ptype): Creates a Source<T> instance from the SequenceFile(s) at the given Path from the value field of each key-value pair in the SequenceFile(s). | 
| static <T extends org.apache.hadoop.io.Writable> Source<T> | From.sequenceFile(String pathName, Class<T> valueClass): Creates a Source<T> instance from the SequenceFile(s) at the given path name from the value field of each key-value pair in the SequenceFile(s). | 
| static <T> Source<T> | From.sequenceFile(String pathName, PType<T> ptype): Creates a Source<T> instance from the SequenceFile(s) at the given path name from the value field of each key-value pair in the SequenceFile(s). | 
| static Source<String> | From.textFile(List<org.apache.hadoop.fs.Path> paths): Creates a Source<String> instance for the text file(s) at the given Paths. | 
| static <T> Source<T> | From.textFile(List<org.apache.hadoop.fs.Path> paths, PType<T> ptype): Creates a Source<T> instance for the text file(s) at the given Paths using the provided PType<T> to convert the input text. | 
| static Source<String> | From.textFile(org.apache.hadoop.fs.Path path): Creates a Source<String> instance for the text file(s) at the given Path. | 
| static <T> Source<T> | From.textFile(org.apache.hadoop.fs.Path path, PType<T> ptype): Creates a Source<T> instance for the text file(s) at the given Path using the provided PType<T> to convert the input text. | 
| static Source<String> | From.textFile(String pathName): Creates a Source<String> instance for the text file(s) at the given path name. | 
| static <T> Source<T> | From.textFile(String pathName, PType<T> ptype): Creates a Source<T> instance for the text file(s) at the given path name using the provided PType<T> to convert the input text. | 
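
The idea behind the `textFile(..., PType<T> ptype)` overloads, reading lines of text and converting each one to a typed value, can be sketched with the standard library alone. This is an illustration under stated assumptions and not Crunch code: `readTextFile` and `demo` are hypothetical helpers, and a `java.util.function.Function<String, T>` stands in for the conversion role that a `PType<T>` plays for text input.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class TextFileSketch {
    /**
     * Hypothetical helper: reads every line of the file and applies the
     * converter, which stands in for the PType-driven conversion.
     */
    static <T> List<T> readTextFile(Path path, Function<String, T> converter) {
        try (Stream<String> lines = Files.lines(path)) {
            return lines.map(converter).collect(Collectors.toList());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Writes a small temporary file and reads it back as integers. */
    static List<Integer> demo() {
        try {
            Path tmp = Files.createTempFile("text-file-sketch", ".txt");
            try {
                Files.write(tmp, Arrays.asList("1", "2", "3"));
                return readTextFile(tmp, Integer::parseInt);
            } finally {
                Files.deleteIfExists(tmp);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // prints [1, 2, 3]
        System.out.println(demo());
    }
}
```

Unlike this eager sketch, a real `Source` only describes the input; the records are read when a pipeline that uses the source is run.
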
| Modifier and Type | Class and Description | 
|---|---|
| class  | org.apache.crunch.io.impl.FileSourceImpl<T> | 
| Modifier and Type | Class and Description | 
|---|---|
| class | KafkaSource: A Crunch Source that will retrieve events from Kafka given start and end offsets. | 
| Modifier and Type | Method and Description | 
|---|---|
| Source<Pair<org.apache.hadoop.io.BytesWritable,org.apache.hadoop.io.BytesWritable>> | KafkaSource.inputConf(String key, String value) | 
| Modifier and Type | Method and Description | 
|---|---|
| <T> PCollection<T> | CrunchTool.read(Source<T> source) | 
Copyright © 2017 The Apache Software Foundation. All rights reserved.