This project has retired. For details, please refer to its Attic page.
AvroParquetFileSource (Apache Crunch 0.9.0 API)
org.apache.crunch.io.parquet
Class AvroParquetFileSource<T extends org.apache.avro.generic.IndexedRecord>
java.lang.Object
  org.apache.crunch.io.impl.FileSourceImpl<T>
    org.apache.crunch.io.parquet.AvroParquetFileSource<T>
All Implemented Interfaces: ReadableSource<T>, Source<T>
public class AvroParquetFileSource<T extends org.apache.avro.generic.IndexedRecord> extends FileSourceImpl<T> implements ReadableSource<T>
Nested Class Summary
static class AvroParquetFileSource.Builder<T extends org.apache.avro.generic.IndexedRecord>
Helper class for constructing an AvroParquetFileSource that only reads a subset of the fields defined in an Avro schema.
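To make "only reads a subset of the fields" concrete, here is a minimal, library-free sketch of field projection. ProjectionSketch, includeField, projectedFields, and project are hypothetical names chosen for illustration; they are not the Crunch Builder API, whose methods are not listed on this page. The fallback to all fields when nothing is selected is likewise an assumption of the sketch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in, NOT part of Crunch: models a builder that collects the
// names of the fields to keep and then projects full records down to them.
final class ProjectionSketch {
    private final List<String> allFields;                 // fields of the full "schema"
    private final List<String> included = new ArrayList<>();

    ProjectionSketch(List<String> allFields) {
        this.allFields = new ArrayList<>(allFields);
    }

    // Registers one field to keep; unknown names are rejected early.
    ProjectionSketch includeField(String name) {
        if (!allFields.contains(name)) {
            throw new IllegalArgumentException("Unknown field: " + name);
        }
        if (!included.contains(name)) {
            included.add(name);
        }
        return this;
    }

    // The projected "schema": the selected fields, or every field if none
    // were selected (an assumption of this sketch).
    List<String> projectedFields() {
        return included.isEmpty() ? new ArrayList<>(allFields) : new ArrayList<>(included);
    }

    // Copies only the projected fields out of a full record.
    Map<String, Object> project(Map<String, Object> record) {
        Map<String, Object> out = new LinkedHashMap<>();
        for (String field : projectedFields()) {
            out.put(field, record.get(field));
        }
        return out;
    }

    public static void main(String[] args) {
        ProjectionSketch projection =
            new ProjectionSketch(Arrays.asList("id", "name", "email"))
                .includeField("id")
                .includeField("email");

        Map<String, Object> record = new LinkedHashMap<>();
        record.put("id", 7);
        record.put("name", "Ada");
        record.put("email", "ada@example.org");

        System.out.println(projection.project(record));
    }
}
```

With a columnar format such as Parquet, the practical benefit of a projection is that columns left out of the projected schema do not have to be read from disk at all; the sketch above only models the selection logic, not the I/O saving.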
Constructor Summary
AvroParquetFileSource(List<org.apache.hadoop.fs.Path> paths, AvroType<T> ptype)
AvroParquetFileSource(List<org.apache.hadoop.fs.Path> paths, AvroType<T> ptype, Class<? extends parquet.filter.UnboundRecordFilter> filterClass)
AvroParquetFileSource(List<org.apache.hadoop.fs.Path> paths, AvroType<T> ptype, org.apache.avro.Schema schema)
AvroParquetFileSource(List<org.apache.hadoop.fs.Path> paths, AvroType<T> ptype, org.apache.avro.Schema schema, Class<? extends parquet.filter.UnboundRecordFilter> filterClass)
AvroParquetFileSource(org.apache.hadoop.fs.Path path, AvroType<T> ptype)
AvroParquetFileSource(org.apache.hadoop.fs.Path path, AvroType<T> ptype, org.apache.avro.Schema schema)
Methods inherited from class org.apache.crunch.io.impl.FileSourceImpl
configureSource, equals, getBundle, getLastModifiedAt, getPath, getPaths, getSize, getType, hashCode, inputConf, pathsAsString, read
AvroParquetFileSource
public AvroParquetFileSource(org.apache.hadoop.fs.Path path, AvroType<T> ptype)
AvroParquetFileSource
public AvroParquetFileSource(org.apache.hadoop.fs.Path path, AvroType<T> ptype, org.apache.avro.Schema schema)
AvroParquetFileSource
public AvroParquetFileSource(List<org.apache.hadoop.fs.Path> paths, AvroType<T> ptype)
AvroParquetFileSource
public AvroParquetFileSource(List<org.apache.hadoop.fs.Path> paths, AvroType<T> ptype, org.apache.avro.Schema schema)
AvroParquetFileSource
public AvroParquetFileSource(List<org.apache.hadoop.fs.Path> paths, AvroType<T> ptype, Class<? extends parquet.filter.UnboundRecordFilter> filterClass)
AvroParquetFileSource
public AvroParquetFileSource(List<org.apache.hadoop.fs.Path> paths, AvroType<T> ptype, org.apache.avro.Schema schema, Class<? extends parquet.filter.UnboundRecordFilter> filterClass)
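Two of the constructors above accept a filter as a Class rather than as an instance. A common reason for that shape, and an assumption here since this page does not say so, is that the filter is recorded by class name and instantiated reflectively where the data is read, which requires a public no-argument constructor. The sketch below illustrates that pattern with plain JDK types only; FilterClassSketch, EvenOnly, and readFiltered are hypothetical names and do not belong to Crunch or Parquet.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical stand-in, NOT part of Crunch or Parquet.
final class FilterClassSketch {

    // A filter that can be created reflectively: public class, public no-arg constructor.
    public static final class EvenOnly implements Predicate<Integer> {
        public EvenOnly() {}

        @Override
        public boolean test(Integer value) {
            return value % 2 == 0;
        }
    }

    // Instantiates the filter from its Class, then keeps only the matching records.
    static <T> List<T> readFiltered(List<T> records, Class<? extends Predicate<T>> filterClass) {
        Predicate<T> filter;
        try {
            filter = filterClass.getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Cannot instantiate " + filterClass.getName(), e);
        }
        List<T> kept = new ArrayList<>();
        for (T record : records) {
            if (filter.test(record)) {
                kept.add(record);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        System.out.println(readFiltered(Arrays.asList(1, 2, 3, 4), EvenOnly.class));
    }
}
```

A consequence of this design is that the filter cannot carry per-instance state set by the caller; any parameters it needs would have to be fixed in the class itself or supplied through some other channel.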
getProjectedSchema
public org.apache.avro.Schema getProjectedSchema()
read
public Iterable<T> read(org.apache.hadoop.conf.Configuration conf) throws IOException
Description copied from interface: ReadableSource
Returns an Iterable that contains the contents of this source.
Specified by: read in interface ReadableSource<T extends org.apache.avro.generic.IndexedRecord>
Parameters: conf - The current Configuration instance
Returns: the contents of this Source as an Iterable instance
Throws: IOException
asReadable
public ReadableData<T> asReadable()
Specified by: asReadable in interface ReadableSource<T extends org.apache.avro.generic.IndexedRecord>
Returns: a ReadableData instance containing the data referenced by this ReadableSource.
getFileReaderFactory
protected org.apache.crunch.io.parquet.AvroParquetFileReaderFactory<T> getFileReaderFactory(AvroType<T> ptype)
getConverter
public Converter<?,?,?,?> getConverter()
Description copied from interface: Source
Returns the Converter used for mapping the inputs from this instance into PCollection or PTable values.
Specified by: getConverter in interface Source<T extends org.apache.avro.generic.IndexedRecord>
Overrides: getConverter in class FileSourceImpl<T extends org.apache.avro.generic.IndexedRecord>
toString
public String toString()
Overrides: toString in class FileSourceImpl<T extends org.apache.avro.generic.IndexedRecord>
builder
public static <T extends org.apache.avro.specific.SpecificRecord> AvroParquetFileSource.Builder<T> builder(Class<T> clazz)
builder
public static AvroParquetFileSource.Builder<org.apache.avro.generic.GenericRecord> builder(org.apache.avro.Schema schema)
Copyright © 2014 The Apache Software Foundation. All Rights Reserved.