This project has retired. For details please refer to its Attic page.
AvroPathPerKeyOutputFormat (Apache Crunch 0.10.0 API)

org.apache.crunch.types.avro
Class AvroPathPerKeyOutputFormat<T>

java.lang.Object
  extended by org.apache.hadoop.mapreduce.OutputFormat<K,V>
      extended by org.apache.hadoop.mapreduce.lib.output.FileOutputFormat<org.apache.avro.mapred.AvroWrapper<org.apache.avro.mapred.Pair<org.apache.avro.util.Utf8,T>>,org.apache.hadoop.io.NullWritable>
          extended by org.apache.crunch.types.avro.AvroPathPerKeyOutputFormat<T>

public class AvroPathPerKeyOutputFormat<T>
extends org.apache.hadoop.mapreduce.lib.output.FileOutputFormat<org.apache.avro.mapred.AvroWrapper<org.apache.avro.mapred.Pair<org.apache.avro.util.Utf8,T>>,org.apache.hadoop.io.NullWritable>

A FileOutputFormat that takes a Utf8 key and an Avro record and writes each record to a subdirectory of the output path whose name is the string form of the key. This OutputFormat keeps only one RecordWriter open at a time, so the records for a given key should be written together within each partition; otherwise files are opened and closed frequently.
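The cost of interleaving keys can be illustrated with a small stand-alone model. This is a sketch only, not the Crunch implementation: the class SingleWriterSketch, its methods, and the "/out" path are hypothetical names introduced for illustration, and the model assumes a new writer is opened whenever the key differs from the previous record's key.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical model, not the Crunch implementation: it only mimics the
// documented "one RecordWriter open at a time" behaviour.
class SingleWriterSketch {

    // Counts how many times a writer would be opened, assuming a new one is
    // opened whenever the key differs from the previous record's key.
    static int countOpens(List<String> keys) {
        int opens = 0;
        String current = null;
        for (String key : keys) {
            if (!key.equals(current)) {
                opens++;       // previous writer (if any) is closed, a new one opened
                current = key;
            }
        }
        return opens;
    }

    // Assumed layout: <outputPath>/<key>. File names inside the
    // subdirectory are deliberately left out of this sketch.
    static String subDirectoryFor(String outputPath, String key) {
        return outputPath + "/" + key;
    }

    public static void main(String[] args) {
        List<String> interleaved = Arrays.asList("a", "b", "a", "b", "a");
        List<String> grouped = new ArrayList<>(interleaved);
        Collections.sort(grouped); // records for the same key become adjacent

        System.out.println("interleaved opens: " + countOpens(interleaved));
        System.out.println("grouped opens: " + countOpens(grouped));
        System.out.println("subdirectory: " + subDirectoryFor("/out", "a"));
    }
}
```

In this model, five interleaved records over two keys open a writer five times, while the same records grouped by key open one only twice, which is why grouping or sorting by key before writing is recommended.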


Nested Class Summary
 
Nested classes/interfaces inherited from class org.apache.hadoop.mapreduce.lib.output.FileOutputFormat
org.apache.hadoop.mapreduce.lib.output.FileOutputFormat.Counter
 
Constructor Summary
AvroPathPerKeyOutputFormat()
           
 
Method Summary
 org.apache.hadoop.mapreduce.RecordWriter<org.apache.avro.mapred.AvroWrapper<org.apache.avro.mapred.Pair<org.apache.avro.util.Utf8,T>>,org.apache.hadoop.io.NullWritable> getRecordWriter(org.apache.hadoop.mapreduce.TaskAttemptContext taskAttemptContext)
           
 
Methods inherited from class org.apache.hadoop.mapreduce.lib.output.FileOutputFormat
checkOutputSpecs, getCompressOutput, getDefaultWorkFile, getOutputCommitter, getOutputCompressorClass, getOutputPath, getPathForWorkFile, getUniqueFile, getWorkOutputPath, setCompressOutput, setOutputCompressorClass, setOutputPath
 
Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

AvroPathPerKeyOutputFormat

public AvroPathPerKeyOutputFormat()

Method Detail

getRecordWriter

public org.apache.hadoop.mapreduce.RecordWriter<org.apache.avro.mapred.AvroWrapper<org.apache.avro.mapred.Pair<org.apache.avro.util.Utf8,T>>,org.apache.hadoop.io.NullWritable> getRecordWriter(org.apache.hadoop.mapreduce.TaskAttemptContext taskAttemptContext)
        throws IOException, InterruptedException
Specified by:
getRecordWriter in class org.apache.hadoop.mapreduce.lib.output.FileOutputFormat<org.apache.avro.mapred.AvroWrapper<org.apache.avro.mapred.Pair<org.apache.avro.util.Utf8,T>>,org.apache.hadoop.io.NullWritable>
Throws:
IOException
InterruptedException


Copyright © 2014 The Apache Software Foundation. All Rights Reserved.