public class AvroPathPerKeyOutputFormat<T>
extends org.apache.hadoop.mapreduce.lib.output.FileOutputFormat<org.apache.avro.mapred.AvroWrapper<org.apache.avro.mapred.Pair<org.apache.avro.util.Utf8,T>>,org.apache.hadoop.io.NullWritable>
A FileOutputFormat that takes in a Utf8 and an Avro record and writes the Avro records to a sub-directory of the output path whose name is equal to the string form of the Utf8.
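The key-to-directory mapping described above can be sketched with the standard library alone. This is only an illustration, not the class's actual implementation: the helper pathForKey, the CharSequence standing in for the Utf8 key, and the file name part-r-00000.avro are all assumptions made for the example.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

final class KeyedPathSketch {
    // Hypothetical helper: builds <outputPath>/<string form of key>/<fileName>.
    // A CharSequence stands in for the Utf8 key so the sketch needs no Avro dependency.
    static Path pathForKey(Path outputPath, CharSequence key, String fileName) {
        return outputPath.resolve(key.toString()).resolve(fileName);
    }

    public static void main(String[] args) {
        Path out = Paths.get("out");
        // The file name is an assumed placeholder, not a name taken from the documentation.
        System.out.println(pathForKey(out, "2017-01-01", "part-r-00000.avro"));
        System.out.println(pathForKey(out, "2017-01-02", "part-r-00000.avro"));
    }
}
```

Each distinct key yields its own sub-directory under the output path, so downstream readers can select records for one key by pointing at that directory.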
This OutputFormat keeps only one RecordWriter open at a time, so it is best to write all of the records for the same key together within each partition; otherwise files are opened and closed frequently.

Constructor and Description |
---|
AvroPathPerKeyOutputFormat() |
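
The class description notes that only one RecordWriter is open at a time, which is why record order matters. The sketch below models that constraint in plain Java: it assumes that every change of key forces a new writer to be opened, and counts the opens for a grouped and an interleaved sequence. The names SingleWriterSketch and countOpens are hypothetical and no Hadoop classes are involved.

```java
import java.util.Arrays;
import java.util.List;

final class SingleWriterSketch {
    // Counts how many times a writer would be opened if only one writer
    // can be open at a time and each key change switches to a new one.
    static int countOpens(List<String> keys) {
        int opens = 0;
        String current = null;
        for (String key : keys) {
            if (!key.equals(current)) {
                opens++;          // assumed: previous writer closed, new one opened
                current = key;
            }
        }
        return opens;
    }

    public static void main(String[] args) {
        List<String> grouped = Arrays.asList("a", "a", "a", "b", "b", "b");
        List<String> interleaved = Arrays.asList("a", "b", "a", "b", "a", "b");
        System.out.println("grouped opens: " + countOpens(grouped));         // 2
        System.out.println("interleaved opens: " + countOpens(interleaved)); // 6
    }
}
```

With the same six records, grouping by key needs two opens while interleaving needs six, so sorting or grouping by key before writing keeps the cost proportional to the number of distinct keys.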
Modifier and Type | Method and Description |
---|---|
org.apache.hadoop.mapreduce.RecordWriter<org.apache.avro.mapred.AvroWrapper<org.apache.avro.mapred.Pair<org.apache.avro.util.Utf8,T>>,org.apache.hadoop.io.NullWritable> | getRecordWriter(org.apache.hadoop.mapreduce.TaskAttemptContext taskAttemptContext) |

Methods inherited from class org.apache.hadoop.mapreduce.lib.output.FileOutputFormat: checkOutputSpecs, getCompressOutput, getDefaultWorkFile, getOutputCommitter, getOutputCompressorClass, getOutputPath, getPathForWorkFile, getUniqueFile, getWorkOutputPath, setCompressOutput, setOutputCompressorClass, setOutputPath

public org.apache.hadoop.mapreduce.RecordWriter<org.apache.avro.mapred.AvroWrapper<org.apache.avro.mapred.Pair<org.apache.avro.util.Utf8,T>>,org.apache.hadoop.io.NullWritable> getRecordWriter(org.apache.hadoop.mapreduce.TaskAttemptContext taskAttemptContext) throws IOException, InterruptedException
Specified by: getRecordWriter in class org.apache.hadoop.mapreduce.lib.output.FileOutputFormat<org.apache.avro.mapred.AvroWrapper<org.apache.avro.mapred.Pair<org.apache.avro.util.Utf8,T>>,org.apache.hadoop.io.NullWritable>
Throws: IOException, InterruptedException
Copyright © 2017 The Apache Software Foundation. All rights reserved.