FileTargetImpl (Apache Crunch 0.9.0 API)

This project has retired. For details please refer to its Attic page.


org.apache.crunch.io.impl
Class FileTargetImpl

java.lang.Object
  org.apache.crunch.io.impl.FileTargetImpl

All Implemented Interfaces: MapReduceTarget, PathTarget, Target

Direct Known Subclasses: AvroFileTarget, AvroParquetFileTarget, AvroPathPerKeyTarget, HFileTarget, SeqFileTarget, TextFileTarget, TrevniKeyTarget

public class FileTargetImpl
extends Object
implements PathTarget

Nested Class Summary

Nested classes/interfaces inherited from interface org.apache.crunch.Target

Target.WriteMode

Field Summary

protected org.apache.hadoop.fs.Path path


Constructor Summary

FileTargetImpl(org.apache.hadoop.fs.Path path, Class<? extends org.apache.hadoop.mapreduce.lib.output.FileOutputFormat> outputFormatClass, FileNamingScheme fileNamingScheme)


FileTargetImpl(org.apache.hadoop.fs.Path path, Class<? extends org.apache.hadoop.mapreduce.lib.output.FileOutputFormat> outputFormatClass, FileNamingScheme fileNamingScheme, Map<String,String> extraConf)


Method Summary

boolean accept(OutputHandler handler, PType<?> ptype)
          Checks to see if this Target instance is compatible with the given PType.

<T> SourceTarget<T> asSourceTarget(PType<T> ptype)
          Attempt to create the SourceTarget type that corresponds to this Target for the given PType, if possible.

protected void configureForMapReduce(org.apache.hadoop.mapreduce.Job job, Class keyClass, Class valueClass, Class outputFormatClass, org.apache.hadoop.fs.Path outputPath, String name)
          Deprecated.

protected void configureForMapReduce(org.apache.hadoop.mapreduce.Job job, Class keyClass, Class valueClass, FormatBundle formatBundle, org.apache.hadoop.fs.Path outputPath, String name)


void configureForMapReduce(org.apache.hadoop.mapreduce.Job job, PType<?> ptype, org.apache.hadoop.fs.Path outputPath, String name)


boolean equals(Object other)


static int extractPartitionNumber(String reduceOutputFileName)
          Extract the partition number from a raw reducer output filename.

Converter<?,?,?,?> getConverter(PType<?> ptype)
          Returns the Converter to use for mapping from the output PCollection into the output values expected by this instance.

protected org.apache.hadoop.fs.Path getDestFile(org.apache.hadoop.conf.Configuration conf, org.apache.hadoop.fs.Path src, org.apache.hadoop.fs.Path dir, boolean mapOnlyJob)


FileNamingScheme getFileNamingScheme()
          Get the naming scheme to be used for outputs being written to an output path.

org.apache.hadoop.fs.Path getPath()


protected org.apache.hadoop.fs.Path getSourcePattern(org.apache.hadoop.fs.Path workingPath, int index)


protected org.apache.hadoop.fs.Path getSuccessIndicator()


boolean handleExisting(Target.WriteMode strategy, long lastModForSource, org.apache.hadoop.conf.Configuration conf)
          Apply the given WriteMode to this Target instance.

void handleOutputs(org.apache.hadoop.conf.Configuration conf, org.apache.hadoop.fs.Path workingPath, int index)
          Handles moving the output data for this target from a temporary location on the filesystem to its target path at the end of a MapReduce job.

int hashCode()


protected static boolean isCompatible(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path path)


Target outputConf(String key, String value)
          Adds the given key-value pair to the Configuration instance that is used to write this Target.

String toString()


Methods inherited from class java.lang.Object

clone, finalize, getClass, notify, notifyAll, wait, wait, wait

Field Detail

path

protected final org.apache.hadoop.fs.Path path

Constructor Detail

FileTargetImpl

public FileTargetImpl(org.apache.hadoop.fs.Path path,
                      Class<? extends org.apache.hadoop.mapreduce.lib.output.FileOutputFormat> outputFormatClass,
                      FileNamingScheme fileNamingScheme)

FileTargetImpl

public FileTargetImpl(org.apache.hadoop.fs.Path path,
                      Class<? extends org.apache.hadoop.mapreduce.lib.output.FileOutputFormat> outputFormatClass,
                      FileNamingScheme fileNamingScheme,
                      Map<String,String> extraConf)

Method Detail

outputConf

public Target outputConf(String key,
                         String value)

Description copied from interface: Target

Adds the given key-value pair to the Configuration instance that is used to write this Target. Allows for multiple target outputs to re-use the same config keys with different values when necessary.

Specified by: outputConf in interface Target
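The per-target behaviour described above can be illustrated with a small stand-alone model. This is a sketch only, not the Crunch implementation: the class `PerTargetConfSketch` and its method `effectiveConf` are hypothetical, and plain `Map` objects stand in for Hadoop `Configuration` instances. It shows why keeping the extra key-value pairs on the target, rather than on the shared job configuration, lets two outputs use the same key with different values.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a target that carries its own configuration overrides.
final class PerTargetConfSketch {
    // Key-value pairs that apply only when this particular target is written.
    private final Map<String, String> extraConf = new HashMap<>();

    // Mirrors the fluent shape of outputConf: record the pair, return the target.
    PerTargetConfSketch outputConf(String key, String value) {
        extraConf.put(key, value);
        return this;
    }

    // Overlay this target's pairs on a copy of the shared settings.
    // The shared map itself is never modified.
    Map<String, String> effectiveConf(Map<String, String> shared) {
        Map<String, String> merged = new HashMap<>(shared);
        merged.putAll(extraConf);
        return merged;
    }
}
```

With this model, two targets can each call `outputConf` with the same key and a different value, and each sees only its own value when `effectiveConf` is computed.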

configureForMapReduce

public void configureForMapReduce(org.apache.hadoop.mapreduce.Job job,
                                  PType<?> ptype,
                                  org.apache.hadoop.fs.Path outputPath,
                                  String name)

Specified by: configureForMapReduce in interface MapReduceTarget

configureForMapReduce

@Deprecated
protected void configureForMapReduce(org.apache.hadoop.mapreduce.Job job,
                                                Class keyClass,
                                                Class valueClass,
                                                Class outputFormatClass,
                                                org.apache.hadoop.fs.Path outputPath,
                                                String name)

Deprecated.

configureForMapReduce

protected void configureForMapReduce(org.apache.hadoop.mapreduce.Job job,
                                     Class keyClass,
                                     Class valueClass,
                                     FormatBundle formatBundle,
                                     org.apache.hadoop.fs.Path outputPath,
                                     String name)

accept

public boolean accept(OutputHandler handler,
                      PType<?> ptype)

Description copied from interface: Target

Checks to see if this Target instance is compatible with the given PType.

Specified by: accept in interface Target

Parameters:
    handler - The OutputHandler that is managing the output for the job
    ptype - The PType to check
Returns: True if this Target can write data in the form of the given PType, false otherwise

getConverter

public Converter<?,?,?,?> getConverter(PType<?> ptype)

Description copied from interface: Target

Returns the Converter to use for mapping from the output PCollection into the output values expected by this instance.

Specified by: getConverter in interface Target

Parameters: ptype - The PType of the data that is being written to this instance
Returns: A valid Converter for the output represented by this instance

handleOutputs

public void handleOutputs(org.apache.hadoop.conf.Configuration conf,
                          org.apache.hadoop.fs.Path workingPath,
                          int index)
                   throws IOException

Description copied from interface: PathTarget

Handles moving the output data for this target from a temporary location on the filesystem to its target path at the end of a MapReduce job.

Specified by: handleOutputs in interface PathTarget

Parameters:
    conf - The job Configuration
    workingPath - The temp directory that contains the output of the job
    index - The index of this target for jobs that write multiple output files to a single directory
Throws: IOException
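The move from a temporary working directory to the target path can be pictured with the following local-filesystem sketch. It is not the Crunch implementation: it uses `java.nio.file` instead of the Hadoop `FileSystem` API, the class `HandleOutputsSketch` is hypothetical, and the file prefix `out<index>-` used to pick out this target's files is an assumption made for illustration.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Hypothetical local-filesystem model of "move job output to its final path".
final class HandleOutputsSketch {
    // Moves every file in workingPath that belongs to the target with the given
    // index into targetPath, and returns the destination paths.
    static List<Path> handleOutputs(Path workingPath, Path targetPath, int index)
            throws IOException {
        Files.createDirectories(targetPath);

        // Assumed naming: files for target N start with "outN-".
        String glob = "out" + index + "-*";

        // Collect matches first so the directory is not modified while it is listed.
        List<Path> sources = new ArrayList<>();
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(workingPath, glob)) {
            for (Path src : stream) {
                sources.add(src);
            }
        }

        List<Path> moved = new ArrayList<>();
        for (Path src : sources) {
            Path dest = targetPath.resolve(src.getFileName().toString());
            Files.move(src, dest);
            moved.add(dest);
        }
        return moved;
    }
}
```

Selecting files by index is what allows several targets to share one working directory, as the `index` parameter description suggests.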

getSuccessIndicator

protected org.apache.hadoop.fs.Path getSuccessIndicator()

getSourcePattern

protected org.apache.hadoop.fs.Path getSourcePattern(org.apache.hadoop.fs.Path workingPath,
                                                     int index)

getPath

public org.apache.hadoop.fs.Path getPath()

Specified by: getPath in interface PathTarget

isCompatible

protected static boolean isCompatible(org.apache.hadoop.fs.FileSystem fs,
                                      org.apache.hadoop.fs.Path path)

getDestFile

protected org.apache.hadoop.fs.Path getDestFile(org.apache.hadoop.conf.Configuration conf,
                                                org.apache.hadoop.fs.Path src,
                                                org.apache.hadoop.fs.Path dir,
                                                boolean mapOnlyJob)
                                         throws IOException

Throws: IOException

extractPartitionNumber

public static int extractPartitionNumber(String reduceOutputFileName)

Extract the partition number from a raw reducer output filename.

Parameters: reduceOutputFileName - The raw reducer output file name
Returns: The partition number encoded in the filename
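As an illustration of what extracting a partition number can look like, here is a minimal sketch. It assumes the conventional Hadoop reducer file naming in which the name contains `-r-` followed by a zero-padded partition number (for example `part-r-00003`). The class `PartitionNumberSketch`, the regular expression, and the choice of exception for unparseable names are all assumptions for this example, not the Crunch source.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical stand-in; models the idea, not the library's exact behaviour.
final class PartitionNumberSketch {
    // Assumed convention: "-r-" followed by the partition digits.
    private static final Pattern REDUCER_FILE = Pattern.compile("-r-(\\d+)");

    static int extractPartitionNumber(String reduceOutputFileName) {
        Matcher m = REDUCER_FILE.matcher(reduceOutputFileName);
        if (!m.find()) {
            // Error handling chosen for the sketch only.
            throw new IllegalArgumentException(
                "Cannot parse reducer output name: " + reduceOutputFileName);
        }
        // Integer.parseInt ignores the leading zeros: "00003" becomes 3.
        return Integer.parseInt(m.group(1));
    }
}
```

Under these assumptions, `PartitionNumberSketch.extractPartitionNumber("part-r-00003")` returns 3.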

getFileNamingScheme

public FileNamingScheme getFileNamingScheme()

Description copied from interface: PathTarget

Get the naming scheme to be used for outputs being written to an output path.

Specified by: getFileNamingScheme in interface PathTarget

Returns: the naming scheme to be used

equals

public boolean equals(Object other)

Overrides: equals in class Object

hashCode

public int hashCode()

Overrides: hashCode in class Object

toString

public String toString()

Overrides: toString in class Object

asSourceTarget

public <T> SourceTarget<T> asSourceTarget(PType<T> ptype)

Description copied from interface: Target

Attempt to create the SourceTarget type that corresponds to this Target for the given PType, if possible. If it is not possible, return null.

Specified by: asSourceTarget in interface Target

Parameters: ptype - The PType to use in constructing the SourceTarget
Returns: A new SourceTarget or null if such a SourceTarget does not exist

handleExisting

public boolean handleExisting(Target.WriteMode strategy,
                              long lastModForSource,
                              org.apache.hadoop.conf.Configuration conf)

Description copied from interface: Target

Apply the given WriteMode to this Target instance.

Specified by: handleExisting in interface Target

Parameters:
    strategy - The strategy for handling existing outputs
    conf - The ever-useful Configuration instance
Returns: true if the target did exist
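To make the idea of applying a write strategy to an existing output concrete, the following local-filesystem sketch may help. It is deliberately simplified and is not the Crunch implementation: the enum `Mode` and its three values are hypothetical names rather than the constants of `Target.WriteMode`, `java.nio.file` replaces the Hadoop `FileSystem` API, and the `lastModForSource` argument is left out because this page does not describe it. The return value follows the contract stated above: true only if the target already existed.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.stream.Stream;

// Hypothetical model of "apply a write strategy to an existing output path".
final class WriteModeSketch {
    // Illustrative strategy names; not the real Target.WriteMode constants.
    enum Mode { FAIL_IF_EXISTS, OVERWRITE, APPEND }

    // Returns true if the target existed before the strategy was applied.
    static boolean handleExisting(Mode strategy, Path target) throws IOException {
        if (!Files.exists(target)) {
            return false;
        }
        switch (strategy) {
            case FAIL_IF_EXISTS:
                throw new IllegalStateException("Target already exists: " + target);
            case OVERWRITE:
                deleteRecursively(target);
                break;
            case APPEND:
                // Keep existing files; new output is added alongside them.
                break;
        }
        return true;
    }

    // Deletes children before parents by visiting paths in reverse order.
    private static void deleteRecursively(Path root) throws IOException {
        try (Stream<Path> walk = Files.walk(root)) {
            walk.sorted(Comparator.reverseOrder()).forEach(p -> {
                try {
                    Files.delete(p);
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            });
        }
    }
}
```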
