This project has retired. For details please refer to its Attic page.
DistCache (Apache Crunch 0.4.0-incubating API)

org.apache.crunch.util
Class DistCache

java.lang.Object
  extended by org.apache.crunch.util.DistCache

public class DistCache
extends Object

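Provides functions for working with Hadoop's distributed cache. These include reading and writing objects at a given path (read, write), adding jars to the classpath of a job's tasks, and finding the jar that contains a given class.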


Constructor Summary
DistCache()
           
 
Method Summary
static void addJarDirToDistributedCache(org.apache.hadoop.conf.Configuration conf, File jarDirectory)
          Adds all jars located directly in the specified directory to the distributed cache of jobs using the provided configuration.
static void addJarDirToDistributedCache(org.apache.hadoop.conf.Configuration conf, String jarDirectory)
          Adds all jars located directly in the directory at the specified path to the distributed cache of jobs using the provided configuration.
static void addJarToDistributedCache(org.apache.hadoop.conf.Configuration conf, File jarFile)
          Adds the specified jar to the distributed cache of jobs using the provided configuration.
static void addJarToDistributedCache(org.apache.hadoop.conf.Configuration conf, String jarFile)
          Adds the jar at the specified path to the distributed cache of jobs using the provided configuration.
static String findContainingJar(Class jarClass)
          Finds the path to a jar that contains the class provided, if any.
static Object read(org.apache.hadoop.conf.Configuration conf, org.apache.hadoop.fs.Path path)
           
static void write(org.apache.hadoop.conf.Configuration conf, org.apache.hadoop.fs.Path path, Object value)
           
 
Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

DistCache

public DistCache()

Method Detail

write

public static void write(org.apache.hadoop.conf.Configuration conf,
                         org.apache.hadoop.fs.Path path,
                         Object value)
                  throws IOException
Throws:
IOException

read

public static Object read(org.apache.hadoop.conf.Configuration conf,
                          org.apache.hadoop.fs.Path path)
                   throws IOException
Throws:
IOException

addJarToDistributedCache

public static void addJarToDistributedCache(org.apache.hadoop.conf.Configuration conf,
                                            File jarFile)
                                     throws IOException
Adds the specified jar to the distributed cache of jobs using the provided configuration. The jar will be placed on the classpath of tasks run by the job.

Parameters:
conf - The configuration used to add the jar to the distributed cache.
jarFile - The jar file to add to the distributed cache.
Throws:
IOException - If the jar file does not exist or there is a problem accessing the file.
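
The contract described above (an existing jar is registered for the job, a missing one raises IOException) can be illustrated with a small self-contained sketch. It uses a plain Map in place of org.apache.hadoop.conf.Configuration so that it runs without Hadoop. The class name JarRegistrySketch, the property key "sketch.jars", and the comma-separated encoding are assumptions made for this example; they are not taken from the Crunch source.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the documented behaviour; this is not Crunch code.
final class JarRegistrySketch {
    // Assumed property name, chosen only for this example.
    static final String JARS_KEY = "sketch.jars";

    // Appends the jar's canonical path to a comma-separated property,
    // or throws IOException if the jar file does not exist.
    static void addJar(Map<String, String> conf, File jarFile) throws IOException {
        if (!jarFile.isFile()) {
            throw new IOException("Jar file does not exist: " + jarFile.getPath());
        }
        String existing = conf.getOrDefault(JARS_KEY, "");
        String path = jarFile.getCanonicalPath();
        conf.put(JARS_KEY, existing.isEmpty() ? path : existing + "," + path);
    }

    // Returns true if addJar rejects the file with an IOException.
    static boolean rejects(File jarFile) {
        try {
            addJar(new HashMap<>(), jarFile);
            return false;
        } catch (IOException e) {
            return true;
        }
    }

    // Registers two fresh temporary jars and returns how many paths were recorded.
    static int demoCount() {
        try {
            Map<String, String> conf = new HashMap<>();
            for (int i = 0; i < 2; i++) {
                File jar = File.createTempFile("sketch", ".jar");
                jar.deleteOnExit();
                addJar(conf, jar);
            }
            return conf.get(JARS_KEY).split(",").length;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

With the real class, the equivalent call would be DistCache.addJarToDistributedCache(conf, jarFile) on a Hadoop Configuration, wrapped in handling for the IOException listed above.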

addJarToDistributedCache

public static void addJarToDistributedCache(org.apache.hadoop.conf.Configuration conf,
                                            String jarFile)
                                     throws IOException
Adds the jar at the specified path to the distributed cache of jobs using the provided configuration. The jar will be placed on the classpath of tasks run by the job.

Parameters:
conf - The configuration used to add the jar to the distributed cache.
jarFile - The path to the jar file to add to the distributed cache.
Throws:
IOException - If the jar file does not exist or there is a problem accessing the file.

findContainingJar

public static String findContainingJar(Class jarClass)
                                throws IOException
Finds the path to a jar that contains the class provided, if any. If several jars on the classpath contain the class, there is no guarantee that the jar returned is the first of them. This method is adapted from Hadoop's JobConf class.

Parameters:
jarClass - The class the jar file should contain.
Returns:
The path to a jar file that contains the class, or null if no such jar exists.
Throws:
IOException - If there is a problem searching for the jar file.
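
The lookup described above can be sketched with the JDK alone: ask the class loader for every resource matching the class file name and return the first one served from a "jar:" URL. This is an illustrative sketch of that general technique, not the Crunch or Hadoop source; the names JarLocatorSketch and stripToJarPath are hypothetical.

```java
import java.io.IOException;
import java.io.UnsupportedEncodingException;
import java.net.URL;
import java.net.URLDecoder;
import java.util.Enumeration;

// Illustrative sketch only; names here are made up for the example.
final class JarLocatorSketch {
    // Turns the path part of a "jar:" URL, e.g. "file:/lib/a.jar!/p/C.class",
    // into the plain jar path "/lib/a.jar".
    // Caveat: URLDecoder treats a literal '+' as a space; this sketch does not handle that.
    static String stripToJarPath(String urlPath) {
        String path = urlPath.startsWith("file:")
                ? urlPath.substring("file:".length())
                : urlPath;
        try {
            path = URLDecoder.decode(path, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always supported", e);
        }
        int bang = path.indexOf('!');
        return bang >= 0 ? path.substring(0, bang) : path;
    }

    // Returns the first jar found that holds the class file, or null if none does.
    static String findContainingJar(Class<?> jarClass) throws IOException {
        ClassLoader loader = jarClass.getClassLoader();
        if (loader == null) {
            // Classes loaded by the bootstrap loader report a null class loader.
            loader = ClassLoader.getSystemClassLoader();
        }
        String classFile = jarClass.getName().replace('.', '/') + ".class";
        Enumeration<URL> urls = loader.getResources(classFile);
        while (urls.hasMoreElements()) {
            URL url = urls.nextElement();
            if ("jar".equals(url.getProtocol())) {
                return stripToJarPath(url.getPath());
            }
        }
        return null;
    }
}
```

Because the result depends on the order in which the class loader enumerates resources, the sketch shares the caveat stated above: the jar it returns is not necessarily the one the class was actually loaded from.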

addJarDirToDistributedCache

public static void addJarDirToDistributedCache(org.apache.hadoop.conf.Configuration conf,
                                               File jarDirectory)
                                        throws IOException
Adds all jars located directly in the specified directory to the distributed cache of jobs using the provided configuration. The jars will be placed on the classpath of tasks run by the job. Subdirectories are not searched.

Parameters:
conf - The configuration used to add jars to the distributed cache.
jarDirectory - A directory containing jar files to add to the distributed cache.
Throws:
IOException - If the directory does not exist or there is a problem accessing the directory.
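
The non-recursive behaviour described above can be illustrated with a JDK-only sketch that lists the jar files sitting directly in a directory and ignores anything nested below it. The class JarDirSketch, its method names, and the ".jar" suffix filter are assumptions made for this example rather than details confirmed by the Crunch documentation.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustrative sketch only; not the Crunch implementation.
final class JarDirSketch {
    // Returns the sorted names of *.jar files directly inside jarDirectory.
    static List<String> listJarNames(File jarDirectory) throws IOException {
        if (!jarDirectory.isDirectory()) {
            throw new IOException("Not a directory: " + jarDirectory.getPath());
        }
        File[] entries = jarDirectory.listFiles();
        if (entries == null) {
            throw new IOException("Cannot read directory: " + jarDirectory.getPath());
        }
        Arrays.sort(entries);
        List<String> names = new ArrayList<>();
        for (File entry : entries) {
            // isFile() skips subdirectories, so nested jars are never visited.
            if (entry.isFile() && entry.getName().endsWith(".jar")) {
                names.add(entry.getName());
            }
        }
        return names;
    }

    // Builds a temporary directory with two jars, one text file, and one
    // nested jar, then lists it.
    static List<String> demo() {
        try {
            File dir = Files.createTempDirectory("jars").toFile();
            File sub = new File(dir, "nested");
            sub.mkdir();
            new File(dir, "a.jar").createNewFile();
            new File(dir, "c.jar").createNewFile();
            new File(dir, "notes.txt").createNewFile();
            new File(sub, "b.jar").createNewFile();
            return listJarNames(dir);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

In the demo, the jar inside the nested directory is left out of the result. Jars in nested directories would therefore need one call per directory.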

addJarDirToDistributedCache

public static void addJarDirToDistributedCache(org.apache.hadoop.conf.Configuration conf,
                                               String jarDirectory)
                                        throws IOException
Adds all jars located directly in the directory at the specified path to the distributed cache of jobs using the provided configuration. The jars will be placed on the classpath of tasks run by the job. Subdirectories are not searched.

Parameters:
conf - The configuration used to add jars to the distributed cache.
jarDirectory - The path to a directory containing jar files to add to the distributed cache.
Throws:
IOException - If the directory does not exist or there is a problem accessing the directory.


Copyright © 2012 The Apache Software Foundation. All Rights Reserved.