Reranker Framework (ReFr)
Reranking framework for structure prediction and discriminative language modeling
hadooputil.HadoopInterface Class Reference

A simple class interface for running Hadoop commands.

Public Member Functions

def __init__
 
def CheckHDir
 Check if a directory exists on HDFS.
 
def CheckRemoveHDir
 Check for the existence of a directory on HDFS, optionally removing it.
 
def CheckHDFSFile
 Function to check for a file on HDFS.
 
def CheckInputFile
 Check for an input file and prepare it for MR processing.
 
def CatPipe
 
def CatPipeRead
 
def RunMR
 Run a MapReduce job.
 

Public Attributes

 hadoopmr_
 
 hadooplibpath_
 
 hadoopfs_
 
 hadooptest_
 
 hadoopcat_
 
 hadoopput_
 
 hadoopmove_
 
 hadoopget_
 
 hadoopmkdir_
 
 hadooprmr_
 

Detailed Description

A simple class interface for running Hadoop commands.

Definition at line 42 of file hadooputil.py.

Constructor & Destructor Documentation

def hadooputil.HadoopInterface.__init__ (   self,
  hbasedir,
  streamingloc,
  minsplitsize,
  tasktimeout,
  libpath 
)

Definition at line 43 of file hadooputil.py.

Member Function Documentation

def hadooputil.HadoopInterface.CatPipe (   self,
  hdfsfiles,
  pipecmd 
)

Definition at line 133 of file hadooputil.py.

def hadooputil.HadoopInterface.CatPipeRead (   self,
  hdfsfiles,
  pipecmd,
  retval 
)

Definition at line 138 of file hadooputil.py.

def hadooputil.HadoopInterface.CheckHDFSFile (   self,
  filename 
)

Function to check for a file on HDFS.

Parameters
[in] filename: The file to check for.

Definition at line 86 of file hadooputil.py.

def hadooputil.HadoopInterface.CheckHDir (   self,
  directory 
)

Check if a directory exists on HDFS.

Parameters
[in] directory: Name of the directory to check.
Returns
True if the directory exists.

Definition at line 65 of file hadooputil.py.
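
A check like this can be built on `hadoop fs -test -d`, which exits with status 0 when the path is a directory. The following is a minimal sketch under that assumption, not the code in hadooputil.py; the names `build_test_cmd`, `check_hdir`, and the injectable `run` parameter are hypothetical.

```python
import subprocess


def build_test_cmd(directory, hadoop_bin="hadoop"):
    """Build the argv for `hadoop fs -test -d <directory>` (hypothetical helper)."""
    return [hadoop_bin, "fs", "-test", "-d", directory]


def check_hdir(directory, run=subprocess.call):
    """Return True if `directory` exists on HDFS.

    `run` takes an argv list and returns the exit status. It is injectable
    so the logic can be exercised without a Hadoop installation.
    """
    return run(build_test_cmd(directory)) == 0
```

Passing a stub for `run` (for example `lambda argv: 0`) makes it possible to exercise the return-value logic on a machine with no Hadoop client installed.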

def hadooputil.HadoopInterface.CheckInputFile (   self,
  inputfile,
  hdfsinputdir,
  outputdir,
  force,
  uncompress 
)

Check for an input file and prepare it for MR processing.

Parameters
[in] inputfile: Name of the local file to prepare.
[in] hdfsinputdir: HDFS directory for data staging.
[in] outputdir: Local file system directory for the output.
[in] force: Reprocess data even if files already exist.
[in] uncompress: Uncompress data, if compressed, before running.
Returns
An MR input string.

Stages the input data for MapReduce processing. If compressed files are being uncompressed, the uncompressed data is moved to HDFS; otherwise, the data is simply copied to HDFS.

Definition at line 102 of file hadooputil.py.
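
The staging decision described above might be planned as follows. This is an illustrative sketch only: `plan_staging` is a hypothetical helper, it handles only `.gz` files, and it ignores the `outputdir` and `force` parameters. It relies on `hadoop fs -put - <dst>`, which reads the data to upload from standard input.

```python
import os


def plan_staging(inputfile, hdfsinputdir, uncompress, hadoop_bin="hadoop"):
    """Return (hdfs_path, pipeline) for staging a local file on HDFS.

    `pipeline` is a list of argv lists. When it has two entries, the first
    command's stdout is meant to be piped into the second command's stdin.
    """
    name = os.path.basename(inputfile)
    base = hdfsinputdir.rstrip("/")
    if uncompress and name.endswith(".gz"):
        # Decompress locally and stream the plain data to HDFS via stdin.
        target = base + "/" + name[:-len(".gz")]
        pipeline = [["gzip", "-dc", inputfile],
                    [hadoop_bin, "fs", "-put", "-", target]]
    else:
        # Copy the file to HDFS unchanged.
        target = base + "/" + name
        pipeline = [[hadoop_bin, "fs", "-put", inputfile, target]]
    return target, pipeline
```

Returning the commands instead of executing them keeps the sketch free of side effects; a caller could run the pipeline with `subprocess.Popen` and then use the returned HDFS path as the MapReduce input.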

def hadooputil.HadoopInterface.CheckRemoveHDir (   self,
  directory,
  remove 
)

Check for the existence of a directory on HDFS, optionally removing it.

Parameters
[in] directory: Directory to check.
[in] remove: Remove the directory if it exists.
Returns
True if the directory did not exist or was removed.

Definition at line 73 of file hadooputil.py.
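
The documented return value suggests a check-then-remove flow along the lines below. This is a sketch, not the hadooputil.py implementation: `check_remove_hdir` and its `run` parameter are hypothetical, and returning False when the directory exists but `remove` is false is an assumption inferred from the "Returns" text. It uses `hadoop fs -rm -r` for the recursive delete.

```python
import subprocess


def check_remove_hdir(directory, remove, run=subprocess.call, hadoop_bin="hadoop"):
    """Return True if `directory` is absent from HDFS or was removed."""
    exists = run([hadoop_bin, "fs", "-test", "-d", directory]) == 0
    if not exists:
        return True
    if not remove:
        # Assumed behavior: the directory exists and must be left in place.
        return False
    # `-rm -r` deletes the directory and its contents recursively.
    return run([hadoop_bin, "fs", "-rm", "-r", directory]) == 0
```

A caller would typically treat a False result as a signal not to overwrite existing output.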

def hadooputil.HadoopInterface.RunMR (   self,
  input_files,
  outputdir,
  reduce_tasks,
  reducer,
  mapper,
  mroptions 
)

Run a MapReduce job.

Parameters
[in] input_files: HDFS location of input files.
[in] outputdir: HDFS location of output.
[in] reduce_tasks: Number of reducer tasks (0 = use default).
[in] reducer: Full string of streaming reducer command.
[in] mapper: Full string of streaming mapper command.
[in] mroptions: Additional streaming MR options (usually specified with -D).

Definition at line 151 of file hadooputil.py.
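
To illustrate how these parameters could map onto a Hadoop Streaming invocation, here is a hedged sketch that only assembles the command line. `build_streaming_cmd` and the `streaming_jar` argument are hypothetical, and `mroptions` is assumed here to be a sequence of already-split arguments; the actual method may take a single string and build the command differently.

```python
def build_streaming_cmd(streaming_jar, input_files, outputdir, reduce_tasks,
                        reducer, mapper, mroptions=(), hadoop_bin="hadoop"):
    """Assemble the argv for a Hadoop Streaming job (illustrative only)."""
    cmd = [hadoop_bin, "jar", streaming_jar]
    # Generic options such as -D must precede the streaming options.
    cmd.extend(mroptions)
    cmd.extend(["-input", input_files,
                "-output", outputdir,
                "-mapper", mapper,
                "-reducer", reducer])
    if reduce_tasks > 0:
        # 0 means "use the default", so the flag is omitted in that case.
        cmd.extend(["-numReduceTasks", str(reduce_tasks)])
    return cmd
```

The resulting list can be passed to `subprocess.call` directly, which avoids shell quoting problems in the mapper and reducer command strings.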

Member Data Documentation

hadooputil.HadoopInterface.hadoopcat_

Definition at line 55 of file hadooputil.py.

hadooputil.HadoopInterface.hadoopfs_

Definition at line 53 of file hadooputil.py.

hadooputil.HadoopInterface.hadoopget_

Definition at line 58 of file hadooputil.py.

hadooputil.HadoopInterface.hadooplibpath_

Definition at line 50 of file hadooputil.py.

hadooputil.HadoopInterface.hadoopmkdir_

Definition at line 59 of file hadooputil.py.

hadooputil.HadoopInterface.hadoopmove_

Definition at line 57 of file hadooputil.py.

hadooputil.HadoopInterface.hadoopmr_

Definition at line 44 of file hadooputil.py.

hadooputil.HadoopInterface.hadoopput_

Definition at line 56 of file hadooputil.py.

hadooputil.HadoopInterface.hadooprmr_

Definition at line 60 of file hadooputil.py.

hadooputil.HadoopInterface.hadooptest_

Definition at line 54 of file hadooputil.py.


The documentation for this class was generated from the following file: