public abstract class FileInputFormat<K,V> extends InputFormat<K,V>
FileInputFormat is the base class for all file-based
InputFormats. It provides a generic implementation of
getSplits(JobContext).
Subclasses of FileInputFormat can also override the
isSplitable(JobContext, Path) method to ensure that input files are
not split up and are processed as a whole by Mappers.
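
The effect of isSplitable can be illustrated with a small plain-Java model. This is only a sketch and does not use any Hadoop classes: SplitPlanSketch, Range, and planSplits are hypothetical names, and cutting a file into fixed-size byte ranges is a simplifying assumption, not the actual getSplits logic. It shows that a file treated as non-splitable yields a single range covering the whole file, so one Mapper would see all of it.

```java
import java.util.ArrayList;
import java.util.List;

/** Plain-Java model (not Hadoop code) of splitting versus whole-file processing. */
final class SplitPlanSketch {

    /** Hypothetical stand-in for a file split: a byte range within one file. */
    static final class Range {
        final long start;
        final long length;

        Range(long start, long length) {
            this.start = start;
            this.length = length;
        }
    }

    /**
     * Plans byte ranges for one file. When splitable is false, the whole file
     * becomes a single range, mirroring an isSplitable override that returns false.
     * An empty file is kept as one empty range in this model.
     */
    static List<Range> planSplits(long fileLength, long splitSize, boolean splitable) {
        if (fileLength < 0 || splitSize <= 0) {
            throw new IllegalArgumentException("fileLength must be >= 0 and splitSize > 0");
        }
        List<Range> ranges = new ArrayList<>();
        if (!splitable || fileLength == 0) {
            ranges.add(new Range(0L, fileLength));
            return ranges;
        }
        long start = 0L;
        long remaining = fileLength;
        while (remaining > 0) {
            // The last range may be shorter than splitSize.
            long length = Math.min(splitSize, remaining);
            ranges.add(new Range(start, length));
            start += length;
            remaining -= length;
        }
        return ranges;
    }

    public static void main(String[] args) {
        long fileLength = 300L; // illustrative sizes in bytes
        long splitSize = 128L;
        System.out.println("splitable ranges: " + planSplits(fileLength, splitSize, true).size());
        System.out.println("whole-file ranges: " + planSplits(fileLength, splitSize, false).size());
    }
}
```

In a real job the same decision is made per file by the subclass's isSplitable(JobContext, Path) override, for example when a file format cannot be read from an arbitrary offset.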
| Modifier and Type | Class and Description |
|---|---|
| static class | FileInputFormat.Counter |
| Constructor and Description |
|---|
| FileInputFormat() |
| Modifier and Type | Method | Description |
|---|---|---|
| static void | addInputPath(Job job, Path path) | Add a Path to the list of inputs for the map-reduce job. |
| static void | addInputPaths(Job job, String commaSeparatedPaths) | Add the given comma-separated paths to the list of inputs for the map-reduce job. |
| protected long | computeSplitSize(long blockSize, long minSize, long maxSize) | |
| protected int | getBlockIndex(BlockLocation[] blkLocations, long offset) | |
| protected long | getFormatMinSplitSize() | Get the lower bound on split size imposed by the format. |
| static PathFilter | getInputPathFilter(JobContext context) | Get a PathFilter instance of the filter set for the input paths. |
| static Path[] | getInputPaths(JobContext context) | Get the list of input Paths for the map-reduce job. |
| static long | getMaxSplitSize(JobContext context) | Get the maximum split size. |
| static long | getMinSplitSize(JobContext job) | Get the minimum split size. |
| List<InputSplit> | getSplits(JobContext job) | Generate the list of files and make them into FileSplits. |
| protected boolean | isSplitable(JobContext context, Path filename) | Is the given filename splitable? |
| protected List<FileStatus> | listStatus(JobContext job) | List input directories. |
| static void | setInputPathFilter(Job job, Class<? extends PathFilter> filter) | Set a PathFilter to be applied to the input paths for the map-reduce job. |
| static void | setInputPaths(Job job, Path... inputPaths) | Set the array of Paths as the list of inputs for the map-reduce job. |
| static void | setInputPaths(Job job, String commaSeparatedPaths) | Set the given comma-separated paths as the list of inputs for the map-reduce job. |
| static void | setMaxInputSplitSize(Job job, long size) | Set the maximum split size. |
| static void | setMinInputSplitSize(Job job, long size) | Set the minimum input split size. |
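
The summary distinguishes methods that set the list of inputs from methods that add to it. The following plain-Java model is a sketch of that difference only. It is not Hadoop code: InputPathListSketch is a hypothetical class, the paths are made-up examples, and splitting on every comma is a simplifying assumption about how a comma-separated argument is interpreted.

```java
import java.util.ArrayList;
import java.util.List;

/** Plain-Java model (not Hadoop code) of "set" versus "add" for input paths. */
final class InputPathListSketch {
    private final List<String> inputPaths = new ArrayList<>();

    /** Replaces the whole list, as setInputPaths is described to do. */
    void setInputPaths(String commaSeparatedPaths) {
        inputPaths.clear();
        addInputPaths(commaSeparatedPaths);
    }

    /** Appends to the existing list, as addInputPaths is described to do. */
    void addInputPaths(String commaSeparatedPaths) {
        for (String path : commaSeparatedPaths.split(",")) {
            if (!path.isEmpty()) {
                inputPaths.add(path);
            }
        }
    }

    /** Returns a copy so callers cannot modify the internal list. */
    List<String> getInputPaths() {
        return new ArrayList<>(inputPaths);
    }

    public static void main(String[] args) {
        InputPathListSketch job = new InputPathListSketch();
        job.setInputPaths("/data/logs/a,/data/logs/b"); // hypothetical paths
        job.addInputPaths("/data/logs/c");              // keeps a and b, appends c
        System.out.println(job.getInputPaths());
        job.setInputPaths("/data/other");               // discards the earlier entries
        System.out.println(job.getInputPaths());
    }
}
```

With the real API the same calls take a Job as their first argument, and the set variants are the ones to use when a driver must not inherit paths configured earlier.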
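
Methods inherited from class InputFormat: createRecordReader

protected long getFormatMinSplitSize()
Get the lower bound on split size imposed by the format.

protected boolean isSplitable(JobContext context, Path filename)
Is the given filename splitable? FileInputFormat implementations can override this and return
false to ensure that individual input files are never split up,
so that Mappers process entire files.
Parameters:
context - the job context
filename - the file name to check

public static void setInputPathFilter(Job job, Class<? extends PathFilter> filter)
Set a PathFilter to be applied to the input paths for the map-reduce job.
Parameters:
job - the job to modify
filter - the PathFilter class to use for filtering the input paths.

public static void setMinInputSplitSize(Job job, long size)
Set the minimum input split size.
Parameters:
job - the job to modify
size - the minimum size

public static long getMinSplitSize(JobContext job)
Get the minimum split size.
Parameters:
job - the job

public static void setMaxInputSplitSize(Job job, long size)
Set the maximum split size.
Parameters:
job - the job to modify
size - the maximum split size

public static long getMaxSplitSize(JobContext context)
Get the maximum split size.
Parameters:
context - the job to look at.

public static PathFilter getInputPathFilter(JobContext context)
Get a PathFilter instance of the filter set for the input paths.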

protected List<FileStatus> listStatus(JobContext job) throws IOException
List input directories.
Parameters:
job - the job to list input paths for
Throws:
IOException - if zero items.

public List<InputSplit> getSplits(JobContext job) throws IOException
Generate the list of files and make them into FileSplits.
Specified by:
getSplits in class InputFormat<K,V>
Parameters:
job - job configuration.
Returns:
InputSplits for the job.
Throws:
IOException

protected long computeSplitSize(long blockSize, long minSize, long maxSize)

protected int getBlockIndex(BlockLocation[] blkLocations, long offset)
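
This page gives no description for computeSplitSize. As an assumption that is not stated here, the method is commonly understood to clamp the block size between the minimum and maximum split sizes, that is max(minSize, min(maxSize, blockSize)). The plain-Java sketch below shows that clamp. SplitSizeSketch is a hypothetical class name, and the formula should be checked against the Hadoop release in use.

```java
/** Plain-Java sketch of the clamp that computeSplitSize is assumed to apply. */
final class SplitSizeSketch {

    /** Assumed formula: max(minSize, min(maxSize, blockSize)). */
    static long computeSplitSize(long blockSize, long minSize, long maxSize) {
        return Math.max(minSize, Math.min(maxSize, blockSize));
    }

    public static void main(String[] args) {
        long mib = 1024L * 1024L;
        long blockSize = 128L * mib; // illustrative block size

        // Wide bounds: the block size is used unchanged.
        System.out.println(computeSplitSize(blockSize, 1L, Long.MAX_VALUE));
        // A maximum below the block size caps the split size.
        System.out.println(computeSplitSize(blockSize, 1L, 32L * mib));
        // A minimum above the block size raises the split size.
        System.out.println(computeSplitSize(blockSize, 256L * mib, Long.MAX_VALUE));
    }
}
```

Under this assumption the minimum takes precedence when minSize exceeds maxSize, because the outer operation is the max.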

public static void setInputPaths(Job job, String commaSeparatedPaths) throws IOException
Set the given comma-separated paths as the list of inputs for the map-reduce job.
Parameters:
job - the job
commaSeparatedPaths - comma-separated paths to be set as the list of inputs for the map-reduce job.
Throws:
IOException

public static void addInputPaths(Job job, String commaSeparatedPaths) throws IOException
Add the given comma-separated paths to the list of inputs for the map-reduce job.
Parameters:
job - the job to modify
commaSeparatedPaths - comma-separated paths to be added to the list of inputs for the map-reduce job.
Throws:
IOException

public static void setInputPaths(Job job, Path... inputPaths) throws IOException
Set the array of Paths as the list of inputs for the map-reduce job.
Parameters:
job - the job to modify
inputPaths - the Paths of the input directories/files for the map-reduce job.
Throws:
IOException

public static void addInputPath(Job job, Path path) throws IOException
Add a Path to the list of inputs for the map-reduce job.
Parameters:
job - the Job to modify
path - Path to be added to the list of inputs for the map-reduce job.
Throws:
IOException

public static Path[] getInputPaths(JobContext context)
Get the list of input Paths for the map-reduce job.
Parameters:
context - the job
Returns:
the list of input Paths for the map-reduce job.

Copyright © 2009 The Apache Software Foundation