org.apache.pig.piggybank.storage.avro
Class PigAvroInputFormat
java.lang.Object
org.apache.hadoop.mapreduce.InputFormat<K,V>
org.apache.hadoop.mapreduce.lib.input.FileInputFormat<org.apache.hadoop.io.NullWritable,org.apache.hadoop.io.Writable>
org.apache.pig.piggybank.storage.avro.PigAvroInputFormat
public class PigAvroInputFormat
extends org.apache.hadoop.mapreduce.lib.input.FileInputFormat<org.apache.hadoop.io.NullWritable,org.apache.hadoop.io.Writable>
The InputFormat for Avro data.
Nested classes/interfaces inherited from class org.apache.hadoop.mapreduce.lib.input.FileInputFormat
org.apache.hadoop.mapreduce.lib.input.FileInputFormat.Counter
Constructor Summary
PigAvroInputFormat()
    Empty constructor.
PigAvroInputFormat(org.apache.avro.Schema readerSchema,
                   boolean ignoreBadFiles,
                   Map<org.apache.hadoop.fs.Path,Map<Integer,Integer>> schemaToMergedSchemaMap,
                   boolean useMultipleSchemas)
    Constructor called by AvroStorage to pass in the schema and ignoreBadFiles.
Method Summary
org.apache.hadoop.mapreduce.RecordReader<org.apache.hadoop.io.NullWritable,org.apache.hadoop.io.Writable>
createRecordReader(org.apache.hadoop.mapreduce.InputSplit split,
                   org.apache.hadoop.mapreduce.TaskAttemptContext context)
    Create and return an Avro record reader.
protected List<org.apache.hadoop.fs.FileStatus>
listStatus(org.apache.hadoop.mapreduce.JobContext job)
Methods inherited from class org.apache.hadoop.mapreduce.lib.input.FileInputFormat
addInputPath, addInputPaths, computeSplitSize, getBlockIndex, getFormatMinSplitSize, getInputPathFilter, getInputPaths, getMaxSplitSize, getMinSplitSize, getSplits, isSplitable, setInputPathFilter, setInputPaths, setInputPaths, setMaxInputSplitSize, setMinInputSplitSize
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
PigAvroInputFormat
public PigAvroInputFormat()
- Empty constructor.
PigAvroInputFormat
public PigAvroInputFormat(org.apache.avro.Schema readerSchema,
boolean ignoreBadFiles,
Map<org.apache.hadoop.fs.Path,Map<Integer,Integer>> schemaToMergedSchemaMap,
boolean useMultipleSchemas)
- Constructor called by AvroStorage to pass in the schema and ignoreBadFiles.
- Parameters:
  readerSchema - reader schema
  ignoreBadFiles - whether to ignore corrupted files during load
  schemaToMergedSchemaMap - map that associates each input record with a remapping of its fields relative to the merged schema
  useMultipleSchemas - whether multiple input schemas are in use
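The schemaToMergedSchemaMap parameter encodes, for each input file, how that file's field positions map onto the positions of the merged schema. A minimal sketch of that structure, using plain Strings in place of org.apache.hadoop.fs.Path so it runs without Hadoop on the classpath (the file names and field layouts below are invented for illustration):

```java
import java.util.HashMap;
import java.util.Map;

public class SchemaRemapDemo {
    // Illustrative stand-in for Map<Path, Map<Integer, Integer>>:
    // for each input file, maps the file's own field index -> index in the merged schema.
    static Map<String, Map<Integer, Integer>> schemaToMergedSchemaMap = new HashMap<>();

    public static void main(String[] args) {
        // Suppose the merged schema is (id, name, age).

        // File a.avro has fields (id, name): positions 0 and 1 map straight through.
        Map<Integer, Integer> remapA = new HashMap<>();
        remapA.put(0, 0);
        remapA.put(1, 1);
        schemaToMergedSchemaMap.put("a.avro", remapA);

        // File b.avro has fields (age, id): its field 0 (age) lands at merged
        // position 2, and its field 1 (id) at merged position 0.
        Map<Integer, Integer> remapB = new HashMap<>();
        remapB.put(0, 2);
        remapB.put(1, 0);
        schemaToMergedSchemaMap.put("b.avro", remapB);

        System.out.println(schemaToMergedSchemaMap.get("b.avro").get(0)); // prints 2
    }
}
```

A record reader consulting such a map can place each field of a record read from b.avro into the correct slot of a tuple laid out in merged-schema order.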
createRecordReader
public org.apache.hadoop.mapreduce.RecordReader<org.apache.hadoop.io.NullWritable,org.apache.hadoop.io.Writable> createRecordReader(org.apache.hadoop.mapreduce.InputSplit split,
org.apache.hadoop.mapreduce.TaskAttemptContext context)
throws IOException,
InterruptedException
- Create and return an Avro record reader.
It uses the input schema passed in to the
constructor.
- Specified by:
createRecordReader
in class org.apache.hadoop.mapreduce.InputFormat<org.apache.hadoop.io.NullWritable,org.apache.hadoop.io.Writable>
- Throws:
IOException
InterruptedException
listStatus
protected List<org.apache.hadoop.fs.FileStatus> listStatus(org.apache.hadoop.mapreduce.JobContext job)
throws IOException
- Overrides:
listStatus
in class org.apache.hadoop.mapreduce.lib.input.FileInputFormat<org.apache.hadoop.io.NullWritable,org.apache.hadoop.io.Writable>
- Throws:
IOException
Copyright © 2007-2012 The Apache Software Foundation