org.apache.pig.backend.hadoop.executionengine.mapReduceLayer
Class PigGenericMapBase
java.lang.Object
org.apache.hadoop.mapreduce.Mapper<org.apache.hadoop.io.Text,Tuple,PigNullableWritable,org.apache.hadoop.io.Writable>
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase
- Direct Known Subclasses:
- org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase
public abstract class PigGenericMapBase
- extends org.apache.hadoop.mapreduce.Mapper<org.apache.hadoop.io.Text,Tuple,PigNullableWritable,org.apache.hadoop.io.Writable>
This class is the base class for PigMapBase, whose implementation differs
slightly between Hadoop versions. The PigMapBase implementation
is located in $PIG_HOME/shims.
Nested classes/interfaces inherited from class org.apache.hadoop.mapreduce.Mapper
org.apache.hadoop.mapreduce.Mapper.Context
Method Summary
void
cleanup(org.apache.hadoop.mapreduce.Mapper.Context context)
Called when all the tuples in the input
have been processed.
abstract void
collect(org.apache.hadoop.mapreduce.Mapper.Context oc,
Tuple tuple)
abstract org.apache.hadoop.mapreduce.Mapper.Context
getIllustratorContext(org.apache.hadoop.conf.Configuration conf,
DataBag input,
List<Pair<PigNullableWritable,org.apache.hadoop.io.Writable>> output,
org.apache.hadoop.mapreduce.InputSplit split)
byte
getKeyType()
abstract boolean
inIllustrator(org.apache.hadoop.mapreduce.Mapper.Context context)
protected void
map(org.apache.hadoop.io.Text key,
Tuple inpTuple,
org.apache.hadoop.mapreduce.Mapper.Context context)
The map function that attaches the inpTuple appropriately
and executes the map plan if it is not empty.
protected void
runPipeline(PhysicalOperator leaf)
void
setKeyType(byte keyType)
void
setMapPlan(PhysicalPlan plan)
Sets the map plan. Used for local map/reduce simulation.
void
setup(org.apache.hadoop.mapreduce.Mapper.Context context)
Configures the mapper with the map plan and the
reporter thread.
Methods inherited from class org.apache.hadoop.mapreduce.Mapper
run
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
keyType
protected byte keyType
mp
protected PhysicalPlan mp
stores
protected List<POStore> stores
tf
protected TupleFactory tf
errorInMap
protected boolean errorInMap
PigGenericMapBase
public PigGenericMapBase()
setMapPlan
public void setMapPlan(PhysicalPlan plan)
- Sets the map plan. Used for local map/reduce simulation.
- Parameters:
plan
- the map plan
cleanup
public void cleanup(org.apache.hadoop.mapreduce.Mapper.Context context)
throws IOException,
InterruptedException
- Called when all the tuples in the input
have been processed, so the reporter thread should be closed.
- Overrides:
cleanup
in class org.apache.hadoop.mapreduce.Mapper<org.apache.hadoop.io.Text,Tuple,PigNullableWritable,org.apache.hadoop.io.Writable>
- Throws:
IOException
InterruptedException
setup
public void setup(org.apache.hadoop.mapreduce.Mapper.Context context)
throws IOException,
InterruptedException
- Configures the mapper with the map plan and the
reporter thread.
- Overrides:
setup
in class org.apache.hadoop.mapreduce.Mapper<org.apache.hadoop.io.Text,Tuple,PigNullableWritable,org.apache.hadoop.io.Writable>
- Throws:
IOException
InterruptedException
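The order in which setup, map and cleanup are invoked can be illustrated with a minimal sketch. It assumes the usual Hadoop Mapper life cycle, in which the inherited run method calls setup once, map once per input record, and cleanup once at the end. The classes LifecycleSketch and SketchMapper, the events list, and the String records are hypothetical stand-ins for illustration; they are not Pig or Hadoop types, and the real methods take a Mapper.Context.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in types; no Hadoop or Pig classes are used here.
class LifecycleSketch {

    static class SketchMapper {
        // Records the order of life-cycle calls so it can be inspected.
        final List<String> events = new ArrayList<>();

        // Stand-in for setup(Context): configure the mapper before any input.
        void setup() {
            events.add("setup");
        }

        // Stand-in for map(key, inpTuple, context): called once per record.
        void map(String record) {
            events.add("map:" + record);
        }

        // Stand-in for cleanup(Context): called after the last record.
        void cleanup() {
            events.add("cleanup");
        }

        // Assumed call order of the inherited run method.
        void run(List<String> records) {
            setup();
            for (String record : records) {
                map(record);
            }
            cleanup();
        }
    }

    public static void main(String[] args) {
        SketchMapper mapper = new SketchMapper();
        mapper.run(Arrays.asList("t1", "t2"));
        System.out.println(mapper.events);
    }
}
```

Resources acquired in setup, such as the reporter mentioned above, are released in cleanup because it is the last call in this sequence.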
map
protected void map(org.apache.hadoop.io.Text key,
Tuple inpTuple,
org.apache.hadoop.mapreduce.Mapper.Context context)
throws IOException,
InterruptedException
- The map function that attaches the inpTuple appropriately
and executes the map plan if it is not empty. Collects the
result of execution into oc, or collects the input directly into oc
if the map plan is empty. The collection is left abstract for the
map-only or map-reduce job to implement. Map-only collects
the tuple as-is, whereas map-reduce collects it after extracting
the key and indexed tuple.
- Overrides:
map
in class org.apache.hadoop.mapreduce.Mapper<org.apache.hadoop.io.Text,Tuple,PigNullableWritable,org.apache.hadoop.io.Writable>
- Throws:
IOException
InterruptedException
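The division of work described for map, a fixed control flow with an abstract collect step, can be sketched as follows. This is a simplified illustration under stated assumptions, not the actual Pig implementation: MapFlowSketch, SketchMapBase, MapOnly and MapReduce are hypothetical names, tuples are modeled as lists of strings, the map plan is modeled as a function (null meaning an empty plan), and the rule that field 0 is the key is invented for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.function.Function;

// Hypothetical stand-in types; no Hadoop or Pig classes are used here.
class MapFlowSketch {

    // Stand-in for the mapper base: map() is fixed, collect() is abstract.
    abstract static class SketchMapBase {
        // Stand-in for the map plan; null models an empty plan.
        private final Function<List<String>, List<List<String>>> plan;
        final List<String> emitted = new ArrayList<>();

        SketchMapBase(Function<List<String>, List<List<String>>> plan) {
            this.plan = plan;
        }

        final void map(List<String> inpTuple) {
            if (plan == null) {
                // Empty plan: the input is collected directly.
                collect(inpTuple);
                return;
            }
            // Non-empty plan: every result of running the plan is collected.
            for (List<String> result : plan.apply(inpTuple)) {
                collect(result);
            }
        }

        abstract void collect(List<String> tuple);
    }

    // Map-only job: the tuple is collected as-is.
    static class MapOnly extends SketchMapBase {
        MapOnly(Function<List<String>, List<List<String>>> plan) {
            super(plan);
        }

        @Override
        void collect(List<String> tuple) {
            emitted.add(String.join(",", tuple));
        }
    }

    // Map-reduce job: a key is extracted before collecting.
    static class MapReduce extends SketchMapBase {
        MapReduce(Function<List<String>, List<List<String>>> plan) {
            super(plan);
        }

        @Override
        void collect(List<String> tuple) {
            // Assumption for illustration only: field 0 is the key.
            String key = tuple.get(0);
            String value = String.join(",", tuple.subList(1, tuple.size()));
            emitted.add(key + " -> " + value);
        }
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList("a", "1", "2");

        MapOnly mapOnly = new MapOnly(null); // empty plan
        mapOnly.map(input);
        System.out.println(mapOnly.emitted);

        MapReduce mapReduce = new MapReduce(t -> Collections.singletonList(t));
        mapReduce.map(input);
        System.out.println(mapReduce.emitted);
    }
}
```

Keeping collect abstract lets the same map logic serve both job types; only the final emission step differs between subclasses.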
runPipeline
protected void runPipeline(PhysicalOperator leaf)
throws IOException,
InterruptedException
- Throws:
IOException
InterruptedException
collect
public abstract void collect(org.apache.hadoop.mapreduce.Mapper.Context oc,
Tuple tuple)
throws InterruptedException,
IOException
- Throws:
InterruptedException
IOException
inIllustrator
public abstract boolean inIllustrator(org.apache.hadoop.mapreduce.Mapper.Context context)
getKeyType
public byte getKeyType()
- Returns:
- the keyType
setKeyType
public void setKeyType(byte keyType)
- Parameters:
keyType
- the keyType to set
getIllustratorContext
public abstract org.apache.hadoop.mapreduce.Mapper.Context getIllustratorContext(org.apache.hadoop.conf.Configuration conf,
DataBag input,
List<Pair<PigNullableWritable,org.apache.hadoop.io.Writable>> output,
org.apache.hadoop.mapreduce.InputSplit split)
throws IOException,
InterruptedException
- Throws:
IOException
InterruptedException
Copyright © 2007-2012 The Apache Software Foundation