org.apache.pig.piggybank.evaluation
Class Stitch
java.lang.Object
org.apache.pig.EvalFunc<DataBag>
org.apache.pig.piggybank.evaluation.Stitch
public class Stitch
- extends EvalFunc<DataBag>
Given a set of bags, stitches them together tuple by tuple. That is,
treating each tuple's position in its bag as a row number, the bags are
joined by row number. For example, given the two bags
{(1, 2), (3, 4)} and
{(5, 6), (7, 8)} the result will be
{(1, 2, 5, 6), (3, 4, 7, 8)}
In general it is assumed that each bag has the same number of tuples.
The implementation uses the first bag to determine the number of tuples
placed in the output. If a bag beyond the first has fewer tuples, the
output tuples at the positions it lacks will have fewer fields. Nulls
will not be filled in.
Any number of bags can be passed to this function.
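The following is a minimal sketch of the row-wise stitching described above. It uses plain java.util lists instead of Pig's DataBag and Tuple types so that it runs without Pig on the classpath. The class name StitchSketch and its stitch method are hypothetical and are not the piggybank source. Ignoring extra tuples in bags longer than the first is an assumption, drawn from the statement that the first bag determines the number of output tuples.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-in for the behaviour documented above; not the Pig implementation.
public class StitchSketch {

    // Each "bag" is modelled as a list of tuples, and each tuple as a list of fields.
    public static List<List<Object>> stitch(List<List<List<Object>>> bags) {
        List<List<Object>> out = new ArrayList<>();
        if (bags.isEmpty()) {
            return out;
        }
        // Keep one iterator per bag beyond the first so all bags advance in step.
        List<Iterator<List<Object>>> rest = new ArrayList<>();
        for (int i = 1; i < bags.size(); i++) {
            rest.add(bags.get(i).iterator());
        }
        // The first bag drives the loop, so it fixes the number of output tuples.
        for (List<Object> first : bags.get(0)) {
            List<Object> row = new ArrayList<>(first);
            for (Iterator<List<Object>> it : rest) {
                // An exhausted bag contributes nothing: the row is shorter, with no null padding.
                if (it.hasNext()) {
                    row.addAll(it.next());
                }
            }
            out.add(row);
        }
        return out;
    }

    public static void main(String[] args) {
        List<List<Object>> a = Arrays.asList(
                Arrays.<Object>asList(1, 2), Arrays.<Object>asList(3, 4));
        List<List<Object>> b = Arrays.asList(
                Arrays.<Object>asList(5, 6), Arrays.<Object>asList(7, 8));
        // prints [[1, 2, 5, 6], [3, 4, 7, 8]]
        System.out.println(stitch(Arrays.asList(a, b)));
    }
}
```

If the second bag held only (5, 6), the sketch would return two tuples, (1, 2, 5, 6) and (3, 4), matching the note that missing positions are not padded with nulls.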
Methods inherited from class org.apache.pig.EvalFunc
finish, getArgToFuncMapping, getCacheFiles, getInputSchema, getLogger, getPigLogger, getReporter, getReturnType, getSchemaName, getSchemaType, isAsynchronous, progress, setInputSchema, setPigLogger, setReporter, setUDFContextSignature, warn
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Stitch
public Stitch()
exec
public DataBag exec(Tuple input)
throws IOException
- Description copied from class:
EvalFunc
- This callback method must be implemented by all subclasses. This
is the method that will be invoked on every Tuple of a given dataset.
Since the dataset may be divided up in a variety of ways the programmer
should not make assumptions about state that is maintained between
invocations of this method.
- Specified by:
exec
in class EvalFunc<DataBag>
- Parameters:
input - the Tuple to be processed.
- Returns:
- result, of type T (for this class, DataBag).
- Throws:
IOException
outputSchema
public Schema outputSchema(Schema inputSch)
- Description copied from class:
EvalFunc
- Report the schema of the output of this UDF. Pig will make use of
this in error checking, optimization, and planning. The schema
of input data to this UDF is provided.
The default implementation interprets the OutputSchema annotation,
if one is present. Otherwise, it returns null (no known output schema).
- Overrides:
outputSchema
in class EvalFunc<DataBag>
- Parameters:
inputSch - Schema of the input
- Returns:
- Schema of the output
Copyright © 2007-2012 The Apache Software Foundation