org.apache.pig
Interface Accumulator<T>
- All Known Subinterfaces:
- TerminatingAccumulator<T>
- All Known Implementing Classes:
- AccumulatorEvalFunc, AlgebraicBigDecimalMathBase, AlgebraicBigIntegerMathBase, AlgebraicByteArrayMathBase, AlgebraicDoubleMathBase, AlgebraicEvalFunc, AlgebraicFloatMathBase, AlgebraicIntMathBase, AlgebraicLongMathBase, AVG, BigDecimalAvg, BigDecimalMax, BigDecimalMin, BigDecimalSum, BigIntegerAvg, BigIntegerMax, BigIntegerMin, BigIntegerSum, COUNT, COUNT_STAR, DateTimeMax, DateTimeMin, DoubleAvg, DoubleMax, DoubleMin, DoubleSum, ExtremalTupleByNthField, FloatAvg, FloatMax, FloatMin, FloatSum, GroovyAccumulatorEvalFunc, GroovyAlgebraicEvalFunc, GroovyAlgebraicEvalFunc.BigDecimalGroovyAlgebraicEvalFunc, GroovyAlgebraicEvalFunc.BigIntegerGroovyAlgebraicEvalFunc, GroovyAlgebraicEvalFunc.BooleanGroovyAlgebraicEvalFunc, GroovyAlgebraicEvalFunc.ChararrayGroovyAlgebraicEvalFunc, GroovyAlgebraicEvalFunc.DataBagGroovyAlgebraicEvalFunc, GroovyAlgebraicEvalFunc.DataByteArrayGroovyAlgebraicEvalFunc, GroovyAlgebraicEvalFunc.DateTimeGroovyAlgebraicEvalFunc, GroovyAlgebraicEvalFunc.DoubleGroovyAlgebraicEvalFunc, GroovyAlgebraicEvalFunc.FloatGroovyAlgebraicEvalFunc, GroovyAlgebraicEvalFunc.IntegerGroovyAlgebraicEvalFunc, GroovyAlgebraicEvalFunc.LongGroovyAlgebraicEvalFunc, GroovyAlgebraicEvalFunc.MapGroovyAlgebraicEvalFunc, GroovyAlgebraicEvalFunc.TupleGroovyAlgebraicEvalFunc, IntAvg, IntMax, IntMin, IntSum, IteratingAccumulatorEvalFunc, JrubyAccumulatorEvalFunc, JrubyAlgebraicEvalFunc, JrubyAlgebraicEvalFunc.BagJrubyAlgebraicEvalFunc, JrubyAlgebraicEvalFunc.ChararrayJrubyAlgebraicEvalFunc, JrubyAlgebraicEvalFunc.DataByteArrayJrubyAlgebraicEvalFunc, JrubyAlgebraicEvalFunc.DoubleJrubyAlgebraicEvalFunc, JrubyAlgebraicEvalFunc.FloatJrubyAlgebraicEvalFunc, JrubyAlgebraicEvalFunc.IntegerJrubyAlgebraicEvalFunc, JrubyAlgebraicEvalFunc.LongJrubyAlgebraicEvalFunc, JrubyAlgebraicEvalFunc.MapJrubyAlgebraicEvalFunc, JrubyAlgebraicEvalFunc.TupleJrubyAlgebraicEvalFunc, LongAvg, LongMax, LongMin, LongSum, MAX, MIN, StringMax, StringMin, SUM
@InterfaceAudience.Public
@InterfaceStability.Stable
public interface Accumulator<T>
An interface that allows UDFs that take a bag to accumulate tuples in chunks rather than receive
the whole set at once. It is intended for UDFs that do not need to see all of the tuples
together but cannot be used with the combiner. Processing in chunks lowers memory needs, avoids
spilling large bags, and thus speeds up the query. Session analysis is one example: it cannot be
used with the combiner because all of its inputs must first be ordered, but it does not need to
see all the tuples at once. UDF implementors might also choose to implement this interface so
that the accumulator can still be used when other UDFs in the same FOREACH implement it.
- Since:
- Pig 0.6
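The session-analysis case described above can be sketched roughly as follows. This is a simplified illustration that runs without Pig on the classpath: `ChunkedAccumulator`, `SessionCounter`, and the `gapMillis` threshold are hypothetical names, and a plain `List<Long>` of ordered timestamps stands in for the Pig tuple that wraps a bag. The point it tries to show is that state (here, the last timestamp seen) has to survive between calls to `accumulate`, since one key's tuples may arrive in several chunks.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for org.apache.pig.Accumulator<T>. The real accumulate()
// takes a Pig Tuple whose single field is a bag and declares IOException; a plain
// List<Long> of ordered timestamps is used here so the sketch runs without Pig.
interface ChunkedAccumulator<T> {
    void accumulate(List<Long> chunk);
    T getValue();
    void cleanup();
}

// Counts sessions for one key: a new session starts whenever the gap between
// consecutive timestamps exceeds gapMillis (an assumed, illustrative threshold).
class SessionCounter implements ChunkedAccumulator<Integer> {
    private final long gapMillis;
    private long lastTimestamp;      // carried across chunks
    private boolean seenAny = false; // false until the first timestamp of a key
    private int sessions = 0;

    SessionCounter(long gapMillis) {
        this.gapMillis = gapMillis;
    }

    @Override
    public void accumulate(List<Long> chunk) {
        for (long ts : chunk) {
            if (!seenAny || ts - lastTimestamp > gapMillis) {
                sessions++;
            }
            lastTimestamp = ts;
            seenAny = true;
        }
    }

    @Override
    public Integer getValue() {
        return sessions;
    }

    @Override
    public void cleanup() {
        // Reset all per-key state so the next key starts from scratch.
        seenAny = false;
        sessions = 0;
    }
}

class SessionCounterDemo {
    public static void main(String[] args) {
        SessionCounter counter = new SessionCounter(30L);
        counter.accumulate(Arrays.asList(0L, 10L, 20L)); // first chunk
        counter.accumulate(Arrays.asList(25L, 100L));    // second chunk, same key
        System.out.println(counter.getValue());          // prints 2
        counter.cleanup();
    }
}
```

Because the timestamps must arrive in order, partial results from unordered subsets cannot simply be merged, which is why a computation like this does not fit the combiner.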
Method Summary
- void accumulate(Tuple b): Pass tuples to the UDF.
- void cleanup(): Called after getValue() to prepare processing for next key.
- T getValue(): Called when all tuples from current key have been passed to accumulate.
accumulate
void accumulate(Tuple b)
throws IOException
- Pass the next chunk of tuples for the current key to the UDF. This may be called several times for the same key.
- Parameters:
b
- A tuple containing a single field, which is a bag. The bag will contain the set
of tuples being passed to the UDF in this iteration.
- Throws:
IOException
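The shape of the `b` parameter (an outer tuple whose only field is a bag of inner tuples) can be illustrated with plain collections. In the sketch below a `List<Object>` stands in for a tuple and a `List<List<Object>>` for a bag; these stand-ins and the names `BagUnwrapSketch` and `sumFirstField` are assumptions made so the example runs without Pig. In an actual UDF the corresponding step would be reading field 0 of the `Tuple`, casting it to `DataBag`, and iterating over the tuples it contains.

```java
import java.util.Arrays;
import java.util.List;

class BagUnwrapSketch {
    // Stand-ins (not Pig types): a "tuple" is a List<Object>, a "bag" is a list of tuples.
    // Sums the first field of every inner tuple in the chunk.
    static long sumFirstField(List<Object> b) {
        // Field 0 of the outer tuple is the bag holding this iteration's tuples.
        @SuppressWarnings("unchecked")
        List<List<Object>> bag = (List<List<Object>>) b.get(0);
        long sum = 0;
        for (List<Object> t : bag) {
            sum += ((Number) t.get(0)).longValue();
        }
        return sum;
    }

    public static void main(String[] args) {
        List<Object> t1 = Arrays.<Object>asList(3);
        List<Object> t2 = Arrays.<Object>asList(4);
        List<List<Object>> bag = Arrays.asList(t1, t2);
        List<Object> b = Arrays.<Object>asList(bag); // outer tuple with a single field
        System.out.println(sumFirstField(b));        // prints 7
    }
}
```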
getValue
T getValue()
- Called when all tuples from the current key have been passed to accumulate.
- Returns:
- the value for the UDF for this key.
cleanup
void cleanup()
- Called after getValue() to prepare processing for the next key.
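Taken together, the three methods imply a per-key call order: `accumulate` one or more times, then `getValue` once, then `cleanup` before the next key. The sketch below mimics that order with a hand-written driver. It is not Pig's execution engine: `KeyedAccumulator`, `RunningSum`, `LifecycleDriver`, and the `chunkSize` parameter are hypothetical, and plain lists stand in for Pig tuples and bags.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for org.apache.pig.Accumulator<T>; chunks are plain lists here.
interface KeyedAccumulator<T> {
    void accumulate(List<Integer> chunk);
    T getValue();
    void cleanup();
}

// Sums all values seen for the current key.
class RunningSum implements KeyedAccumulator<Long> {
    private long sum = 0;

    @Override
    public void accumulate(List<Integer> chunk) {
        for (int v : chunk) {
            sum += v;
        }
    }

    @Override
    public Long getValue() {
        return sum;
    }

    @Override
    public void cleanup() {
        sum = 0; // without this, the next key would inherit the previous total
    }
}

class LifecycleDriver {
    // Feeds each key's values in chunks of chunkSize, then reads the result and resets.
    static <T> Map<String, T> run(Map<String, List<Integer>> grouped,
                                  KeyedAccumulator<T> acc, int chunkSize) {
        Map<String, T> results = new LinkedHashMap<>();
        for (Map.Entry<String, List<Integer>> e : grouped.entrySet()) {
            List<Integer> values = e.getValue();
            for (int i = 0; i < values.size(); i += chunkSize) {
                int end = Math.min(i + chunkSize, values.size());
                acc.accumulate(values.subList(i, end));
            }
            results.put(e.getKey(), acc.getValue());
            acc.cleanup();
        }
        return results;
    }

    public static void main(String[] args) {
        Map<String, List<Integer>> grouped = new LinkedHashMap<>();
        grouped.put("a", Arrays.asList(1, 2, 3));
        grouped.put("b", Arrays.asList(10, 20));
        System.out.println(run(grouped, new RunningSum(), 2)); // prints {a=6, b=30}
    }
}
```

The same accumulator instance is reused across keys in this sketch, which is why `cleanup` has to clear every piece of per-key state.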
Copyright © 2007-2012 The Apache Software Foundation