java.lang.Object
  org.apache.pig.EvalFunc<T>
@InterfaceAudience.Public @InterfaceStability.Stable public abstract class EvalFunc<T>
The class is used to implement functions to be applied to fields in a dataset. The function is applied to each Tuple in the set. The programmer should not make assumptions about state maintained between invocations of the exec() method since the Pig runtime will schedule and localize invocations based on information provided at runtime. The programmer also should not make assumptions about when or how many times the class will be instantiated, since it may be instantiated multiple times in both the front and back end.
Nested Class Summary | |
---|---|
static class | EvalFunc.SchemaType - EvalFunc's schema type. |
Field Summary | |
---|---|
protected org.apache.commons.logging.Log | log - Logging object. |
protected PigLogger | pigLogger - Logger for aggregating warnings. |
protected PigProgressable | reporter - Reporter to send heartbeats to Hadoop. |
protected Type | returnType - Return type of this instance of EvalFunc. |
Constructor Summary | |
---|---|
EvalFunc() |
Method Summary | |
---|---|
abstract T | exec(Tuple input) - This callback method must be implemented by all subclasses. |
void | finish() - Placeholder for cleanup to be performed at the end. |
List<FuncSpec> | getArgToFuncMapping() - Allow a UDF to specify type specific implementations of itself. |
List<String> | getCacheFiles() - Allow a UDF to specify a list of files it would like placed in the distributed cache. |
Schema | getInputSchema() - This method is intended to be called by the user in EvalFunc to get the input schema of the EvalFunc. |
org.apache.commons.logging.Log | getLogger() |
PigLogger | getPigLogger() |
PigProgressable | getReporter() |
Type | getReturnType() - Get the Type that this EvalFunc returns. |
protected String | getSchemaName(String name, Schema input) |
EvalFunc.SchemaType | getSchemaType() - Returns the EvalFunc.SchemaType of the EvalFunc. |
boolean | isAsynchronous() - Deprecated. |
Schema | outputSchema(Schema input) - Report the schema of the output of this UDF. |
void | progress() - Utility method to allow UDF to report progress. |
void | setInputSchema(Schema input) - This method is for internal use. |
void | setPigLogger(PigLogger pigLogger) - Set the PigLogger object. |
void | setReporter(PigProgressable reporter) - Set the reporter. |
void | setUDFContextSignature(String signature) - This method will be called by Pig both in the front end and back end to pass a unique signature to the EvalFunc. |
void | warn(String msg, Enum warningEnum) - Issue a warning. |
Methods inherited from class java.lang.Object |
---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
Field Detail |
---|
protected PigProgressable reporter
Reporter to send heartbeats to Hadoop. PigProgressable.progress() should be called occasionally to avoid timeouts. Default Hadoop timeout is 600 seconds.
protected org.apache.commons.logging.Log log
protected PigLogger pigLogger
Logger for aggregating warnings. See PigLogger.warn(java.lang.Object, java.lang.String, java.lang.Enum).
protected Type returnType
Constructor Detail |
---|
public EvalFunc()
Method Detail |
---|
protected String getSchemaName(String name, Schema input)
public Type getReturnType()
public final void progress()
Utility method to allow UDF to report progress. PigProgressable.progress() should be called occasionally to avoid timeouts. Default Hadoop timeout is 600 seconds.
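One way to follow this advice is to report progress every fixed number of iterations inside a long-running loop. The sketch below is illustrative only: `Progressable` is a hypothetical stand-in for Pig's `PigProgressable` so the block compiles without Pig on the classpath, and `LongRunningSum`, `sum`, and `REPORT_EVERY` are made-up names. Inside a real `EvalFunc`, the loop would call the inherited `progress()` method instead.

```java
// Hypothetical stand-in for Pig's PigProgressable, reduced to the single
// method this sketch needs.
interface Progressable {
    void progress();
}

class LongRunningSum {
    // Report once per this many items; an illustrative value, not a Pig default.
    static final int REPORT_EVERY = 1000;

    private final Progressable reporter;

    LongRunningSum(Progressable reporter) {
        this.reporter = reporter;
    }

    long sum(long[] values) {
        long total = 0;
        for (int i = 0; i < values.length; i++) {
            total += values[i];
            // Send a heartbeat periodically so a slow computation is not
            // mistaken for a hung task.
            if (reporter != null && (i + 1) % REPORT_EVERY == 0) {
                reporter.progress();
            }
        }
        return total;
    }
}
```

Reporting on a fixed interval rather than on every item keeps the overhead of the heartbeat small while still staying well inside the timeout.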
public final void warn(String msg, Enum warningEnum)
Issue a warning.
Parameters:
msg - String message of the warning
warningEnum - type of warning
public void finish()
Placeholder for cleanup to be performed at the end.
public abstract T exec(Tuple input) throws IOException
This callback method must be implemented by all subclasses.
Parameters:
input - the Tuple to be processed.
Throws:
IOException
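As a rough illustration of the exec() contract, the sketch below implements a null-safe upper-casing function. To keep the block compilable without Pig on the classpath, `SketchEvalFunc` and the use of `List<Object>` in place of `Tuple` are hypothetical stand-ins, not Pig types, and `Upper` is an illustrative name. A real UDF would extend `org.apache.pig.EvalFunc<String>` and read its argument from the `Tuple` passed to `exec`. The function keeps no state between calls, in line with the class description.

```java
import java.io.IOException;
import java.util.List;
import java.util.Locale;

// Hypothetical stand-in for org.apache.pig.EvalFunc<T>, reduced to the one
// abstract method. A List<Object> plays the role of Pig's Tuple here.
abstract class SketchEvalFunc<T> {
    public abstract T exec(List<Object> input) throws IOException;
}

// Upper-cases the first field of each input record.
class Upper extends SketchEvalFunc<String> {
    @Override
    public String exec(List<Object> input) throws IOException {
        // Treat a missing record or a null first field as "no result".
        if (input == null || input.isEmpty() || input.get(0) == null) {
            return null;
        }
        Object field = input.get(0);
        if (!(field instanceof String)) {
            // Surface unexpected input as IOException, matching exec's signature.
            throw new IOException(
                "Expected a String field, got " + field.getClass().getName());
        }
        // The result depends only on the current input; nothing is cached
        // on the instance between invocations.
        return ((String) field).toUpperCase(Locale.ROOT);
    }
}
```

For example, `new Upper().exec(java.util.Arrays.<Object>asList("pig"))` returns `"PIG"`, while a record whose first field is null yields null.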
public Schema outputSchema(Schema input)
Report the schema of the output of this UDF. The default implementation interprets the OutputSchema annotation, if one is present. Otherwise, it returns null (no known output schema).
Parameters:
input - Schema of the input
@Deprecated public boolean isAsynchronous()
public PigProgressable getReporter()
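public final void setReporter(PigProgressable reporter)
Set the reporter.
Parameters:
reporter - Hadoop reporter
public List<FuncSpec> getArgToFuncMapping() throws FrontendException
Allow a UDF to specify type specific implementations of itself.
Throws:
FrontendException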
public List<String> getCacheFiles()
public PigLogger getPigLogger()
public final void setPigLogger(PigLogger pigLogger)
Set the PigLogger object.
Parameters:
pigLogger - PigLogger object.
public org.apache.commons.logging.Log getLogger()
public void setUDFContextSignature(String signature)
This method will be called by Pig both in the front end and back end to pass a unique signature to the EvalFunc. The signature can be used to store into the UDFContext any information which the EvalFunc needs to store between various method invocations in the front end and back end.
Parameters:
signature - a unique signature to identify this EvalFunc
public void setInputSchema(Schema input)
This method is for internal use.
public Schema getInputSchema()
This method is intended to be called by the user in EvalFunc to get the input schema of the EvalFunc.
public EvalFunc.SchemaType getSchemaType()
Returns the EvalFunc.SchemaType of the EvalFunc. User defined functions can override this method to return EvalFunc.SchemaType.VARARG. In this case the last FieldSchema added to the Schema in getArgToFuncMapping() will be considered as a vararg field.