java.lang.Object
  org.apache.pig.builtin.JsonMetadata
public class JsonMetadata extends Object implements LoadMetadata, StoreMetadata
Reads and writes metadata using JSON in metafiles stored next to the data.
Constructor Summary |
---|
JsonMetadata() |
JsonMetadata(String schemaFileName, String headerFileName, String statFileName) |
Method Summary
Modifier and Type | Method | Description |
---|---|---|
protected Set<ElementDescriptor> | findMetaFile(String path, String metaname, org.apache.hadoop.conf.Configuration conf) | Finds the metadata file for each object represented by the path. |
String[] | getPartitionKeys(String location, org.apache.hadoop.mapreduce.Job job) | Find what columns are partition keys for this input. |
ResourceSchema | getSchema(String location, org.apache.hadoop.mapreduce.Job job) | For JsonMetadata, the schema is considered optional. This method suppresses (and logs) errors if they are encountered. |
ResourceSchema | getSchema(String location, org.apache.hadoop.mapreduce.Job job, boolean isSchemaOn) | Read the schema from the JSON metadata file. If isSchemaOn is false, errors are suppressed and logged. |
ResourceStatistics | getStatistics(String location, org.apache.hadoop.mapreduce.Job job) | For JsonMetadata, stats are considered optional. This method suppresses (and logs) errors if they are encountered. |
void | setFieldDel(byte fieldDel) | |
void | setPartitionFilter(Expression partitionFilter) | Set the filter for partitioning. |
void | setRecordDel(byte recordDel) | |
void | storeSchema(ResourceSchema schema, String location, org.apache.hadoop.mapreduce.Job job) | Store schema of the data being written. |
void | storeStatistics(ResourceStatistics stats, String location, org.apache.hadoop.mapreduce.Job job) | Store statistics about the data being written. |
Methods inherited from class java.lang.Object |
---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
Constructor Detail |
---|
public JsonMetadata()
public JsonMetadata(String schemaFileName, String headerFileName, String statFileName)
Method Detail |
---|
protected Set<ElementDescriptor> findMetaFile(String path, String metaname, org.apache.hadoop.conf.Configuration conf) throws IOException
For each object represented by the path (either directly or via a glob):
- If the object is a directory and path/metaname exists, use that as the metadata file.
- Otherwise, if parentPath/metaname exists, use that as the metadata file.
This method does not resolve conflicts or merge metadata; downstream code should take care of that.
Parameters:
path - Path, as passed in to a LoadFunc (may be a Hadoop glob)
metaname - Metadata file designation, such as .pig_schema or .pig_stats
conf - configuration object
Throws:
IOException
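The lookup rule above can be sketched in a few lines. This is only an illustration under stated assumptions, not Pig's implementation: it uses java.nio.file on the local filesystem instead of the Hadoop FileSystem API, it handles a single already-resolved object rather than a glob, and the class and method names (MetaFileLookupSketch, locateMetaFile) are hypothetical. It also reads the rule literally, so a directory without its own metafile falls back to its parent.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Optional;

public class MetaFileLookupSketch {

    /**
     * Hypothetical helper: applies the documented lookup rule to one object.
     * Returns the metafile to use, or empty if neither candidate exists.
     */
    static Optional<Path> locateMetaFile(Path object, String metaname) {
        // Rule 1: a directory that contains the metafile uses its own copy.
        if (Files.isDirectory(object)) {
            Path inside = object.resolve(metaname);
            if (Files.exists(inside)) {
                return Optional.of(inside);
            }
        }
        // Rule 2: otherwise look for the metafile in the parent directory.
        // (Assumption: this fallback also applies to directories that failed rule 1.)
        Path parent = object.toAbsolutePath().getParent();
        if (parent != null) {
            Path sibling = parent.resolve(metaname);
            if (Files.exists(sibling)) {
                return Optional.of(sibling);
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) throws IOException {
        // Build a tiny layout: <tmp>/part-00000 and <tmp>/.pig_schema
        Path dir = Files.createTempDirectory("data");
        Path part = Files.createFile(dir.resolve("part-00000"));
        Files.createFile(dir.resolve(".pig_schema"));

        // The directory finds its own metafile; the part file finds it via its parent.
        System.out.println(locateMetaFile(dir, ".pig_schema").isPresent());
        System.out.println(locateMetaFile(part, ".pig_schema").isPresent());
        // No .pig_stats file was written, so nothing is found for the part file.
        System.out.println(locateMetaFile(part, ".pig_stats").isPresent());
    }
}
```

Because the sketch returns at most one file per object, a caller expanding a glob would collect the results into a set, which matches the Set<ElementDescriptor> return type of findMetaFile.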
public String[] getPartitionKeys(String location, org.apache.hadoop.mapreduce.Job job)
Description copied from interface: LoadMetadata
Find what columns are partition keys for this input.
Specified by:
getPartitionKeys in interface LoadMetadata
Parameters:
location - Location as returned by LoadFunc.relativeToAbsolutePath(String, org.apache.hadoop.fs.Path)
job - The Job object. This should be used only to obtain cluster properties through JobContext.getConfiguration() and not to set/query any runtime job information.
public void setPartitionFilter(Expression partitionFilter) throws IOException
Description copied from interface: LoadMetadata
Set the filter for partitioning. If the implementation returns null in LoadMetadata.getPartitionKeys(String, Job), then this method is not called by the Pig runtime. This method is also not called by the Pig runtime if there are no partition filter conditions.
Specified by:
setPartitionFilter in interface LoadMetadata
Parameters:
partitionFilter - Expression that describes the filter for partitioning
Throws:
IOException - if the filter is not compatible with the storage mechanism or contains non-partition fields.
public ResourceSchema getSchema(String location, org.apache.hadoop.mapreduce.Job job) throws IOException
For JsonMetadata, the schema is considered optional. This method suppresses (and logs) errors if they are encountered.
Specified by:
getSchema in interface LoadMetadata
Parameters:
location - Location as returned by LoadFunc.relativeToAbsolutePath(String, org.apache.hadoop.fs.Path)
job - The Job object. This should be used only to obtain cluster properties through JobContext.getConfiguration() and not to set/query any runtime job information.
Throws:
IOException - if an exception occurs while determining the schema
public ResourceSchema getSchema(String location, org.apache.hadoop.mapreduce.Job job, boolean isSchemaOn) throws IOException
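Read the schema from the JSON metadata file. If isSchemaOn is false, errors are suppressed and logged.
Parameters:
location -
job -
isSchemaOn -
Throws:
IOException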
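The "suppress and log" behaviour controlled by isSchemaOn can be illustrated with a small stand-alone sketch. Everything here is an assumption made for illustration: the SchemaReader interface, the String return type, the use of java.util.logging, returning null after a suppressed error, and rethrowing when isSchemaOn is true are not taken from the Pig source.

```java
import java.io.IOException;
import java.util.logging.Level;
import java.util.logging.Logger;

public class OptionalSchemaSketch {

    private static final Logger LOG =
            Logger.getLogger(OptionalSchemaSketch.class.getName());

    /** Hypothetical stand-in for the code that reads and parses the schema metafile. */
    interface SchemaReader {
        String read() throws IOException;
    }

    /**
     * Returns the schema, or handles a read failure according to isSchemaOn:
     * true  -> the schema is required, so the error propagates (assumption);
     * false -> the schema is optional, so the error is logged and null is returned (assumption).
     */
    static String getSchema(SchemaReader reader, boolean isSchemaOn) throws IOException {
        try {
            return reader.read();
        } catch (IOException e) {
            if (isSchemaOn) {
                throw e;
            }
            LOG.log(Level.WARNING, "Could not read schema; continuing without one", e);
            return null;
        }
    }

    public static void main(String[] args) {
        SchemaReader broken = () -> {
            throw new IOException("malformed .pig_schema");
        };
        try {
            // Optional mode: the failure is logged and the load continues without a schema.
            System.out.println("optional: " + getSchema(broken, false));
            // Strict mode: the same failure reaches the caller.
            getSchema(broken, true);
        } catch (IOException e) {
            System.out.println("strict: " + e.getMessage());
        }
    }
}
```

Under these assumptions, the two-argument getSchema(location, job) would correspond to calling the sketch with isSchemaOn set to false.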
public ResourceStatistics getStatistics(String location, org.apache.hadoop.mapreduce.Job job) throws IOException
For JsonMetadata, stats are considered optional. This method suppresses (and logs) errors if they are encountered.
Specified by:
getStatistics in interface LoadMetadata
Parameters:
location - Location as returned by LoadFunc.relativeToAbsolutePath(String, org.apache.hadoop.fs.Path)
job - The Job object. This should be used only to obtain cluster properties through JobContext.getConfiguration() and not to set/query any runtime job information.
Throws:
IOException - if an exception occurs while retrieving statistics
See Also:
LoadMetadata.getStatistics(String, Job)
public void storeStatistics(ResourceStatistics stats, String location, org.apache.hadoop.mapreduce.Job job) throws IOException
Description copied from interface: StoreMetadata
Store statistics about the data being written.
Specified by:
storeStatistics in interface StoreMetadata
Parameters:
stats - statistics to be recorded
location - Location as returned by LoadFunc.relativeToAbsolutePath(String, org.apache.hadoop.fs.Path)
job - The Job object. This should be used only to obtain cluster properties through JobContext.getConfiguration() and not to set/query any runtime job information.
Throws:
IOException
public void storeSchema(ResourceSchema schema, String location, org.apache.hadoop.mapreduce.Job job) throws IOException
Description copied from interface: StoreMetadata
Store schema of the data being written.
Specified by:
storeSchema in interface StoreMetadata
Parameters:
schema - Schema to be recorded
location - Location as returned by LoadFunc.relativeToAbsolutePath(String, org.apache.hadoop.fs.Path)
job - The Job object. This should be used only to obtain cluster properties through JobContext.getConfiguration() and not to set/query any runtime job information.
Throws:
IOException
public void setFieldDel(byte fieldDel)
public void setRecordDel(byte recordDel)