public class MapredParquetOutputFormat extends org.apache.hadoop.mapred.FileOutputFormat<org.apache.hadoop.io.NullWritable,ParquetHiveRecord> implements HiveOutputFormat<org.apache.hadoop.io.NullWritable,ParquetHiveRecord>
| Modifier and Type | Field and Description |
|---|---|
| protected org.apache.parquet.hadoop.ParquetOutputFormat<ParquetHiveRecord> | realOutputFormat |
| Constructor and Description |
|---|
| MapredParquetOutputFormat() |
| MapredParquetOutputFormat(org.apache.hadoop.mapreduce.OutputFormat<Void,ParquetHiveRecord> mapreduceOutputFormat) |
| Modifier and Type | Method and Description |
|---|---|
| void | checkOutputSpecs(org.apache.hadoop.fs.FileSystem ignored, org.apache.hadoop.mapred.JobConf job) |
| FileSinkOperator.RecordWriter | getHiveRecordWriter(org.apache.hadoop.mapred.JobConf jobConf, org.apache.hadoop.fs.Path finalOutPath, Class<? extends org.apache.hadoop.io.Writable> valueClass, boolean isCompressed, Properties tableProperties, org.apache.hadoop.util.Progressable progress) Create the parquet schema from the hive schema, and return the RecordWriterWrapper which contains the real output format. |
| protected ParquetRecordWriterWrapper | getParquerRecordWriterWrapper(org.apache.parquet.hadoop.ParquetOutputFormat<ParquetHiveRecord> realOutputFormat, org.apache.hadoop.mapred.JobConf jobConf, String finalOutPath, org.apache.hadoop.util.Progressable progress, Properties tableProperties) |
| org.apache.hadoop.mapred.RecordWriter<org.apache.hadoop.io.NullWritable,ParquetHiveRecord> | getRecordWriter(org.apache.hadoop.fs.FileSystem ignored, org.apache.hadoop.mapred.JobConf job, String name, org.apache.hadoop.util.Progressable progress) |
protected org.apache.parquet.hadoop.ParquetOutputFormat<ParquetHiveRecord> realOutputFormat
public MapredParquetOutputFormat()
public MapredParquetOutputFormat(org.apache.hadoop.mapreduce.OutputFormat<Void,ParquetHiveRecord> mapreduceOutputFormat)
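The one-argument constructor lets a caller supply the underlying mapreduce output format that this adapter delegates to (the no-argument constructor builds one itself and stores it in realOutputFormat). A minimal sketch of that wiring, assuming DataWritableWriteSupport is the WriteSupport implementation used for ParquetHiveRecord; package names are taken from the Hive and Parquet source trees and should be read as assumptions, not guarantees:

```java
import org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat;
import org.apache.hadoop.hive.ql.io.parquet.write.DataWritableWriteSupport;
import org.apache.hadoop.hive.serde2.io.ParquetHiveRecord;
import org.apache.parquet.hadoop.ParquetOutputFormat;

public class CustomWiringSketch {
  public static void main(String[] args) {
    // Wrap Parquet's mapreduce-level output format with Hive's write support
    // (assumed here to be DataWritableWriteSupport, mirroring the default
    // wiring), then hand it to the mapred-level adapter.
    ParquetOutputFormat<ParquetHiveRecord> underlying =
        new ParquetOutputFormat<ParquetHiveRecord>(new DataWritableWriteSupport());
    MapredParquetOutputFormat outputFormat = new MapredParquetOutputFormat(underlying);
  }
}
```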
public void checkOutputSpecs(org.apache.hadoop.fs.FileSystem ignored,
org.apache.hadoop.mapred.JobConf job)
throws IOException
Specified by: checkOutputSpecs in interface org.apache.hadoop.mapred.OutputFormat<org.apache.hadoop.io.NullWritable,ParquetHiveRecord>
Overrides: checkOutputSpecs in class org.apache.hadoop.mapred.FileOutputFormat<org.apache.hadoop.io.NullWritable,ParquetHiveRecord>
Throws: IOException

public org.apache.hadoop.mapred.RecordWriter<org.apache.hadoop.io.NullWritable,ParquetHiveRecord> getRecordWriter(org.apache.hadoop.fs.FileSystem ignored,
        org.apache.hadoop.mapred.JobConf job,
        String name,
        org.apache.hadoop.util.Progressable progress)
        throws IOException
Specified by: getRecordWriter in interface org.apache.hadoop.mapred.OutputFormat<org.apache.hadoop.io.NullWritable,ParquetHiveRecord>
Overrides: getRecordWriter in class org.apache.hadoop.mapred.FileOutputFormat<org.apache.hadoop.io.NullWritable,ParquetHiveRecord>
Parameters:
ignored - Unused parameter
job - JobConf - expecting mandatory parameter PARQUET_HIVE_SCHEMA
name - Path to write to
progress - Progress
Throws: IOException

public FileSinkOperator.RecordWriter getHiveRecordWriter(org.apache.hadoop.mapred.JobConf jobConf,
        org.apache.hadoop.fs.Path finalOutPath,
        Class<? extends org.apache.hadoop.io.Writable> valueClass,
        boolean isCompressed,
        Properties tableProperties,
        org.apache.hadoop.util.Progressable progress)
        throws IOException
Create the parquet schema from the hive schema, and return the RecordWriterWrapper which contains the real output format.
Specified by: getHiveRecordWriter in interface HiveOutputFormat<org.apache.hadoop.io.NullWritable,ParquetHiveRecord>
Parameters:
jobConf - the job configuration file
finalOutPath - the final output file to be created
valueClass - the value class used for create
isCompressed - whether the content is compressed or not
tableProperties - the table properties of this file's corresponding table
progress - progress used for status report
Throws: IOException

protected ParquetRecordWriterWrapper getParquerRecordWriterWrapper(org.apache.parquet.hadoop.ParquetOutputFormat<ParquetHiveRecord> realOutputFormat,
        org.apache.hadoop.mapred.JobConf jobConf,
        String finalOutPath,
        org.apache.hadoop.util.Progressable progress,
        Properties tableProperties)
        throws IOException
Throws: IOException
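For context, a minimal sketch of the typical call path through getHiveRecordWriter, assuming a hypothetical two-column table (id int, name string), a hypothetical output path, and the standard Hive column/type table properties (serdeConstants.LIST_COLUMNS / LIST_COLUMN_TYPES); the package locations and property choices are assumptions made for illustration, not part of this class's documentation:

```java
import java.util.Properties;

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hive.ql.exec.FileSinkOperator;
import org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat;
import org.apache.hadoop.hive.serde.serdeConstants;
import org.apache.hadoop.hive.serde2.io.ParquetHiveRecord;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.Reporter;

public class ParquetHiveWriterSketch {
  public static void main(String[] args) throws Exception {
    JobConf jobConf = new JobConf();

    // Hive column names/types for a hypothetical table; getHiveRecordWriter
    // derives the Parquet schema from these table properties.
    Properties tableProperties = new Properties();
    tableProperties.setProperty(serdeConstants.LIST_COLUMNS, "id,name");
    tableProperties.setProperty(serdeConstants.LIST_COLUMN_TYPES, "int,string");

    MapredParquetOutputFormat outputFormat = new MapredParquetOutputFormat();
    FileSinkOperator.RecordWriter writer = outputFormat.getHiveRecordWriter(
        jobConf,
        new Path("/tmp/example.parquet"),   // hypothetical final output path
        ParquetHiveRecord.class,            // value class
        false,                              // isCompressed
        tableProperties,
        Reporter.NULL);                     // Progressable for status reporting

    // Rows would be written here as ParquetHiveRecord values built from the
    // table's ObjectInspector; record construction is omitted from this sketch.
    // writer.write(record);

    writer.close(false);                    // close(abort)
  }
}
```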