public class HybridHashTableContainer extends Object implements MapJoinTableContainer, MapJoinTableContainerDirectAccess
| Modifier and Type | Class and Description |
|---|---|
| static class | HybridHashTableContainer.HashPartition: This class encapsulates the triplet together since they are closely related to each other. The triplet: hashmap (either in memory or on disk), small table container, big table container. |

Nested classes/interfaces inherited from interface MapJoinTableContainer: MapJoinTableContainer.ReusableGetAdaptor

| Constructor and Description |
|---|
| HybridHashTableContainer(org.apache.hadoop.conf.Configuration hconf, long keyCount, long memoryAvailable, long estimatedTableSize, HybridHashTableConf nwayConf) |
| Modifier and Type | Method and Description |
|---|---|
| static int | calcNumPartitions(long memoryThreshold, long dataSize, int minNumParts, int minWbSize): Calculate how many partitions are needed. |
| void | clear(): Clears the contents of the table. |
| MapJoinTableContainer.ReusableGetAdaptor | createGetter(MapJoinKey keyTypeFromLoader): Creates reusable get adaptor that can be used to retrieve rows from the table based on either vectorized or non-vectorized input rows to MapJoinOperator. |
| void | dumpMetrics() |
| void | dumpStats() |
| MapJoinKey | getAnyKey() |
| long | getEstimatedMemorySize(): Returns estimated memory size based on JavaDataModel. |
| HybridHashTableContainer.HashPartition[] | getHashPartitions() |
| LazyBinaryStructObjectInspector | getInternalValueOi() |
| long | getMemoryThreshold() |
| byte[] | getNotNullMarkers() |
| byte[] | getNullMarkers() |
| int | getNumPartitions() |
| boolean[] | getSortableSortOrders() |
| long | getTableRowSize() |
| int | getToSpillPartitionId(): Gets the ID of the partition into which to spill the big table row. |
| int | getTotalInMemRowCount() |
| MapJoinBytesTableContainer.KeyValueHelper | getWriteHelper() |
| boolean | hasSpill(): Checks if the container has spilled any data onto disk. |
| boolean | isHashMapSpilledOnCreation(int partitionId): Check if the hash table of a specified partition has been "spilled" to disk when it was created. |
| boolean | isOnDisk(int partitionId): Check if the hash table of a specified partition is on disk (or "spilled" on creation). |
| void | put(org.apache.hadoop.io.Writable currentKey, org.apache.hadoop.io.Writable currentValue) |
| MapJoinKey | putRow(org.apache.hadoop.io.Writable currentKey, org.apache.hadoop.io.Writable currentValue): Adds row from input to the table. |
| void | seal(): Indicates to the container that the puts have ended; table is now read-only. |
| void | setSerde(MapJoinObjectSerDeContext keyCtx, MapJoinObjectSerDeContext valCtx) |
| void | setSpill(boolean isSpilled) |
| void | setTotalInMemRowCount(int totalInMemRowCount) |
| int | size(): Return the size of the hash table. |
| long | spillPartition(int partitionId): Move the hashtable of a specified partition from memory into local file system. |
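
The spill-related methods listed above (isOnDisk, spillPartition, hasSpill) suggest a table split into hash partitions, each of which lives either in memory or on disk. The following is a minimal, self-contained sketch of that idea and not Hive's implementation: ToyHybridTable, ToyPartition, partitionFor and inMemoryRowCount are hypothetical names, keys and values are plain strings, the "disk" is just a list, and having the toy spillPartition return the number of rows moved is an assumption made for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model for illustration only; none of these names exist in Hive.
class ToyHybridTable {

    // A partition holds its rows either in an in-memory map or in a
    // "spill" list that stands in for a file on local disk.
    static final class ToyPartition {
        final Map<String, String> inMemory = new HashMap<>();
        List<String[]> spilledRows = null; // non-null once the partition is on disk

        boolean isOnDisk() {
            return spilledRows != null;
        }
    }

    private final ToyPartition[] partitions;
    private boolean spilled = false;

    ToyHybridTable(int numPartitions) {
        partitions = new ToyPartition[numPartitions];
        for (int i = 0; i < numPartitions; i++) {
            partitions[i] = new ToyPartition();
        }
    }

    // Route a key to a partition by its hash code (hypothetical scheme).
    int partitionFor(String key) {
        return Math.floorMod(key.hashCode(), partitions.length);
    }

    // Rows for an in-memory partition go to its map; rows for a spilled
    // partition are appended to its on-disk stand-in.
    void put(String key, String value) {
        ToyPartition p = partitions[partitionFor(key)];
        if (p.isOnDisk()) {
            p.spilledRows.add(new String[] {key, value});
        } else {
            p.inMemory.put(key, value);
        }
    }

    // Move one partition out of memory; returns how many rows were moved.
    long spillPartition(int partitionId) {
        ToyPartition p = partitions[partitionId];
        if (p.isOnDisk()) {
            return 0; // already spilled, nothing to move
        }
        p.spilledRows = new ArrayList<>();
        for (Map.Entry<String, String> e : p.inMemory.entrySet()) {
            p.spilledRows.add(new String[] {e.getKey(), e.getValue()});
        }
        p.inMemory.clear();
        spilled = true;
        return p.spilledRows.size();
    }

    boolean isOnDisk(int partitionId) {
        return partitions[partitionId].isOnDisk();
    }

    boolean hasSpill() {
        return spilled;
    }

    // Count only the rows still held in memory across all partitions.
    int inMemoryRowCount() {
        int count = 0;
        for (ToyPartition p : partitions) {
            count += p.inMemory.size();
        }
        return count;
    }
}
```

In this toy, a caller would insert rows with put, pick a partition to evict when memory runs low, call spillPartition on it, and later consult isOnDisk before probing that partition. How the real container chooses the partition to spill is not described on this page.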
public HybridHashTableContainer(org.apache.hadoop.conf.Configuration hconf, long keyCount, long memoryAvailable, long estimatedTableSize, HybridHashTableConf nwayConf) throws SerDeException, IOException
Throws: SerDeException, IOException

public long getEstimatedMemorySize()
Description copied from interface: MemoryEstimate
Returns estimated memory size based on JavaDataModel.
Specified by: getEstimatedMemorySize in interface MemoryEstimate

public MapJoinBytesTableContainer.KeyValueHelper getWriteHelper()
public HybridHashTableContainer.HashPartition[] getHashPartitions()
public long getMemoryThreshold()
public LazyBinaryStructObjectInspector getInternalValueOi()
public boolean[] getSortableSortOrders()
public byte[] getNullMarkers()
public byte[] getNotNullMarkers()
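public MapJoinKey putRow(org.apache.hadoop.io.Writable currentKey, org.apache.hadoop.io.Writable currentValue) throws SerDeException, HiveException, IOException
Description copied from interface: MapJoinTableContainer
Adds row from input to the table.
Specified by: putRow in interface MapJoinTableContainer
Throws: SerDeException, HiveException, IOException

public boolean isOnDisk(int partitionId)
Check if the hash table of a specified partition is on disk (or "spilled" on creation).
Parameters: partitionId - partition number

public boolean isHashMapSpilledOnCreation(int partitionId)
Check if the hash table of a specified partition has been "spilled" to disk when it was created.
Parameters: partitionId - hashMap ID

public long spillPartition(int partitionId) throws IOException
Move the hashtable of a specified partition from memory into local file system.
Parameters: partitionId - the hashtable to be moved
Throws: IOException

public static int calcNumPartitions(long memoryThreshold, long dataSize, int minNumParts, int minWbSize) throws IOException
Calculate how many partitions are needed.
Parameters:
memoryThreshold - memory threshold for the given table
dataSize - total data size for the table
minNumParts - minimum required number of partitions
minWbSize - minimum required write buffer size
Throws: IOException

public int getNumPartitions()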
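
This page documents the four inputs to calcNumPartitions but not the rule that combines them. The sketch below shows one plausible heuristic with the same parameter list, purely as an illustration: start from minNumParts and keep doubling while a partition's share of the data exceeds the memory threshold, as long as every partition can still be given a write buffer of minWbSize within that threshold. ToyPartitionPlanner and planNumPartitions are hypothetical names, and the doubling rule is an assumption, not Hive's documented algorithm.

```java
// Hypothetical heuristic for illustration; not Hive's actual calcNumPartitions.
class ToyPartitionPlanner {

    static int planNumPartitions(long memoryThreshold, long dataSize,
                                 int minNumParts, int minWbSize) {
        if (memoryThreshold <= 0 || dataSize < 0 || minNumParts <= 0 || minWbSize <= 0) {
            throw new IllegalArgumentException("sizes and counts must be positive");
        }
        int numPartitions = minNumParts;
        // Double the partition count while:
        //  1. one partition's expected share of the data exceeds the threshold,
        //  2. the doubled count still leaves room for one minimum-size write
        //     buffer per partition inside the threshold, and
        //  3. doubling would not overflow an int.
        while (dataSize / numPartitions > memoryThreshold
                && (long) numPartitions * 2 * minWbSize <= memoryThreshold
                && numPartitions <= Integer.MAX_VALUE / 2) {
            numPartitions *= 2;
        }
        return numPartitions;
    }
}
```

With these assumptions, a small table that already fits under the threshold keeps the minimum partition count, while a table much larger than the threshold gets more partitions until either each share fits or the write buffers alone would exhaust the threshold.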
public int getTotalInMemRowCount()
public void setTotalInMemRowCount(int totalInMemRowCount)
public long getTableRowSize()
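public boolean hasSpill()
Description copied from interface: MapJoinTableContainer
Checks if the container has spilled any data onto disk.
Specified by: hasSpill in interface MapJoinTableContainer

public void setSpill(boolean isSpilled)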
public int getToSpillPartitionId()
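public void clear()
Description copied from interface: MapJoinTableContainer
Clears the contents of the table.
Specified by: clear in interface MapJoinTableContainer

public MapJoinKey getAnyKey()
Specified by: getAnyKey in interface MapJoinTableContainer

public MapJoinTableContainer.ReusableGetAdaptor createGetter(MapJoinKey keyTypeFromLoader)
Description copied from interface: MapJoinTableContainer
Creates reusable get adaptor that can be used to retrieve rows from the table based on either vectorized or non-vectorized input rows to MapJoinOperator.
Specified by: createGetter in interface MapJoinTableContainer
Parameters: keyTypeFromLoader - Last key from hash table loader, to determine key type used when loading hashtable (if it can vary).

public void seal()
Description copied from interface: MapJoinTableContainer
Indicates to the container that the puts have ended; table is now read-only.
Specified by: seal in interface MapJoinTableContainer

public void put(org.apache.hadoop.io.Writable currentKey, org.apache.hadoop.io.Writable currentValue) throws SerDeException, IOException
Specified by: put in interface MapJoinTableContainerDirectAccess
Throws: SerDeException, IOException

public void dumpMetrics()
Specified by: dumpMetrics in interface MapJoinTableContainer

public void dumpStats()

public int size()
Description copied from interface: MapJoinTableContainer
Return the size of the hash table.
Specified by: size in interface MapJoinTableContainer

public void setSerde(MapJoinObjectSerDeContext keyCtx, MapJoinObjectSerDeContext valCtx) throws SerDeException
Specified by: setSerde in interface MapJoinTableContainer
Throws: SerDeException

Copyright © 2019 The Apache Software Foundation. All Rights Reserved.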