public class SparkDynamicPartitionPruningResolver extends Object implements PhysicalPlanResolver
MapWork and target MapWork aren't in
dependent SparkTasks.
When DPP is run, the source MapWork produces a temp file that is read by the target MapWork. The
source MapWork must be run before the target MapWork is run, otherwise the target MapWork
will throw a FileNotFoundException. In order to guarantee this, the source MapWork must be
inside a SparkTask that runs before the SparkTask containing the target MapWork.
This PhysicalPlanResolver works by walking through the Task DAG and iterating over all the
SparkPartitionPruningSinkOperators inside the SparkTask. For each sink operator, it takes the
target MapWork and checks if it exists in any of the child SparkTasks. If the target MapWork
is not in any child SparkTask then it removes the operator subtree that contains the
SparkPartitionPruningSinkOperator.
| Constructor and Description |
|---|
SparkDynamicPartitionPruningResolver() |
| Modifier and Type | Method and Description |
|---|---|
PhysicalContext |
resolve(PhysicalContext pctx)
All physical plan resolvers have to implement this entry method.
|
public SparkDynamicPartitionPruningResolver()
public PhysicalContext resolve(PhysicalContext pctx) throws SemanticException
PhysicalPlanResolverresolve in interface PhysicalPlanResolverSemanticExceptionCopyright © 2019 The Apache Software Foundation. All Rights Reserved.