EnsureRequirements

If the parent SparkPlan operates on two or more child SparkPlan, e.g., Join, this phase is to ensure 1). All its children outputPartitioning have to satisfy the Distribution requirements. If not, a ShuffleExchange will be inserted with new paritioning created by createPartitioning. 1) its children has the same outputPartitioning mechanism, HashPartitioning, RangePartitioning, etc. 2) All the outputParitioning has to compatible with each other, e.g., same number of hash buckets, same range, etc. Otherwise, a new ShuffleExchange has to be inserted for each children (if the child itself is ShuffleExchange, the child will be replaced with the new ShuffleExchange)

The phase not only add ShuffleExchange, but also add Sort if a plan requires requiredChildOrderings

results matching ""

    No results matching ""