Analyzer
Resolves attributes, relations, UDFs, etc.
For example, take a self-join, df.join(df, df("key") === df("key")). Internally, both the left and right sides of the condition resolve to the same attribute reference. We have to differentiate the two dfs as two totally different DataFrames, which means the outputs of the two dfs must have different sets of AttributeReference. This is solved by dedupRight in the ResolveReferences rule of the Analyzer.
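The deduplication idea can be sketched with a small toy model. This is not Spark's code: the Python classes `AttributeReference` and `Relation` and the function `dedup_right` are hypothetical stand-ins that only borrow Spark's names, and the assumption is that each attribute is identified by a unique expression ID. The right side of the join gets fresh IDs for every attribute that collides with the left side.

```python
import itertools
from dataclasses import dataclass

# Global counter standing in for a unique expression-ID generator (assumption).
_next_id = itertools.count()


@dataclass(frozen=True)
class AttributeReference:
    name: str
    expr_id: int


def new_attribute(name):
    """Create an attribute with a fresh, unique expression ID."""
    return AttributeReference(name, next(_next_id))


@dataclass(frozen=True)
class Relation:
    output: tuple  # tuple of AttributeReference


def dedup_right(left, right):
    """Return a copy of `right` in which every attribute whose ID also
    appears in `left` is replaced by one with the same name and a new ID."""
    left_ids = {a.expr_id for a in left.output}
    new_output = tuple(
        new_attribute(a.name) if a.expr_id in left_ids else a
        for a in right.output
    )
    return Relation(new_output)


# Self-join: both sides start out as the very same relation.
df = Relation((new_attribute("key"), new_attribute("value")))
left = df
right = dedup_right(left, df)
```

After the call, `left` and `right` expose the same column names but disjoint expression IDs, so a reference to the left "key" can no longer be confused with the right "key".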
In addition, it rewrites the logical plan when necessary. For example, it extracts generators and window expressions to form a new logical plan.
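As a rough illustration of this kind of rewrite, the toy model below pulls window expressions out of a projection and places them in a dedicated plan node underneath it. All names here (`Column`, `WindowExpr`, `Scan`, `Window`, `Project`, `extract_window_expressions`) are hypothetical and written in Python for brevity; the real Analyzer rule handles many more cases.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Column:
    name: str


@dataclass(frozen=True)
class WindowExpr:
    func: str
    partition_by: str
    alias: str


@dataclass(frozen=True)
class Scan:
    table: str


@dataclass(frozen=True)
class Window:
    window_exprs: tuple
    child: object


@dataclass(frozen=True)
class Project:
    exprs: tuple
    child: object


def extract_window_expressions(plan):
    """Rewrite Project(exprs, child) into Project(exprs', Window(w, child)),
    where w holds the window expressions and exprs' refers to them by alias."""
    if not isinstance(plan, Project):
        return plan
    windows = tuple(e for e in plan.exprs if isinstance(e, WindowExpr))
    if not windows:
        return plan  # nothing to extract
    rewritten = tuple(
        Column(e.alias) if isinstance(e, WindowExpr) else e
        for e in plan.exprs
    )
    return Project(rewritten, Window(windows, plan.child))


plan = Project((Column("key"), WindowExpr("rank", "key", "rnk")), Scan("t"))
new_plan = extract_window_expressions(plan)
```

The projection in `new_plan` now contains only plain column references, and the window computation lives in its own node between the projection and the scan.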