Combining Multiple PySpark data frames with Different Number of Columns [duplicate]

  apache-spark-sql, pyspark, python

Suppose you have 5 PySpark data frames with different numbers of columns. For example, suppose that:

a1 = [a,b,c,d,f,g,1,2,3]
a2 = [a,b,c,d,f,g,4,5,6,7]
a3 = [a,b,c,d,f,g,8,9,10,11,12]
a4 = [a,b,c,d,f,g,13,14]
a5 = [a,b,c,d,f,g,15]

where each entry inside the brackets is a column name of data frame ai. What is an easy way to merge all of these data frames by column, so that the result keeps the shared columns and adds every extra column?
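
One possible reading of the question is that the rows of all five data frames should be stacked, with any column a frame lacks filled with nulls. A minimal sketch under that assumption follows. It relies on `DataFrame.unionByName` with `allowMissingColumns=True`, which needs Spark 3.1 or later. The helper names `union_all` and `demo`, the toy rows, and the shortened column names (`c1`, `c4`, `c8`) are hypothetical and only for illustration.

```python
from functools import reduce


def union_all(dfs):
    """Stack DataFrames by column name; columns a frame lacks become nulls."""
    dfs = list(dfs)
    if not dfs:
        raise ValueError("need at least one DataFrame")
    # unionByName matches columns by name rather than by position.
    # allowMissingColumns=True (Spark 3.1+) adds columns missing on
    # either side and fills them with nulls.
    return reduce(
        lambda left, right: left.unionByName(right, allowMissingColumns=True),
        dfs,
    )


def demo():
    # Hypothetical toy data; running this needs pyspark and a Java runtime.
    from pyspark.sql import SparkSession

    spark = (
        SparkSession.builder.master("local[1]").appName("union-demo").getOrCreate()
    )
    a1 = spark.createDataFrame([("x", "y", 1)], ["a", "b", "c1"])
    a2 = spark.createDataFrame([("x", "y", 4)], ["a", "b", "c4"])
    a3 = spark.createDataFrame([("x", "y", 8)], ["a", "b", "c8"])

    merged = union_all([a1, a2, a3])
    merged.show()  # one row per input row, nulls where a column was absent
    spark.stop()
```

Calling `demo()` runs the toy example. If the goal is instead one row per combination of the shared columns a, b, c, d, f, g, with the extra columns placed side by side, a chain of joins on those shared columns (for example `how="outer"`) would be the alternative to a union.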

Source: Python Questions
