My goal is to join three tables held in PySpark DataFrames: TableA, TableB and TableC.
All three have an ID column that serves as the join key.
I want to join the three tables and create a new PySpark DataFrame.
Do you have any suggestions?
You can simply join them as below:
final_table = (tableA.join(tableB, on=[tableA.ID == tableB.ID], how='inner')
                     .join(tableC, on=[tableA.ID == tableC.ID], how='inner'))