Join two dataframes in pyspark
I have two data frames:
+----+----+
|key1|val1|
+----+----+
|a1 | 1|
|b1 | 2|
+----+----+

+----+----+
|key2|val2|
+----+----+
|a2 | 3|
|b2 | 4|
+----+----+
And then I want to merge these two data frames to get the following data frame:
+----+----+----+----+
|key1|val1|key2|val2|
+----+----+----+----+
|a1 | 1|a2 | 3|
|a1 | 1|b2 | 4|
|b1 | 2|a2 | 3|
|b1 | 2|b2 | 4|
+----+----+----+----+
How can I do this in PySpark?
Try a cross join, as below:
df3 = df1.crossJoin(df2)
df3.show()
This should give the output you want.
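A cross join pairs every row of the first data frame with every row of the second (a Cartesian product), which is why two 2-row inputs give 4 rows. As a rough illustration of that pairing only, here is a plain-Python sketch that needs no Spark; the names rows1, rows2 and crossed are made up for this example, and the values are copied from the tables in the question.

```python
from itertools import product

# Rows copied from the two tables in the question (hypothetical variable names).
rows1 = [("a1", 1), ("b1", 2)]  # columns: key1, val1
rows2 = [("a2", 3), ("b2", 4)]  # columns: key2, val2

# Pair every row of rows1 with every row of rows2 and concatenate the tuples,
# which mirrors what a cross join does to the columns of the two inputs.
crossed = [left + right for left, right in product(rows1, rows2)]

for row in crossed:
    print(row)
```

The result always has len(rows1) * len(rows2) rows, so the same operation on large data frames can get expensive quickly.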