How to append one dataframe into another dataframe as a column
Maybe my issue description is wrong, but it is what is described below. I need a solution in PySpark.
I have 2 data frames:
Df1
A B C
1 2 3
5 6 7
8 9 1
6 2 3
Df2
D E
a b
c d
e f
I want the final dataframe as below:
A B C D E
1 2 3 a b
1 2 3 c d
1 2 3 e f
5 6 7 a b
5 6 7 c d
5 6 7 e f
8 9 1 a b
8 9 1 c d
8 9 1 e f
6 2 3 a b
6 2 3 c d
6 2 3 e f
Basically, in the new dataframe each row of Df1 will be repeated for each row of Df2. The final count would be:
count(Df1) * count(Df2)
Please help, I am new to PySpark.
You can use crossJoin, which is df.crossJoin(df2). You can check this: https://spark.apache.org/docs/3.1.1/api/python/reference/api/pyspark.sql.DataFrame.crossJoin.html .