Look for multiple column values in another data frame in Spark/Scala
I have two data frames, A and B.
A has 30 columns: reason1, reason2, ..., reason30
B has 2 columns: reason, Value
Now I need to look up all the columns starting with reason* in B and fetch the corresponding value into one column of data frame A.
So the final data frame would have reason1, reason2, ..., reason30, value
I tried joining each column with the other data frame, but that is not a neat approach.
Please help me find an optimized and fast solution using Spark/Scala.
You can get the array of columns of a dataframe using
df.columns
Then you can iterate over that array, check which columns are relevant, and build the join condition.
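A minimal sketch of that idea, assuming Spark SQL is on the classpath. The sample data and the two-column version of A are hypothetical stand-ins for your 30 reason columns; the pattern is the same: filter `df.columns` for names starting with `reason`, fold the per-column equality checks into one OR'd join condition, and left-join B onto A.

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder
  .appName("reason-lookup")
  .master("local[*]")
  .getOrCreate()
import spark.implicits._

// Hypothetical sample data standing in for A (reason1..reason30) and B (reason, Value)
val dfA = Seq(("r1", "r2"), ("r3", "r4")).toDF("reason1", "reason2")
val dfB = Seq(("r1", 10), ("r4", 40)).toDF("reason", "Value")

// Collect all of A's columns whose names start with "reason"
val reasonCols = dfA.columns.filter(_.startsWith("reason"))

// Build a single join condition: B.reason matches ANY of the reason* columns
val joinCond = reasonCols
  .map(c => dfA(c) === dfB("reason"))
  .reduce(_ || _)

// Left join keeps every row of A and pulls in the matching Value;
// drop B's lookup key so the result is reason1..reasonN plus Value
val result = dfA.join(dfB, joinCond, "left").drop("reason")
result.show()
```

One caveat with this approach: if a single row of A matches several distinct entries in B (via different reason columns), that row is duplicated in the join output, so you may want to deduplicate or aggregate afterwards depending on your data.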