
Look for multiple column values in another data frame in Spark/Scala

I have two data frames, A and B.

A has 30 columns: reason1, reason2, ..., reason30.

B has 2 columns: reason, Value.

I need to look up every column of A whose name starts with "reason" in B and fetch the corresponding value into a single column in data frame A.

So the final data frame would have reason1, reason2, ..., reason30, value.

I tried joining each column with the other data frame, but that is not a neat approach.

Please help me find an optimized and fast solution using Spark/Scala.

You can get the Array of column names of a data frame using:

df.columns 

Then you can iterate over the Array, check which of the columns are present, and build the join condition from them.
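As a rough illustration of that idea, here is a minimal sketch. It assumes the goal is a single left join whose condition is true when any reason column of A equals B's reason; the question does not say how several matches in one row should be handled, so that part is a guess. The sample data, the object name ReasonLookup, and the use of broadcast (which assumes B is small) are all hypothetical.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.broadcast

object ReasonLookup {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("reason-lookup")
      .getOrCreate()
    import spark.implicits._

    // Hypothetical sample data: three reason columns stand in for reason1..reason30.
    val a = Seq(("x", "p", "q"), ("r", "y", "s")).toDF("reason1", "reason2", "reason3")
    val b = Seq(("x", "10"), ("y", "20")).toDF("reason", "Value")

    // Pick out the columns of A whose names start with "reason".
    val reasonCols = a.columns.filter(_.startsWith("reason"))

    // One equality test per reason column, combined with OR into a single join condition.
    val joinCond = reasonCols.map(c => a(c) === b("reason")).reduce(_ || _)

    // Left join keeps rows of A that have no match; B's lookup key is dropped afterwards.
    // Assumption: B is small enough to broadcast.
    val result = a.join(broadcast(b), joinCond, "left").drop("reason")

    result.show()
    spark.stop()
  }
}
```

Note that with this condition a row of A whose reason columns match more than one row of B comes back once per match. If only one value per row is wanted, some extra rule (for example, taking the first match) would have to be added on top of this sketch.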
