Coalesce columns in Spark DataFrame
I ran an algorithm and ended up with many columns named with a `logic` prefix and a number suffix. I need to coalesce them, but I don't know how to apply coalesce to a varying number of columns.
Example:
|id| logic_01 | logic_02 | logic_03 |
|1 | null     | a        | null     |
|2 | null     | b        | null     |
|3 | c        | null     | null     |
|4 | null     | null     | d        |
Response:
|id|logic|
|1 | a |
|2 | b |
|3 | c |
|4 | d |
Another example:
|id| logic_01 | logic_02 | logic_03 | logic_04 |
|1 | null     | a        | null     | null     |
|2 | null     | null     | null     | b        |
|3 | c        | null     | null     | null     |
|4 | null     | null     | d        | null     |
Response:
|id|logic|
|1 | a |
|2 | b |
|3 | c |
|4 | d |
Thanks for your help.
First, find all the columns that you want to use in the coalesce:
import org.apache.spark.sql.functions.col

val cols = df.columns.filter(_.startsWith("logic")).map(col(_))
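Since `df.columns` is just an `Array[String]`, the name-filtering step can be checked without a Spark session. A minimal plain-Scala sketch, using a hypothetical column list that mirrors the first example above:

```scala
object FilterColsSketch {
  def main(args: Array[String]): Unit = {
    // Hypothetical result of df.columns for the first example table.
    val allColumns = Array("id", "logic_01", "logic_02", "logic_03")

    // Same predicate as df.columns.filter(_.startsWith("logic")):
    // keep only the columns whose names begin with "logic".
    val logicCols = allColumns.filter(_.startsWith("logic"))

    println(logicCols.mkString(", ")) // logic_01, logic_02, logic_03
  }
}
```

In the real answer these names are then mapped through `col(_)` to obtain `Column` objects, which is what `coalesce` expects.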
Then perform the actual coalesce:
import org.apache.spark.sql.functions.coalesce
import spark.implicits._ // for the $"..." column syntax

df.select($"id", coalesce(cols: _*).as("logic"))
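The key property of `coalesce` is that, per row, it returns the first non-null value across the given columns in order. A plain-Scala sketch of that row-wise semantics (no Spark required), modeling each row of the first example table with `null` encoded as `None`:

```scala
object CoalesceSemanticsSketch {
  def main(args: Array[String]): Unit = {
    // The logic_01..logic_03 values of each row in the first example.
    val rows = Seq(
      Seq(None, Some("a"), None),
      Seq(None, Some("b"), None),
      Seq(Some("c"), None, None),
      Seq(None, None, Some("d"))
    )

    // Like coalesce: take the first defined value in column order.
    val coalesced = rows.map(_.collectFirst { case Some(v) => v }.orNull)

    println(coalesced) // List(a, b, c, d)
  }
}
```

This matches the expected response table: one `logic` column holding a, b, c, d. Because `cols` is built dynamically from `df.columns`, the same `coalesce(cols: _*)` call works regardless of how many `logic_*` columns the DataFrame has.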