Coalesce columns in a Spark DataFrame

Question

My algorithm produces many columns named logic with a numeric suffix (logic_01, logic_02, and so on). I need to coalesce them into a single column, but I don't know how to apply coalesce when the number of columns varies.

Example:

|id|logic_01|logic_02|logic_03|
|1 |null    |a       |null    |
|2 |null    |b       |null    |
|3 |c       |null    |null    |
|4 |null    |null    |d       |

Expected result:

|id|logic|
|1 |a    |
|2 |b    |
|3 |c    |
|4 |d    |

Another example:

|id|logic_01|logic_02|logic_03|logic_04|
|1 |null    |a       |null    |null    |
|2 |null    |null    |null    |b       |
|3 |c       |null    |null    |null    |
|4 |null    |null    |d       |null    |

Expected result:

|id|logic|
|1 |a    |
|2 |b    |
|3 |c    |
|4 |d    |

Thanks for your help.

Answer 1

First, find all the columns that you want to use in the coalesce:

import org.apache.spark.sql.functions.{coalesce, col}
val cols = df.columns.filter(_.startsWith("logic")).map(col(_))

Then perform the actual coalesce, expanding the array into varargs with : _* (the $"id" syntax needs import spark.implicits._):

df.select($"id", coalesce(cols: _*).as("logic"))
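
The two steps above can be put together as a small self-contained sketch. The object name CoalesceLogicColumns, the helper coalesceLogic, the "logic_" prefix, the local SparkSession settings, and the sample data are assumptions made for illustration; they are not part of the original answer. The columns are sorted by name so that the priority order is explicit, since coalesce returns the first non-null value among its arguments.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{coalesce, col}

object CoalesceLogicColumns {

  // Hypothetical helper: collapses every column whose name starts with
  // `prefix` into a single column named "logic", keeping "id".
  def coalesceLogic(df: DataFrame, prefix: String = "logic_"): DataFrame = {
    // Works for any number of matching columns; sorting by name makes
    // logic_01 take priority over logic_02, and so on.
    val logicCols = df.columns.filter(_.startsWith(prefix)).sorted.map(col)
    // coalesce takes varargs, so the array is expanded with `: _*`.
    df.select(col("id"), coalesce(logicCols: _*).as("logic"))
  }

  def main(args: Array[String]): Unit = {
    // Local session for illustration only; the app name is arbitrary.
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("coalesce-logic-columns")
      .getOrCreate()
    import spark.implicits._

    // Sample data mirroring the first example; None becomes null.
    val df = Seq(
      (1, Option.empty[String], Option("a"), Option.empty[String]),
      (2, Option.empty[String], Option("b"), Option.empty[String]),
      (3, Option("c"), Option.empty[String], Option.empty[String]),
      (4, Option.empty[String], Option.empty[String], Option("d"))
    ).toDF("id", "logic_01", "logic_02", "logic_03")

    coalesceLogic(df).show()
    spark.stop()
  }
}
```

With this sample data the result should match the expected table from the question: one logic column holding a, b, c, d for ids 1 through 4. Note that coalesce needs at least one argument, so this sketch assumes that at least one column matches the prefix.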

Answer 1: 12, accepted, 2018-06-21 03:27:12
