Merging two columns in SparkR

Question

What is a simple way to merge two columns in SparkR? Consider the following Spark DataFrame:

salary_from  salary_to  position
1500         null       a
null         1300       b
800          1000       c

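I want to combine the two salary columns into a single salary column using this logic: take whichever of salary_from and salary_to is not null, and if both are present, take the midpoint of the two.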

salary_from  salary_to  position  salary
1500         null       a         1500
null         1300       b         1300
800          1000       c         900

Is there a way to iterate over each row and apply my logic, as I would with the apply function in R?

Answer 1

You can use the coalesce function:

withColumn(
  sdf, "salary",
  expr("coalesce((salary_from + salary_to) / 2, salary_from, salary_to)")
)

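coalesce returns the first non-null expression among its arguments. Because arithmetic on a null value yields null, (salary_from + salary_to) / 2 is non-null only when both columns are present; otherwise coalesce falls through to salary_from and then to salary_to. The expression is evaluated per row by Spark, so no explicit row iteration is needed.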
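The fall-through behaviour can be checked without a Spark session. The following is a minimal base R sketch, assuming that NA is an acceptable stand-in for SQL null; coalesce_na is a hypothetical helper written for this illustration and is not part of SparkR.

```r
# Base R sketch (no Spark required): NA plays the role of SQL NULL here.
salary_from <- c(1500, NA, 800)
salary_to   <- c(NA, 1300, 1000)

# Hypothetical helper: element-wise, keep the first non-NA value
# across the supplied vectors, mimicking SQL coalesce.
coalesce_na <- function(...) {
  Reduce(function(x, y) ifelse(is.na(x), y, x), list(...))
}

# The midpoint is NA whenever either input is NA,
# so coalesce_na falls through to the individual columns.
midpoint <- (salary_from + salary_to) / 2
salary <- coalesce_na(midpoint, salary_from, salary_to)

print(salary)  # values: 1500 1300 900
```

The three resulting values match the salary column in the expected output from the question.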

Answer 1
1 · accepted · 2016-04-04 15:38:00

在SparkR中合並兩列

問題描述

1 個解決方案

解決方案1 1 已采納 2016-04-04 15:38:00

解決方案1
1 已采納 2016-04-04 15:38:00