Spark SQL: group by and sum, then change the column name?

Question

In this DataFrame I want to find the total salary for each group. In Oracle I'd use this code:

select job_id,sum(salary) as "Total" from hr.employees group by job_id;

I tried the same in Spark SQL and ran into two issues:

empData.groupBy($"job_id").sum("salary").alias("Total").show()

The alias "Total" is not displayed; the column is shown as "sum(salary)" instead.
I could not use $ (Scala column syntax, I think) inside sum. It fails to compile:
```
  empData.groupBy($"job_id").sum($"salary").alias("Total").show() 
```

Any idea?

Answer 1

Use the aggregate function .agg() if you want to provide an alias. It accepts the Scala column syntax ($"..."):

empData.groupBy($"job_id").agg(sum($"salary") as "Total").show()
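For context on why the original attempts fail: .alias() called on the result of sum("salary") names the whole DataFrame rather than the column, and the sum method on grouped data only takes column names as strings, while sum from org.apache.spark.sql.functions takes a Column. Below is a minimal, self-contained sketch of the .agg() approach. The object name, the helper totalByJob, the local master setting, and the sample rows are assumptions for illustration and are not from the question.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.sum

object TotalSalaryByJob {
  // Hypothetical helper: groups by job_id and names the summed column "Total".
  def totalByJob(df: DataFrame): DataFrame = {
    import df.sparkSession.implicits._ // enables the $"col" syntax
    df.groupBy($"job_id").agg(sum($"salary").as("Total"))
  }

  def main(args: Array[String]): Unit = {
    // Local session for illustration; in spark-shell `spark` already exists.
    val spark = SparkSession.builder()
      .appName("TotalSalaryByJob")
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._ // enables toDF on local collections

    // Made-up rows standing in for hr.employees.
    val empData = Seq(
      ("IT_PROG", 9000L),
      ("IT_PROG", 6000L),
      ("SA_REP", 7000L)
    ).toDF("job_id", "salary")

    totalByJob(empData).show()
    spark.stop()
  }
}
```

Because the alias is attached to the aggregate expression itself, the result has the columns job_id and Total without a separate renaming step.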

If you don't want to use .agg(), the alias can also be provided using .select():

empData.groupBy($"job_id").sum("salary").select($"job_id", $"sum(salary)".alias("Total")).show()
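The .select() version relies on the generated column name "sum(salary)". A related option, not mentioned in the answer, is to rename that generated column with withColumnRenamed. The sketch below assumes the same job_id and salary columns; the object name, the helper totalByJob, and the sample rows are hypothetical.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

object RenameSumColumn {
  // Hypothetical helper: uses the string-based sum shortcut, then renames
  // the generated "sum(salary)" column to "Total".
  def totalByJob(df: DataFrame): DataFrame =
    df.groupBy("job_id")
      .sum("salary")
      .withColumnRenamed("sum(salary)", "Total")

  def main(args: Array[String]): Unit = {
    // Local session for illustration; in spark-shell `spark` already exists.
    val spark = SparkSession.builder()
      .appName("RenameSumColumn")
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._ // enables toDF on local collections

    // Made-up rows standing in for hr.employees.
    val empData = Seq(
      ("IT_PROG", 9000L),
      ("SA_REP", 7000L),
      ("SA_REP", 8000L)
    ).toDF("job_id", "salary")

    totalByJob(empData).show()
    spark.stop()
  }
}
```

Both this and the .select() form depend on knowing the name Spark generates for the aggregate, so .agg() with an alias is usually the more direct choice.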

Answer 1: 2, accepted, 2018-10-11 10:14:51
