spark dataframe map aggregation with alias?
I like to use the Spark DataFrame map-aggregation syntax, like this:
jaccardDf
.groupBy($"userId")
.agg(
"jaccardDistance"->"avg"
, "jaccardDistance"->"stddev_samp"
, "jaccardDistance"->"skewness"
, "jaccardDistance"->"kurtosis"
)
Is there a way to alias the resulting columns while still using the Map syntax? When I need aliases, I do this instead:
jaccardDf
.groupBy($"userId")
.agg(
avg("jaccardDistance").alias("jaccardAvg")
,stddev_samp("jaccardDistance").alias("jaccardStddev")
,skewness("jaccardDistance").alias("jaccardSkewness")
,kurtosis("jaccardDistance").alias("jaccardKurtosis")
)
Use .toDF() to alias your column names with a list you define:
val colNames = Array("userId", "jaccardAvg", "jaccardStddev", "jaccardSkewness", "jaccardKurtosis")
jaccardDf
.groupBy($"userId")
.agg(
"jaccardDistance"->"avg",
"jaccardDistance"->"stddev_samp",
"jaccardDistance"->"skewness",
"jaccardDistance"->"kurtosis")
.toDF(colNames: _*)
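A minimal end-to-end sketch of the .toDF() approach, assuming a local SparkSession and hypothetical sample data standing in for jaccardDf. Note that the names in colNames must line up positionally with the columns the aggregation produces (the grouping column first, then the aggregates in declaration order):

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .master("local[*]")
  .appName("toDF-alias-example")
  .getOrCreate()
import spark.implicits._

// Hypothetical sample data standing in for jaccardDf
val jaccardDf = Seq(
  ("u1", 0.1), ("u1", 0.3), ("u1", 0.2), ("u1", 0.4),
  ("u2", 0.5), ("u2", 0.7), ("u2", 0.6), ("u2", 0.8)
).toDF("userId", "jaccardDistance")

// Order must match: userId (group key), then the four aggregates
val colNames = Array("userId", "jaccardAvg", "jaccardStddev",
  "jaccardSkewness", "jaccardKurtosis")

val result = jaccardDf
  .groupBy($"userId")
  .agg(
    "jaccardDistance" -> "avg",
    "jaccardDistance" -> "stddev_samp",
    "jaccardDistance" -> "skewness",
    "jaccardDistance" -> "kurtosis")
  .toDF(colNames: _*)

result.show()
spark.stop()
```

Because .toDF(colNames: _*) renames by position, adding or reordering aggregates without updating colNames will silently mislabel columns, so keep the two in sync.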