spark scala dataframe groupBy and orderBy

Question

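I need to count the occurrences of each pair of values in the first and second columns and sort the result by count in descending order. If there is a tie in the counts, the pair with the smaller number in the second column should be listed first.

The following works, except for the tie-breaker part. The first row should be 1, 2, 3, because 2 is less than 4 and the two counts are the same. How do I sort by count descending and then by the second column (_c1) ascending?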

new_df.groupBy($"_c0",$"_c1").count().orderBy($"count".desc).limit(10).show()

+---+---+-----+
|_c0|_c1|count|
+---+---+-----+
|  1|  4|    3|
|  1|  2|    3|
|  4|  1|    2|
|  3|  1|    2|
|  3|  4|    2|
|  2|  1|    2|
|  2|  4|    1|
|  1|  7|    1|
|  7|  2|    1|
|  2|  7|    1|
+---+---+-----+

Answer 1

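Try adding both count (descending) and the second column, _c1 (ascending), to the orderBy clause.

new_df.groupBy($"_c0",$"_c1").count().orderBy($"count".desc, $"_c1".asc).limit(10).show()

List the columns in the order in which you want the rules applied. In the example above, rows are sorted by count first, and _c1 is only used to break ties between rows with the same count.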
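
For anyone who wants to try the two-key sort end to end, here is a minimal self-contained sketch. The sample rows, the local SparkSession settings, and the names PairCounts and topPairs are assumptions made up for illustration; only the groupBy/count/orderBy chain comes from the question.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.col

object PairCounts {
  // Hypothetical helper: count (_c0, _c1) pairs, then sort by
  // count descending and break ties with _c1 ascending.
  def topPairs(df: DataFrame, n: Int = 10): DataFrame =
    df.groupBy(col("_c0"), col("_c1"))
      .count()
      .orderBy(col("count").desc, col("_c1").asc)
      .limit(n)

  def main(args: Array[String]): Unit = {
    // Local session for illustration only.
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("pair-counts")
      .getOrCreate()
    import spark.implicits._

    // Made-up sample data using the question's column names:
    // (1,4) and (1,2) each appear three times, (4,1) twice.
    val new_df = Seq((1, 4), (1, 2), (1, 4), (1, 2), (4, 1), (1, 4), (1, 2), (4, 1))
      .toDF("_c0", "_c1")

    topPairs(new_df).show()
    spark.stop()
  }
}
```

With this sample, the pair (1, 2) should be listed before (1, 4), since both have a count of 3 and 2 is smaller than 4. Rows that tie on both count and _c1 can still come back in any order; adding col("_c0").asc as a third key would make the result fully deterministic.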

Solution 1
1 2019-10-17 23:50:23
