Sorting a dataframe by value returns "For a multi-index, the label must be a tuple with elements corresponding to each level."

Question

Objective: Starting from a dataframe with 5 columns, return a dataframe with 3 columns, one of which is a count, sorted from the largest count to the smallest.

What I have tried:

df = df[['Country', 'Year','NumInstances']].groupby(['Country', 'Year']).agg(['count'])

df = df.sort_values(by='NumInstances', ascending=False)

print(df)

Error: ValueError: The column label 'NumInstances' is not unique. For a multi-index, the label must be a tuple with elements corresponding to each level.
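For reference, the error can be reproduced with a small made-up dataframe. This is only a sketch: the sample values and the names `raw`, `grouped`, `column_labels` and `error_message` are assumptions, since the real data is not shown. It prints the column labels that `agg(['count'])` produces and the message raised by the sort.

```python
import pandas as pd

# Hypothetical sample data; the real dataframe is not shown in the question.
raw = pd.DataFrame({
    "Country": ["US", "US", "US", "FR", "FR", "DE"],
    "Year": [2019, 2019, 2020, 2019, 2019, 2019],
    "NumInstances": [5, 3, 7, 2, 4, 1],
})

grouped = (
    raw[["Country", "Year", "NumInstances"]]
    .groupby(["Country", "Year"])
    .agg(["count"])
)

# Passing a list to agg() adds a second column level holding the function name.
column_labels = list(grouped.columns)
print(column_labels)

# Sorting by the plain string label fails because it matches only the first level.
try:
    grouped.sort_values(by="NumInstances", ascending=False)
    error_message = None
except ValueError as exc:
    error_message = str(exc)
print(error_message)
```

The printed labels are tuples rather than plain strings, which is what the second sentence of the error message is pointing at.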

Before this gets marked as a duplicate: I have gone through all the other suggested duplicates, and they all seem to suggest the same code I have above.

Is there something small that I am doing incorrectly?

Thanks!

Answer 1

I guess you need to remove the multi-index.

Try this:

df = df[['Country', 'Year','NumInstances']].groupby(['Country', 'Year']).agg(['count']).reset_index()

or:

df = df[['Country', 'Year','NumInstances']].groupby(['Country', 'Year'], as_index=False).agg(['count'])
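One thing worth checking here: reset_index() only changes the row index, while the two-level column labels come from passing a list to agg(). A minimal sketch of a possible variation, assuming made-up sample data (the names `raw`, `with_list` and `flat` are hypothetical), passes the plain string 'count' instead so the columns stay single-level and the original sort call works unchanged.

```python
import pandas as pd

# Hypothetical sample data; the real dataframe is not shown in the question.
raw = pd.DataFrame({
    "Country": ["US", "US", "US", "FR", "FR", "DE"],
    "Year": [2019, 2019, 2020, 2019, 2019, 2019],
    "NumInstances": [5, 3, 7, 2, 4, 1],
})

by_group = raw[["Country", "Year", "NumInstances"]].groupby(["Country", "Year"])

# With a list, the columns keep two levels even after reset_index().
with_list = by_group.agg(["count"]).reset_index()
print(with_list.columns.nlevels)

# With a plain string, the column labels stay flat.
flat = by_group.agg("count").reset_index()
flat = flat.sort_values(by="NumInstances", ascending=False)
print(flat)
```

The result has the three columns Country, Year and NumInstances, with NumInstances holding the per-group count.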

Answer 2

Found the issue. Adding an agg to the NumInstances column turned its column name into the tuple ('NumInstances', 'sum'). The second element is the name of the aggregation, so with the agg(['count']) call shown in the question it would be 'count' instead. I therefore updated the sort code to:

df = df.sort_values(by=('NumInstances', 'sum'), ascending=False)
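A small sketch of the tuple-based sort, assuming made-up sample data and the agg(['count']) call from the question, so the tuple ends in 'count' rather than 'sum'. The names `raw`, `grouped`, `ordered` and `counts` are hypothetical.

```python
import pandas as pd

# Hypothetical sample data; the real dataframe is not shown in the question.
raw = pd.DataFrame({
    "Country": ["US", "US", "US", "FR", "FR", "DE"],
    "Year": [2019, 2019, 2020, 2019, 2019, 2019],
    "NumInstances": [5, 3, 7, 2, 4, 1],
})

grouped = (
    raw[["Country", "Year", "NumInstances"]]
    .groupby(["Country", "Year"])
    .agg(["count"])
)

# The sort key names both column levels: (original column, aggregation name).
ordered = grouped.sort_values(by=("NumInstances", "count"), ascending=False)
counts = ordered[("NumInstances", "count")].tolist()
print(ordered)
```

Printing df.columns before sorting is a quick way to see which tuple to pass for a given aggregation.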

Answer 1
0 2021-04-17 18:31:41

Answer 2
0 2021-04-17 18:58:17
