How to sum columns and then sort them in pandas (Python)
Below is a sample DataFrame. The data is in a .csv file.
  EPISODE_Number EPISODE_TITLE  object1  object2  object3  object4  object5
0         S01E01             A        1        1        0        0        0
1         S01E02             B        0        0        0        1        0
2         S01E03             C        1        1        0        0        1
3         S01E04             D        0        1        1        1        0
4         S01E05             E        1        0        0        1        0
5         S01E06             F        1        1        0        1        1
6         S01E07             G        0        0        0        1        1
7         S01E08             H        0        1        0        0        0
8         S01E09             I        1        1        0        1        1
9         S01E10             J        0        1        1        0        0
I would like to get the sum of each object column and then sort the objects from largest to smallest (top 10 only).
Below is my code so far:
import pandas as pd
data = pd.read_csv("TV_show.csv")
sume_s = data[data.sum(0).sort_values(ascending=False)[2:6].index]
The output should look like this:
object2: 7
object4: 6
object1: 5
object5: 4
object3: 2
But I'm getting the following error:
indexer = non_nan_idx[non_nans.argsort(kind=kind)]
TypeError: '>' not supported between instances of 'numpy.ndarray' and 'str'
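The error most likely arises because data.sum(0) also sums the two string columns (concatenating their values), so sort_values ends up comparing strings with numbers.
Use DataFrame.iloc to select only the object columns before calling sum, and for the top 10 add Series.nlargest: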
sume_a = data.iloc[:, 2:7].sum().nlargest(10)
print (sume_a)
object2 7
object4 6
object1 5
object5 4
object3 2
dtype: int64
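To get the exact "name: total" layout asked for in the question, the resulting Series can be iterated with Series.items. The following is a minimal sketch, not the original poster's code: it inlines the sample rows through io.StringIO so it runs without TV_show.csv, and the comma-separated layout and the names csv_text, totals and lines are assumptions made for illustration.

```python
import io

import pandas as pd

# Sample rows from the question, inlined so the sketch runs without TV_show.csv.
# The comma-separated layout is an assumption about the real file.
csv_text = """EPISODE_Number,EPISODE_TITLE,object1,object2,object3,object4,object5
S01E01,A,1,1,0,0,0
S01E02,B,0,0,0,1,0
S01E03,C,1,1,0,0,1
S01E04,D,0,1,1,1,0
S01E05,E,1,0,0,1,0
S01E06,F,1,1,0,1,1
S01E07,G,0,0,0,1,1
S01E08,H,0,1,0,0,0
S01E09,I,1,1,0,1,1
S01E10,J,0,1,1,0,0
"""
data = pd.read_csv(io.StringIO(csv_text))

# Columns 2..6 are object1..object5; sum each and keep the 10 largest totals.
totals = data.iloc[:, 2:7].sum().nlargest(10)

# Format each (column name, total) pair as "name: total".
lines = [f"{name}: {total}" for name, total in totals.items()]
print("\n".join(lines))
```

With only five object columns, nlargest(10) simply returns all five in descending order.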
The solution from the comments works the same way:
sume_a = data.iloc[:, 2:7].sum().sort_values(ascending=False).head(10)
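If the position of the object columns is not fixed, one possible alternative is to select columns by dtype rather than by position. The sketch below assumes that every numeric column in the file is an object column that should be summed; a numeric ID column, for example, would be included too. The DataFrame is built in memory from the sample rows, and the names numeric_totals and top are hypothetical.

```python
import pandas as pd

# Sample data from the question, built in memory instead of read from TV_show.csv.
data = pd.DataFrame({
    "EPISODE_Number": [f"S01E{i:02d}" for i in range(1, 11)],
    "EPISODE_TITLE": list("ABCDEFGHIJ"),
    "object1": [1, 0, 1, 0, 1, 1, 0, 0, 1, 0],
    "object2": [1, 0, 1, 1, 0, 1, 0, 1, 1, 1],
    "object3": [0, 0, 0, 1, 0, 0, 0, 0, 0, 1],
    "object4": [0, 1, 0, 1, 1, 1, 1, 0, 1, 0],
    "object5": [0, 0, 1, 0, 0, 1, 1, 0, 1, 0],
})

# Keep only numeric columns so the string columns never reach sum().
numeric_totals = data.select_dtypes(include="number").sum()

# Sort from largest to smallest and keep at most the first 10 entries.
top = numeric_totals.sort_values(ascending=False).head(10)
print(top)
```

Selecting by dtype avoids hard-coding 2:7, at the cost of picking up any unrelated numeric column that may be present.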