
How to sum the total of all categorical variables across several columns in Pandas

I have a dataset that looks something like this:

id AttA AttB AttC
1   Y         Y
2        Y    

I would like to create another column that holds the total number of attributes for each case, as follows:

id AttA AttB AttC TotalAtts
1   Y         Y     2
2        Y          1

It's not obvious to me how I should approach this problem, since I'm fairly new to Pandas.

Thanks in advance.

You can check which cells in the dataframe are not empty with ne(''), then sum across each row by setting axis to 1:

df['TotalAtts'] = df.ne('').sum(1)

   AttA AttB AttC  TotalAtts
0    Y         Y          2
1         Y               1
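The output above has no id column, so id is presumably the index there. If id is a regular column, ne('') would count it as well, since it is never empty. Here is a minimal sketch of one way around that, assuming the blanks are empty strings; the frame and the att_cols list are my reconstruction of the question's data, not something the asker posted:

```python
import pandas as pd

# Hypothetical reconstruction of the question's data; blank cells are empty strings.
df = pd.DataFrame({
    "id": [1, 2],
    "AttA": ["Y", ""],
    "AttB": ["", "Y"],
    "AttC": ["Y", ""],
})

# Restrict the comparison to the attribute columns so that id is not counted.
att_cols = ["AttA", "AttB", "AttC"]
df["TotalAtts"] = df[att_cols].ne("").sum(axis=1)

print(df)
```

Selecting the columns explicitly also keeps TotalAtts itself out of the count if the line is run a second time.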

If you only want the count of Y, you can do (df == 'Y').sum(1). If you want to count non-null values, you can do df.count(1), but note that empty strings are not null, so they will be counted too.
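To see how the three counts differ, here is a small sketch on a made-up frame that mixes 'Y', 'N', empty strings and a real NaN. The data and the variable names are hypothetical and chosen only to make the differences visible:

```python
import numpy as np
import pandas as pd

# Hypothetical data: 'Y'/'N' flags, empty strings, and one genuinely missing value.
atts = pd.DataFrame({
    "AttA": ["Y", "", "N"],
    "AttB": ["", "Y", np.nan],
    "AttC": ["Y", "", "Y"],
})

# Cells that are not the empty string; NaN != '' is True, so NaN is counted here.
non_empty = atts.ne("").sum(axis=1)

# Cells that are exactly 'Y'.
only_y = (atts == "Y").sum(axis=1)

# Cells that are not null; empty strings are not null, so they are counted here.
non_null = atts.count(axis=1)

print(pd.DataFrame({"non_empty": non_empty, "only_y": only_y, "non_null": non_null}))
```

Which one is right depends on how blanks are stored: if the file was read with missing values as NaN rather than '', count(axis=1) is probably the one to use.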
