Counting and grouping data that have the same value in several columns

I am having a hard time with this task: from the data below, I want to count, for each Element, the rows in which all Degree columns are equal to 0. In other words, how many X1 rows have all three degrees equal to 0?

Element | Degree A | Degree B | Degree C
.........................................
X1      |        0 |        0 |        0
X1      |        1 |        1 |        0
X1      |        0 |        0 |        0
X2      |        1 |        0 |        1
X2      |        0 |        0 |        0
X2      |        0 |        0 |        0
X3      |        0 |        0 |        0
X3      |        1 |        1 |        0
X3      |        0 |        1 |        0

This is the desired output:

    Element     All=0 counts
................................
    X1           2
    X2           2
    X3           1

This is what I have tried:

d1 = df.groupby(["Element"])["Degree A"].apply(lambda x: (x==0).sum())

I tried to add the other columns, but it doesn't work:

d2 = df.groupby(["Element"])["Degree A"]&["Degree B"]&["Degree C"].apply(lambda x: (x==0).sum())

One reasonable way to do this is to create another column before the groupby:

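# True where the three degrees sum to 0 (equivalent to "all zero" when no value is negative)
df['check'] = (df['Degree A'] + df['Degree B'] + df['Degree C']) == 0

# Summing a boolean column counts the True rows in each group
df.groupby(['Element'])['check'].sum()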

which gives the desired output.
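The sum-based check treats a row as "all zero" whenever its degrees add up to 0, which matches the question's data because no value is negative. As a hedged sketch of a stricter variant, the example below compares every degree column to 0 and requires all of them to match. The data here is hypothetical (it includes a -1 so that the two checks disagree), and the names degree_cols and all_zero are illustrative only.

```python
import pandas as pd

# Hypothetical data: the second row sums to 0 although not every value is 0.
df = pd.DataFrame({
    "Element": ["X1", "X1", "X1"],
    "Degree A": [0, 1, 0],
    "Degree B": [0, -1, 0],
    "Degree C": [0, 0, 0],
})

degree_cols = ["Degree A", "Degree B", "Degree C"]

# True only where every degree column in the row is exactly 0.
all_zero = df[degree_cols].eq(0).all(axis=1)

# Summing booleans per group counts the True rows.
counts = all_zero.groupby(df["Element"]).sum()
print(counts)
```

With this data the strict check counts two rows for X1, while a sum-based check would count three. If the degrees can never be negative, both approaches should agree.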

You can sum the 'Degree'-like columns and check whether the result equals 0 using filter + sum(axis=1) + eq(0). This produces a boolean Series that can be passed to loc, which returns a DataFrame containing only the rows that sum to 0.

On this result you can use groupby.size(), which counts the occurrences of each element:

df.loc[df.filter(like='Degree').sum(1).eq(0)]\
    .groupby('Element', as_index=False).size()

  Element  size
0      X1     2
1      X2     2
2      X3     1
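One thing to be aware of with filtering before grouping: an Element that has no all-zero rows is dropped by loc and therefore does not appear in the result at all. If a count of 0 is wanted for such elements, one possible alternative is to group the boolean mask itself. The sketch below uses hypothetical data (X4 is invented for illustration) and the illustrative names mask, filtered and kept.

```python
import pandas as pd

# Hypothetical data: X4 has no row where all degrees are 0.
df = pd.DataFrame({
    "Element": ["X1", "X1", "X4", "X4"],
    "Degree A": [0, 0, 1, 0],
    "Degree B": [0, 1, 0, 2],
    "Degree C": [0, 0, 0, 0],
})

# Boolean Series: True where the degree columns of a row sum to 0.
mask = df.filter(like="Degree").sum(axis=1).eq(0)

# Filtering first: X4 has no matching rows, so it is absent from the result.
filtered = df.loc[mask].groupby("Element").size()

# Grouping the mask itself: X4 is kept, with a count of 0.
kept = mask.groupby(df["Element"]).sum()

print(filtered)
print(kept)
```

For the data in the question every element has at least one all-zero row, so both forms should give the same counts there.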

Data sample:

import pandas as pd

# Same values as the table in the question
df = pd.DataFrame({
    'Element': ['X1', 'X1', 'X1', 'X2', 'X2', 'X2', 'X3', 'X3', 'X3'],
    'Degree A': [0, 1, 0, 1, 0, 0, 0, 1, 0],
    'Degree B': [0, 1, 0, 0, 0, 0, 0, 1, 1],
    'Degree C': [0, 0, 0, 1, 0, 0, 0, 0, 0],
})
