Pandas groupby on multiple columns

Question

I have a data set that contains state codes and their statuses.

  code  status
1   AZ  a
2   CA  b
3   KS  c
4   MO  c
5   NY  d
6   AZ  d
7   MO  a
8   MO  b
9   MN  b
10  NV  a
11  NV  e
12  MO  f
13  NY  a
14  NY  a
15  NY  b

I want to filter this data set down to the codes that have status a, and count how many such rows each code has. Example output:

  code  status  
1   AZ  a   
2   MO  a   
3   NY  a   

    AZ =1   MO = 1  NY =2

I used df.groupyby("code").loc[df.status == 'a'] but didn't have any luck. Any help appreciated!

Answer 1

Filter the dataframe for status a first, then group by code and count the rows in each group.

df[df.status == 'a'].groupby('code').size()

Output:

code
AZ    1
MO    1
NV    1
NY    2
dtype: int64
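
For anyone who wants to run this end to end, here is a minimal sketch that rebuilds the table from the question and applies the same filter-then-group step. The names `counts` and `alt` are illustrative, and the `value_counts` line is shown only as one equivalent alternative, not as the answer's own method.

```python
import pandas as pd

# Rebuild the table from the question.
df = pd.DataFrame({
    "code": ["AZ", "CA", "KS", "MO", "NY", "AZ", "MO", "MO",
             "MN", "NV", "NV", "MO", "NY", "NY", "NY"],
    "status": ["a", "b", "c", "c", "d", "d", "a", "b",
               "b", "a", "e", "f", "a", "a", "b"],
})

# Keep only rows whose status is 'a', then count rows per code.
counts = df[df["status"] == "a"].groupby("code").size()
print(counts)

# Equivalent: count each code among the filtered rows.
# sort_index() orders by code so the result lines up with the groupby output.
alt = df.loc[df["status"] == "a", "code"].value_counts().sort_index()
```

The attempt in the question cannot work as written because `.loc` is an indexer on a DataFrame or Series, not on the object returned by `groupby`, so the row filter has to be applied before grouping.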

Answer 2

I've recreated your dataset:

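import pandas as pd

# One list per column: state codes first, then their statuses.
data = [["AZ", "CA", "KS", "MO", "NY", "AZ", "MO", "MO", "MN", "NV", "NV", "MO", "NY", "NY", "NY"],
        ["a", "b", "c", "c", "d", "d", "a", "b", "b", "a", "e", "f", "a", "a", "b"]]

# Each inner list becomes a row, so transpose to get one column per list.
df = pd.DataFrame(data).T
df.columns = ["code", "status"]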

df[df["status"] == "a"].groupby(["code", "status"]).size()

gives:

code  status
AZ    a         1
MO    a         1
NV    a         1
NY    a         2
dtype: int64
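
Grouping on both columns also works without filtering first, in which case it counts every (code, status) pair at once. The sketch below assumes the same data as in the question; `pair_counts`, `table`, and `a_nonzero` are illustrative names, and reshaping with `unstack` is one possible way to read off a single status afterwards.

```python
import pandas as pd

df = pd.DataFrame({
    "code": ["AZ", "CA", "KS", "MO", "NY", "AZ", "MO", "MO",
             "MN", "NV", "NV", "MO", "NY", "NY", "NY"],
    "status": ["a", "b", "c", "c", "d", "d", "a", "b",
               "b", "a", "e", "f", "a", "a", "b"],
})

# Count rows for every (code, status) combination present in the data.
pair_counts = df.groupby(["code", "status"]).size()

# Pivot statuses into columns; combinations that never occur become 0.
table = pair_counts.unstack(fill_value=0)

# Read off the 'a' column and drop codes that have no 'a' rows.
a_counts = table["a"]
a_nonzero = a_counts[a_counts > 0]
print(a_nonzero)
```

Filtering first is simpler when only one status matters; grouping on both columns is more useful when counts for several statuses are needed from the same pass.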

Answer 1: 2 votes, accepted, 2017-11-30 23:44:25

Answer 2: 0 votes, 2017-11-30 23:57:06
