
pandas groupby on multiple columns

I have a data set that contains state codes and a status for each row.

  code  status
1   AZ  a
2   CA  b
3   KS  c
4   MO  c
5   NY  d
6   AZ  d
7   MO  a
8   MO  b
9   MN  b
10  NV  a
11  NV  e
12  MO  f
13  NY  a
14  NY  a
15  NY  b

I want to filter this data set down to the codes that have status a, and count how many times each of those codes occurs with that status. Example output:

  code  status  
1   AZ  a   
2   MO  a   
3   NY  a   

    AZ = 1   MO = 1   NY = 2

I tried df.groupby("code").loc[df.status == 'a'] but didn't have any luck. Any help appreciated!

Let's filter the DataFrame for status a first, then group by code and count the rows in each group.

df[df.status == 'a'].groupby('code').size()

Output:

code
AZ    1
MO    1
NV    1
NY    2
dtype: int64
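
For anyone who wants to run this end to end, here is a minimal self-contained sketch. The DataFrame is rebuilt from the table in the question, and the names mask, counts and counts_vc are only illustrative. It also shows value_counts as an alternative that should give the same numbers. Filtering has to happen on the DataFrame itself, before grouping, because a GroupBy object is not indexed with .loc the way a DataFrame is.

```python
import pandas as pd

# Data copied from the table in the question.
df = pd.DataFrame({
    "code": ["AZ", "CA", "KS", "MO", "NY", "AZ", "MO", "MO",
             "MN", "NV", "NV", "MO", "NY", "NY", "NY"],
    "status": ["a", "b", "c", "c", "d", "d", "a", "b",
               "b", "a", "e", "f", "a", "a", "b"],
})

# Build a boolean mask on the DataFrame, then group only the surviving rows.
mask = df["status"] == "a"
counts = df[mask].groupby("code").size()

# Alternative: count how often each code appears among the filtered rows.
# sort_index() orders the result by code, matching the groupby output.
counts_vc = df.loc[mask, "code"].value_counts().sort_index()

print(counts)
print(counts_vc)
```

Note that NV shows up in the result because row 10 of the data has status a, even though the expected output in the question does not list it.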

I've recreated your dataset:

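import pandas as pd

# First inner list holds the codes, second holds the statuses.
data = [["AZ","CA", "KS","MO","NY","AZ","MO","MO","MN","NV","NV","MO","NY","NY" ,"NY"],
       ["a","b","c","c","d","d","a","b","b","a","e","f","a","a","b"]]

# Transpose so that each (code, status) pair becomes one row, then name the columns.
df = pd.DataFrame(data)
df = df.T
df.columns = ["code","status" ]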

df[df["status"] == "a"].groupby(["code", "status"]).size()

gives

code  status
AZ    a         1
MO    a         1
NV    a         1
NY    a         2
dtype: int64
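
If a flat table or the "AZ = 1" style line from the question is wanted, the grouped result can be reshaped afterwards. The following is only a sketch under that assumption: the column name "count" and the variables result and summary are made up for illustration, and the data is rebuilt here so the snippet runs on its own.

```python
import pandas as pd

# Same data as in the question, rebuilt so this snippet is self-contained.
df = pd.DataFrame({
    "code": ["AZ", "CA", "KS", "MO", "NY", "AZ", "MO", "MO",
             "MN", "NV", "NV", "MO", "NY", "NY", "NY"],
    "status": ["a", "b", "c", "c", "d", "d", "a", "b",
               "b", "a", "e", "f", "a", "a", "b"],
})

# Group on both columns, then turn the MultiIndex Series into a flat DataFrame.
# name="count" labels the column that holds the group sizes.
result = (
    df[df["status"] == "a"]
    .groupby(["code", "status"])
    .size()
    .reset_index(name="count")
)

# Build a one-line summary in the style used in the question.
summary = "   ".join(f"{code} = {n}" for code, n in zip(result["code"], result["count"]))

print(result)
print(summary)
```

Grouping on ["code", "status"] after filtering to a single status gives the same counts as grouping on "code" alone; the extra key only keeps the status visible in the output.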
