Create a new column based on groupby a column value and count of another column in pandas?

I have a pandas DataFrame:

import pandas as pd
df = pd.DataFrame({'Birds': ['Falcon','Falcon','Parrot','Peacock','Peacock'],
                   'Name': ['A', 'D', 'B', 'C', 'C']})

I need to create a new column, so that the result looks like this:

df = pd.DataFrame({'Birds': ['Falcon','Falcon','Parrot','Peacock','Peacock'],
                   'Name': ['A', 'D', 'B', 'C', 'C'],
                   'Count':['1','1','0','0','0'] })

Falcon has two names, so each of its records gets 1. Parrot and Peacock each have only one name (B for Parrot, C for Peacock), so the new column is 0 for them.

I tried using groupby:

df.groupby(['Birds','Name']).size()

This returns:

Birds    Name
Falcon   A        1
         D        1
Parrot   B        1
Peacock  C        2
dtype: int64

I'm not sure how to convert this into the new column.

Another way is to take a subset and drop duplicates:

df2 = df.drop_duplicates(subset=['Birds', 'Name'], keep='first')
df2['Birds'].value_counts()

This returns:

Falcon     2
Peacock    1
Parrot     1
Name: Birds, dtype: int64

I'm not sure how to use this to create the new column of 1s and 0s in the original DataFrame.

You can use transform combined with nunique:

df["count"] =  df.groupby("Birds")["Name"].transform(lambda x: x.nunique() - 1)
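
To see what that line does, here is a small sketch that rebuilds the frame from the question, computes the number of distinct names per bird, and then lets transform broadcast that number back onto every row. The variable name distinct_per_bird is only illustrative.

```python
import pandas as pd

df = pd.DataFrame({'Birds': ['Falcon', 'Falcon', 'Parrot', 'Peacock', 'Peacock'],
                   'Name': ['A', 'D', 'B', 'C', 'C']})

# One value per group: how many distinct names each bird has.
distinct_per_bird = df.groupby("Birds")["Name"].nunique()

# transform returns a Series aligned to the original rows,
# so it can be assigned straight back as a new column.
df["count"] = df.groupby("Birds")["Name"].transform(lambda x: x.nunique() - 1)

print(distinct_per_bird.to_dict())  # {'Falcon': 2, 'Parrot': 1, 'Peacock': 1}
print(df["count"].tolist())         # [1, 1, 0, 0, 0]
```

The duplicate Peacock/C rows count once because nunique looks at distinct values, which is why Peacock ends up with 0.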

Without lambda - Option 1

df["count"] =  df.groupby("Birds")["Name"].transform("nunique") - pd.Series([1] * df.shape[0])
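
One caveat with Option 1: subtracting a pd.Series aligns on index labels, so it appears to rely on df having the default 0..n-1 index. The sketch below uses a made-up non-default index (an assumption for illustration, not part of the question) to show the difference; subtracting the scalar 1 sidesteps alignment.

```python
import pandas as pd

# Same data as the question, but with a non-default index (hypothetical).
df = pd.DataFrame({'Birds': ['Falcon', 'Falcon', 'Parrot', 'Peacock', 'Peacock'],
                   'Name': ['A', 'D', 'B', 'C', 'C']},
                  index=[10, 11, 12, 13, 14])

nunique = df.groupby("Birds")["Name"].transform("nunique")

# The helper Series is indexed 0..4, so no labels match and every value is NaN.
misaligned = nunique - pd.Series([1] * df.shape[0])

# A scalar is broadcast to every row regardless of the index.
df["count"] = nunique - 1

print(misaligned.isna().all())
print(df["count"].tolist())
```

With the default index used in the question, both forms give the same result.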

Without lambda - Option 2

df["count"] =  df.groupby("Birds")["Name"].transform("nunique")
df["count"] =  df["count"] -1
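
Note that nunique minus 1 gives 2 for a bird with three distinct names. If the column is meant to be a strict 1/0 flag, as the question's wording suggests (this is an assumption about the intent), comparing against 1 is one possible variant. The extra Falcon/E row and the flag column name are made up for the example.

```python
import pandas as pd

# Question's data plus one hypothetical extra row (Falcon, E).
df = pd.DataFrame({'Birds': ['Falcon', 'Falcon', 'Falcon', 'Parrot', 'Peacock', 'Peacock'],
                   'Name': ['A', 'D', 'E', 'B', 'C', 'C']})

distinct = df.groupby("Birds")["Name"].transform("nunique")

df["count"] = distinct - 1               # number of extra distinct names
df["flag"] = distinct.gt(1).astype(int)  # 1 if more than one distinct name, else 0

print(df["count"].tolist())  # [2, 2, 2, 0, 0, 0]
print(df["flag"].tolist())   # [1, 1, 1, 0, 0, 0]
```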

