Create a new column based on a groupby of one column and the count of another column in pandas?
I have a pandas DataFrame:
df = pd.DataFrame({'Birds': ['Falcon','Falcon','Parrot','Peacock','Peacock'],
'Name': ['A', 'D', 'B', 'C', 'C']})
I need to create a new column:
df = pd.DataFrame({'Birds': ['Falcon','Falcon','Parrot','Peacock','Peacock'],
'Name': ['A', 'D', 'B', 'C', 'C'],
'Count':['1','1','0','0','0'] })
Falcon has two distinct names (A and D), so each of its records gets 1. Parrot and Peacock each have only one distinct name (B for Parrot, C for Peacock), so their records get 0.
I tried using groupby:
df.groupby(['Birds','Name']).size()
This returns:
Birds    Name
Falcon   A       1
         D       1
Parrot   B       1
Peacock  C       2
dtype: int64
I'm not sure how to convert this into the new column.
Another way is to take the subset of columns and drop duplicates:
df2 = df.drop_duplicates(subset=['Birds', 'Name'], keep='first')
df2['Birds'].value_counts()
This returns:
Falcon     2
Peacock    1
Parrot     1
Name: Birds, dtype: int64
I'm not sure how to use this to create the new column of 1s and 0s in the original DataFrame.
You can use transform combined with nunique:
df["count"] = df.groupby("Birds")["Name"].transform(lambda x: x.nunique() - 1)
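As a quick check, here is a minimal self-contained sketch that rebuilds the DataFrame from the question and applies the same line. The printed list is what this particular sample data should give; note that the result is an integer column, whereas the question shows the values as strings.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "Birds": ["Falcon", "Falcon", "Parrot", "Peacock", "Peacock"],
    "Name": ["A", "D", "B", "C", "C"],
})

# For every row, count the distinct names inside its Birds group, then subtract 1.
# transform returns a result aligned with the original rows, so it can be assigned directly.
df["count"] = df.groupby("Birds")["Name"].transform(lambda x: x.nunique() - 1)

print(df["count"].tolist())  # [1, 1, 0, 0, 0]
```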
Without lambda - Option 1
# Build the Series of ones on df's own index so the subtraction aligns row by row
df["count"] = df.groupby("Birds")["Name"].transform("nunique") - pd.Series(1, index=df.index)
Without lambda - Option 2
df["count"] = df.groupby("Birds")["Name"].transform("nunique")
df["count"] = df["count"] -1
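One thing to watch: nunique minus 1 only yields 0 or 1 when no group has more than two distinct names. The sketch below uses a hypothetical extra Falcon row (name "E", not in the question) to show the difference, and adds a strict 0/1 flag as one possible alternative. It also finishes the drop_duplicates / value_counts attempt from the question by mapping the counts back onto the rows. The column names flag and count_via_map are made up for illustration.

```python
import pandas as pd

# Question data plus one hypothetical extra row (Falcon, "E") to get three distinct names
df = pd.DataFrame({
    "Birds": ["Falcon", "Falcon", "Falcon", "Parrot", "Peacock", "Peacock"],
    "Name": ["A", "D", "E", "B", "C", "C"],
})

# Distinct names per Birds group, broadcast back to every row
distinct = df.groupby("Birds")["Name"].transform("nunique")

# The approach above: number of distinct names minus 1 (can exceed 1)
df["count"] = distinct - 1

# Strict indicator: 1 if the bird has more than one distinct name, else 0
df["flag"] = (distinct > 1).astype(int)

# Same numbers via the question's second attempt: count unique (Birds, Name) pairs
# per bird, then look each row's bird up in that Series with map
name_counts = df.drop_duplicates(subset=["Birds", "Name"])["Birds"].value_counts()
df["count_via_map"] = df["Birds"].map(name_counts) - 1

print(df)
```

If the goal is only to mark birds that have more than one name, the flag column is probably the safer choice; if the number of additional names matters, count keeps that information.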