I have the following df:
inv_id    cluster_id
793       2
          2
789       3
789       3
          4
          4
I would like to groupby cluster_id and check how many unique inv_id values each group has:
df['same_inv_id'] = df.groupby('cluster_id')['inv_id'].transform('nunique') == 1
but I would like to set same_inv_id = False when a cluster contains only empty/blank inv_id values, and also when a cluster contains one or more empty/blank inv_id values, so the result will look like:
inv_id    cluster_id    same_inv_id
793       2             False
          2             False
789       3             True
789       3             True
          4             False
          4             False
IIUC, build the "not blank" condition first, then use transform + all so that a cluster is True only when every inv_id in it is non-blank:
s1=df.inv_id.ne('').groupby(df.cluster_id).transform('all')
s1
Out[432]:
0 False
1 False
2 True
3 True
4 False
5 False
Name: inv_id, dtype: bool
s2=df.groupby('cluster_id')['inv_id'].transform('nunique') == 1
# a cluster qualifies only if it has no blank inv_id (s1) and exactly one distinct inv_id (s2)
df['same_inv_id']=s1&s2
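For reference, here is a minimal end-to-end sketch of the approach above. It assumes the blank inv_id entries are empty strings (''), that inv_id is stored as strings, and that the frame is built by hand as shown; none of that is confirmed by the question, so adjust it to the real data.

```python
import pandas as pd

# Assumed reconstruction of the sample data: blanks are empty strings
df = pd.DataFrame({
    "inv_id": ["793", "", "789", "789", "", ""],
    "cluster_id": [2, 2, 3, 3, 4, 4],
})

# True on every row of a cluster in which no inv_id is blank
s1 = df["inv_id"].ne("").groupby(df["cluster_id"]).transform("all")

# True on every row of a cluster that has exactly one distinct inv_id
# (an empty string counts as a value here, so s2 alone is True for cluster 4)
s2 = df.groupby("cluster_id")["inv_id"].transform("nunique") == 1

# Both conditions must hold
df["same_inv_id"] = s1 & s2
print(df)
```

If the blanks are actually NaN rather than '', the first condition would probably need df["inv_id"].notna() instead of .ne(""), because nunique ignores NaN by default and NaN compares as not equal to ''.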