
How to save max row values for each column in python dataframe?

I have a dataframe like:

   Name  A   B   C
0  Sen   1   0   NaN
1  Kes   0   1   0
2  Pas   0   0   1
3  Sen   0   0   NaN
4  Pas   0   0   2

I would like to drop duplicates for each column individually, with one rule:

The Name column is the key.

For example, Sen is duplicated, but its value changes only in A; in B and C the value is the same in both rows. So for A I would like to do an OR operation: keep Sen's value of 1 in A and populate NaN in the other row.

Basically, I don't want to drop the entire row on duplication, but rather modify the values inside each column, for all columns.

Expected output:

   Name  A     B   C
0  Sen   1     0   NaN
1  Kes   0     1   0
2  Pas   0     0   NaN
3  Sen   NaN   0   NaN
4  Pas   0     0   2

We can use groupby + max to get each Name's maximum per column, then where to keep only the values equal to that maximum:

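s = df.groupby('Name').max().reindex(df['Name']).to_numpy()  # per-Name column maxima, one row per row of df
vals = df.drop(columns='Name')  # the positional axis in df.drop('Name', 1) was removed in pandas 2.0
vals.where(vals == s)  # keep values equal to their group maximum, NaN elsewhere
     A  B    C
0  1.0  0  NaN
1  0.0  1  0.0
2  0.0  0  NaN
3  NaN  0  NaN
4  0.0  0  2.0
# To write the result back into df:
# df.loc[:, 'A':] = vals.where(vals == s)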
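
For anyone who wants to try this end to end, here is a minimal self-contained sketch of the same idea. It is a variant, not the exact code above: it uses groupby(...).transform("max") so the group maxima come back already aligned to the original rows, which avoids the reindex step. The names value_cols, group_max and result are just illustrative, and the DataFrame is rebuilt by hand from the sample in the question.

```python
import numpy as np
import pandas as pd

# Sample data rebuilt from the question
df = pd.DataFrame({
    "Name": ["Sen", "Kes", "Pas", "Sen", "Pas"],
    "A": [1, 0, 0, 0, 0],
    "B": [0, 1, 0, 0, 0],
    "C": [np.nan, 0, 1, np.nan, 2],
})

value_cols = ["A", "B", "C"]  # illustrative name: every column except the key

# Per-Name maximum of each column, broadcast back to the shape of df
group_max = df.groupby("Name")[value_cols].transform("max")

# Keep a value only where it equals its group's maximum; everything else becomes NaN
result = df.copy()
result[value_cols] = df[value_cols].where(df[value_cols] == group_max)

print(result)
```

Note that ties are all kept: both Pas rows hold 0 in A, which is the group maximum, so neither is replaced. That matches the expected output in the question, but if only one row per group should survive, a different rule would be needed.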
