How to save max row values for each column in python dataframe?

Question

I have a dataframe like:

   Name  A   B   C
0  Sen   1   0   NaN
1  Kes   0   1   0
2  Pas   0   0   1
3  Sen   0   0   NaN
4  Pas   0   0   2

I would like to drop duplicates for each column individually, with one rule:

Name column is the key.

For example, Sen is duplicated, but its value changes only in A; in B and C its value is the same. So for A I would like to do an OR-like operation: keep Sen's A value of 1, and populate NaN in the other row.

Basically, I don't want to drop the entire row for a duplicate, but rather modify the values inside each column, for all columns.

Expected output:

   Name  A     B   C
0  Sen   1     0   NaN
1  Kes   0     1   0
2  Pas   0     0   NaN
3  Sen   NaN   0   NaN
4  Pas   0     0   2

Answer 1

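We can combine groupby + max with where: compute each Name's per-column maximum, align it to the original rows, and keep only the values that equal that maximum.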

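# Per-Name column maxima, repeated in the original row order
s = df.groupby('Name').max().reindex(df['Name']).values
# Positional axis (df.drop('Name', 1)) was removed in pandas 2.0; use columns=
vals = df.drop(columns='Name')
# Keep a value only where it equals its group's maximum; the rest become NaN
vals.where(vals == s)
     A  B    C
0  1.0  0  NaN
1  0.0  1  0.0
2  0.0  0  NaN
3  NaN  0  NaN
4  0.0  0  2.0
# To write the result back into df:
# df.loc[:, 'A':] = vals.where(vals == s)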
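
For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same idea. It is a variation rather than the exact code above: it swaps the reindex step for groupby(...).transform('max'), which returns the group maxima already aligned to the original index. The names value_cols, group_max and result are only illustrative, and the sample frame is rebuilt from the question.

```python
import numpy as np
import pandas as pd

# Sample data rebuilt from the question
df = pd.DataFrame({
    "Name": ["Sen", "Kes", "Pas", "Sen", "Pas"],
    "A": [1, 0, 0, 0, 0],
    "B": [0, 1, 0, 0, 0],
    "C": [np.nan, 0, 1, np.nan, 2],
})

value_cols = ["A", "B", "C"]  # illustrative: every column except the key

# Per-Name maximum of each column, broadcast back to the original row index
group_max = df.groupby("Name")[value_cols].transform("max")

# Keep a value only where it equals its group's maximum; others become NaN.
# NaN never compares equal, so rows in an all-NaN group stay NaN.
result = df.copy()
result[value_cols] = df[value_cols].where(df[value_cols].eq(group_max))

print(result)
```

Note that ties are all kept: Pas has 0 in both rows of A, and both survive, which matches the expected output in the question. If only one row per group should keep the maximum, this approach would need an extra step.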

solution1
1 2020-05-25 13:09:48