Finding the maximum difference for a subset of columns with pandas

I have a dataframe:

   A   B    C   D   E
0  a  34   55  43  aa
1  b  53   77  65  bb
2  c  23  100  34  cc
3  d  54   43  23  dd
4  e  23   67  54  ee
5  f  43   98  23  ff

For each row, I need the maximum difference among columns B, C and D, and I want to store that value in column A. For example, in row 'a' the maximum difference is 55 - 34 = 21. The data is in a DataFrame.

The expected result is

    A   B    C   D   E
0  21  34   55  43  aa
1  24  53   77  65  bb
2  77  23  100  34  cc
3  31  54   43  23  dd
4  44  23   67  54  ee
5  75  43   98  23  ff
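
To make the question reproducible, the input frame can be rebuilt as in the sketch below. The values are copied from the table in the question; the column dtypes (integers for B, C, D and strings for A, E) are an assumption based on how the frame prints.

```python
import pandas as pd

# Sample data copied from the question; dtypes are assumed from the printout.
df = pd.DataFrame({
    'A': list('abcdef'),
    'B': [34, 53, 23, 54, 23, 43],
    'C': [55, 77, 100, 43, 67, 98],
    'D': [43, 65, 34, 23, 54, 23],
    'E': ['aa', 'bb', 'cc', 'dd', 'ee', 'ff'],
})
print(df)
```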

Use np.ptp, which returns the range (maximum minus minimum) along an axis:

import numpy as np

# Equivalent, selecting the columns by label slice:
# df['A'] = np.ptp(df.loc[:, 'B':'D'], axis=1)
df['A'] = np.ptp(df[['B', 'C', 'D']], axis=1)
df

    A   B    C   D   E
0  21  34   55  43  aa
1  24  53   77  65  bb
2  77  23  100  34  cc
3  31  54   43  23  dd
4  44  23   67  54  ee
5  75  43   98  23  ff
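
As a quick illustration of what np.ptp computes, here is a minimal sketch on plain NumPy arrays, using the first two rows of the question's data. The variable names are made up for the example.

```python
import numpy as np

# First row of columns B, C, D from the question.
row = np.array([34, 55, 43])
spread = np.ptp(row)  # maximum minus minimum of the row: 55 - 34

# With a 2-D array, axis=1 computes the range of each row separately.
values = np.array([[34, 55, 43],
                   [53, 77, 65]])
per_row = np.ptp(values, axis=1)
print(spread, per_row)
```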

Or, find the max and min yourself:

df['A'] = df[['B', 'C', 'D']].max(axis=1) - df[['B', 'C', 'D']].min(axis=1)
df

    A   B    C   D   E
0  21  34   55  43  aa
1  24  53   77  65  bb
2  77  23  100  34  cc
3  31  54   43  23  dd
4  44  23   67  54  ee
5  75  43   98  23  ff

If performance is important, you can do this in NumPy space:

v = df[['B', 'C', 'D']].to_numpy()
df['A'] = v.max(axis=1) - v.min(axis=1)
df

    A   B    C   D   E
0  21  34   55  43  aa
1  24  53   77  65  bb
2  77  23  100  34  cc
3  31  54   43  23  dd
4  44  23   67  54  ee
5  75  43   98  23  ff
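
The three approaches should give identical results. The sketch below checks this on a larger frame of random integers; the frame, the seed and the variable names are made up for illustration and are not part of the original question. No timings are claimed here, so measure on your own data if performance matters.

```python
import numpy as np
import pandas as pd

# Hypothetical test data: 1000 rows of random integers in columns B, C, D.
rng = np.random.default_rng(0)
cols = ['B', 'C', 'D']
df = pd.DataFrame(rng.integers(0, 100, size=(1000, 3)), columns=cols)

v = df[cols].to_numpy()

via_ptp = np.ptp(v, axis=1)                                             # np.ptp on the array
via_pandas = (df[cols].max(axis=1) - df[cols].min(axis=1)).to_numpy()  # pandas max/min
via_numpy = v.max(axis=1) - v.min(axis=1)                               # NumPy max/min

# All three row-wise ranges agree element by element.
assert np.array_equal(via_ptp, via_pandas)
assert np.array_equal(via_ptp, via_numpy)
```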

