简体   繁体   中英

Return max value of each row of a group of columns

I have a table of over 10,000 rows and over 400 columns. For columns containing at least the string 'xyz', I need to find the max value of each row (within these 'xyz' columns), and create 2 new columns.

The 1st new column would contain the max value of each row of these 'xyz' columns.

The 2nd new column would contain the column name from which the max value was retrieved. I'm stuck at creating the 2nd column. I've tried some stuff which doesn't work like;

Match = df[CompCol].isin[SpecList].all(axis=1)

How should approach the 2nd column?

another way using 'regex' and 'idmax.

    df = pd.DataFrame({'xyz1': [10, 20, 30, 40], 'xyz2': [11, 12,13,14],'xyz3':[1,2,3,44],'abc':[100,101,102,103]})

    df['maxval']= df.filter(regex='xyz').apply(max, axis=1)

    df['maxval_col'] = df.filter(regex='xyz').idxmax(axis=1)


    abc    xyz1  xyz2  xyz3  maxval   maxval_col
    100    10    11     1      11     xyz2
    101    20    12     2      20     xyz1
    102    30    13     3      30     xyz1
    103    40    14    44      44     xyz3

Does this work for you?

import pandas as pd
df = pd.DataFrame([(1,2,3,4),(2,1,1,4)], columns = ['xyz1','xyz2','xyz3','abc'])
cols = [k for k in df.columns if 'xyz' in k]

df['maxval'] = df[cols].apply(lambda s: max(zip(s, s.keys()))[0],1)
df['maxcol'] = df[cols].apply(lambda s: max(zip(s, s.keys()))[1],1)

df

Out[753]: 
   xyz1  xyz2  xyz3  abc  maxval maxcol
0     1     2     3    4       3   xyz3
1     2     1     1    4       2   xyz1

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM