Return the max value of each row across a group of columns

Question

I have a table with over 10,000 rows and over 400 columns. For the columns whose names contain the string 'xyz', I need to find the max value of each row (within these 'xyz' columns only) and create 2 new columns.

The 1st new column would contain the max value of each row across these 'xyz' columns.

The 2nd new column would contain the name of the column from which the max value was retrieved. I'm stuck at creating the 2nd column. I've tried a few things that don't work, such as:

Match = df[CompCol].isin[SpecList].all(axis=1)

How should I approach the 2nd column?

Answer 1

Another way, using 'regex' and 'idxmax':

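    import pandas as pd

    df = pd.DataFrame({'xyz1': [10, 20, 30, 40], 'xyz2': [11, 12, 13, 14], 'xyz3': [1, 2, 3, 44], 'abc': [100, 101, 102, 103]})

    # Row-wise max over the columns whose names match 'xyz'
    df['maxval'] = df.filter(regex='xyz').max(axis=1)

    # Name of the column holding that max
    df['maxval_col'] = df.filter(regex='xyz').idxmax(axis=1)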


    abc    xyz1  xyz2  xyz3  maxval   maxval_col
    100    10    11     1      11     xyz2
    101    20    12     2      20     xyz1
    102    30    13     3      30     xyz1
    103    40    14    44      44     xyz3
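A side note on the snippet above: 'filter(regex='xyz')' selects every column whose name contains 'xyz' anywhere, and 'idxmax' reports the first such column when a row has a tie. The sketch below illustrates both points on a made-up frame; the column names 'a_xyz', 'xyz_b' and 'other' are hypothetical, and 'filter(like=...)' is used here as a plain-substring alternative to the regex.

```python
import pandas as pd

# Hypothetical data: 'a_xyz' and 'xyz_b' tie in the first row.
df = pd.DataFrame({'a_xyz': [5, 1], 'xyz_b': [5, 7], 'other': [9, 9]})

xyz = df.filter(like='xyz')             # plain substring match on column names
df['maxval'] = xyz.max(axis=1)          # row-wise maximum of the selected columns
df['maxval_col'] = xyz.idxmax(axis=1)   # first column holding that maximum

print(df)
```

Selecting the 'xyz' columns once, before adding the new columns, also avoids filtering the frame twice.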

Answer 2

Does this work for you?

    import pandas as pd

    df = pd.DataFrame([(1, 2, 3, 4), (2, 1, 1, 4)], columns=['xyz1', 'xyz2', 'xyz3', 'abc'])
    cols = [k for k in df.columns if 'xyz' in k]  # columns whose names contain 'xyz'

    # Pair each value with its column name; max() picks the largest pair in the row
    df['maxval'] = df[cols].apply(lambda s: max(zip(s, s.keys()))[0], axis=1)
    df['maxcol'] = df[cols].apply(lambda s: max(zip(s, s.keys()))[1], axis=1)

    df

    Out[753]:
       xyz1  xyz2  xyz3  abc  maxval maxcol
    0     1     2     3    4       3   xyz3
    1     2     1     1    4       2   xyz1
Answer 1: 3, 2015-05-12 06:00:28

Answer 2: 0, accepted, 2015-05-12 04:42:18