Using pandas to get max value per row and column header

I have a data frame and want to get the max value for each row, together with the header of the column where that max value is located, and return the result as a new data frame. In reality my data frame has over 50 columns and over 30,000 rows:

df1:

ID   Tis   RNA   DNA   Prot   Node   Exv     
AB   1.4   2.3   0.0   0.3   2.4   4.4
NJ   2.2   3.4   2.1   0.0   0.0   0.2
KL   0.0   0.0   0.0   0.0   0.0   0.0
JC   5.2   4.4   2.1   5.4   3.4   2.3

So the ideal output looks like this:

df2:

ID  
AB   Exv   4.4
NJ   RNA   3.4
KL   N/A    N/A
JC   Prot   5.4

I have tried the following without any success:

df2 = df1.max(axis=1)
result.index = df1.idxmax(axis=1)

I also tried:

df2=pd.Series(df1.columns[np.argmax(df1.values,axis=1)])
final=pd.DataFrame(df1.lookup(s.index,s),s)

I have looked at other posts but still can't seem to solve this. Any help would be great.

If ID is the index, use DataFrame.agg with idxmax and max along axis=1, then mask the rows whose max is 0 so they become missing values:

df = df1.agg(['idxmax','max'], axis=1).mask(lambda x: x['max'].eq(0))
print (df)
   idxmax  max
AB    Exv  4.4
NJ    RNA  3.4
KL    NaN  NaN
JC   Prot  5.4
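
For anyone who wants to reproduce this, here is a minimal self-contained sketch. It rebuilds df1 from the sample rows in the question (assuming ID is the index) and uses an equivalent two-step formulation with idxmax, max and Series.where in place of agg. The names row_max, row_col and keep are only illustrative. It also shows why the masking step is needed: on an all-zero row, idxmax resolves the tie to the first column.

```python
import pandas as pd

# Sample data from the question, with ID as the index (an assumption).
df1 = pd.DataFrame(
    {
        "Tis": [1.4, 2.2, 0.0, 5.2],
        "RNA": [2.3, 3.4, 0.0, 4.4],
        "DNA": [0.0, 2.1, 0.0, 2.1],
        "Prot": [0.3, 0.0, 0.0, 5.4],
        "Node": [2.4, 0.0, 0.0, 3.4],
        "Exv": [4.4, 0.2, 0.0, 2.3],
    },
    index=pd.Index(["AB", "NJ", "KL", "JC"], name="ID"),
)

row_max = df1.max(axis=1)     # largest value in each row
row_col = df1.idxmax(axis=1)  # label of the column holding that value

# Ties resolve to the first column, so the all-zero row KL is labelled "Tis".
print(row_col["KL"])

# Keep only rows whose max is non-zero; the others become missing values.
keep = row_max.ne(0)
df2 = pd.DataFrame({"idxmax": row_col.where(keep), "max": row_max.where(keep)})
print(df2)
```

Because the two columns are built separately, the max column stays float64 in this variant.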

If ID is a column, set it as the index first:

df = df1.set_index('ID').agg(['idxmax','max'], axis=1).mask(lambda x: x['max'].eq(0))
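
As a rough end-to-end sketch of the column case, the snippet below builds the sample frame with ID as an ordinary column, applies the same expression, and then moves ID back into a regular column. The rename_axis and reset_index calls are an addition here, not part of the original answer. They are only needed if you want a flat frame like the df2 in the question.

```python
import pandas as pd

# Sample data from the question, with ID as an ordinary column.
df1 = pd.DataFrame(
    {
        "ID": ["AB", "NJ", "KL", "JC"],
        "Tis": [1.4, 2.2, 0.0, 5.2],
        "RNA": [2.3, 3.4, 0.0, 4.4],
        "DNA": [0.0, 2.1, 0.0, 2.1],
        "Prot": [0.3, 0.0, 0.0, 5.4],
        "Node": [2.4, 0.0, 0.0, 3.4],
        "Exv": [4.4, 0.2, 0.0, 2.3],
    }
)

df2 = (
    df1.set_index("ID")                       # keep ID out of the max computation
       .agg(["idxmax", "max"], axis=1)        # column label and value of each row max
       .mask(lambda x: x["max"].eq(0))        # all-zero rows become missing values
       .rename_axis("ID")                     # make sure the index is named ID
       .reset_index()                         # turn ID back into a regular column
)
print(df2)
```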
