Using pandas to get max value per row and column header
I have a data frame and I want to get the max value for each row, together with the header of the column where that max value is located, and return the result as a new dataframe. In reality my data frame has over 50 columns and over 30,000 rows:
df1:
ID Tis RNA DNA Prot Node Exv
AB 1.4 2.3 0.0 0.3 2.4 4.4
NJ 2.2 3.4 2.1 0.0 0.0 0.2
KL 0.0 0.0 0.0 0.0 0.0 0.0
JC 5.2 4.4 2.1 5.4 3.4 2.3
So the ideal output looks like this:
df2:
ID
AB Exv 4.4
NJ RNA 3.4
KL N/A N/A
JC Prot 5.4
I have tried the following without any success:
df2 = df1.max(axis=1)
result.index = df1.idxmax(axis=1)
I also tried:
df2=pd.Series(df1.columns[np.argmax(df1.values,axis=1)])
final=pd.DataFrame(df1.lookup(s.index,s),s)
I have looked at other posts but still can't seem to solve this. Any help would be great.
If ID is the index, use DataFrame.agg and replace the rows whose max is 0 with missing values:
df = df1.agg(['idxmax','max'], axis=1).mask(lambda x: x['max'].eq(0))
print (df)
idxmax max
AB Exv 4.4
NJ RNA 3.4
KL NaN NaN
JC Prot 5.4
If ID is a column, use:
df = df1.set_index('ID').agg(['idxmax','max'], axis=1).mask(lambda x: x['max'].eq(0))
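For anyone who wants to see the individual steps, here is a minimal, self-contained sketch of the same idea written out explicitly. It rebuilds the sample df1 from the question with ID as a regular column, and computes idxmax and max separately instead of going through agg. The names row_max, row_col and nonzero are only illustrative and do not come from the answer above.

```python
import pandas as pd

# Sample data from the question, with ID as an ordinary column.
df1 = pd.DataFrame(
    {
        "ID": ["AB", "NJ", "KL", "JC"],
        "Tis": [1.4, 2.2, 0.0, 5.2],
        "RNA": [2.3, 3.4, 0.0, 4.4],
        "DNA": [0.0, 2.1, 0.0, 2.1],
        "Prot": [0.3, 0.0, 0.0, 5.4],
        "Node": [2.4, 0.0, 0.0, 3.4],
        "Exv": [4.4, 0.2, 0.0, 2.3],
    }
)

# Move ID into the index so only numeric columns take part in the comparison.
values = df1.set_index("ID")

row_max = values.max(axis=1)      # largest value in each row
row_col = values.idxmax(axis=1)   # label of the column holding that value

# Rows whose maximum is 0 (all zeros in this sample) become missing values.
nonzero = row_max.ne(0)
df2 = pd.DataFrame(
    {"idxmax": row_col.where(nonzero), "max": row_max.where(nonzero)}
)

print(df2)
```

Note that idxmax returns the first matching column when several columns tie for the maximum, which is why the all-zero row KL has to be blanked out explicitly rather than relying on idxmax alone.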