
Pandas: add a column to indicate the 1st and 2nd places, according to row values

I have a data frame to which I want to add a column indicating, in each row, which "score" columns are ranked number 1 and number 2.

import pandas as pd
from io import StringIO

csvfile = StringIO(
"""Name Department  A_score B_score C_score D_score
Jason   Finance 7   3   7   9
Jason   Sales   2   2   9   2
Molly   Operation   3   7   1   2
""")

# the sample above is whitespace-separated, so split on runs of whitespace
df = pd.read_csv(csvfile, sep=r'\s+', engine='python')

# add columns holding the rank of each score within its row
# (numeric_only=True skips the text columns Name and Department)
df = df.join(df.rank(axis=1, ascending=False, numeric_only=True).astype(int).add_suffix('_rank'))

# return the column headers whose values are in [1, 2]
df_1 = df.apply(lambda x: x.isin([1,2]), axis=1).apply(lambda x: list(df.columns[x]), axis=1)

print(df_1)

# output as:
[A_score_rank, C_score_rank, D_score_rank]
[A_score, B_score, D_score, C_score_rank]
[C_score, D_score, A_score_rank, B_score_rank]

There are two problems:

  1. When checking which columns hold the first and second places, it also matches the "score" columns; I only want to check the "rank" columns.
  2. df_1 comes out as a separate object, not as part of the extended original data frame.

How can I solve these? Any help is appreciated. Thank you.

We can apply pd.Series.nlargest to each row, then use notna to mark the cells that were kept (the non-NaN ones), and dot that boolean frame with the column names to get the result.
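To make the two steps easier to follow, here is a minimal sketch on a single row taken from the question's sample data. It only illustrates the idea; the names row, top, kept, mask and joined are introduced here for the example and are not part of the original answer.

```python
import pandas as pd

# One row of scores from the question's sample data (Jason / Finance).
row = pd.Series({"A_score": 7, "B_score": 3, "C_score": 7, "D_score": 9})

# keep='all' retains every value tied with the n-th largest,
# so asking for 2 values can return 3 labels when second place is shared.
top = row.nlargest(2, keep="all")
kept = sorted(top.index)

# The dot trick in miniature: True * 'A_score,' is 'A_score,' and
# False * 'B_score,' is '', so concatenating the products keeps only
# the labels that survived nlargest. [:-1] drops the trailing comma.
mask = [name in top.index for name in row.index]
joined = "".join(flag * (name + ",") for flag, name in zip(mask, row.index))[:-1]

print(kept)
print(joined)
```

Applied to the whole frame, notna() plays the role of mask and dot performs the multiply-and-concatenate for every row at once.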

s = df.filter(like='score').apply(pd.Series.nlargest,n=2,keep='all',axis=1)
df['new'] = s.notna().dot(s.columns+',').str[:-1]
df
    Name Department  A_score  ...  C_score  D_score                      new
0  Jason    Finance        7  ...        7        9  A_score,C_score,D_score
1  Jason      Sales        3  ...        9        2          A_score,C_score
2  Molly  Operation        3  ...        1        2          A_score,B_score
[3 rows x 7 columns]
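If you would rather work from ranks, as in the question, the following is a rough sketch of one way to address both problems: it ranks only the score columns and writes the result straight onto df. The column name top_two is made up for this example, and method='min' is an assumption about how ties should be handled (tied scores share the better rank), which differs from truncating the default average rank with astype(int).

```python
import pandas as pd

# Sample data as given in the text of the question.
df = pd.DataFrame({
    "Name": ["Jason", "Jason", "Molly"],
    "Department": ["Finance", "Sales", "Operation"],
    "A_score": [7, 2, 3],
    "B_score": [3, 2, 7],
    "C_score": [7, 9, 1],
    "D_score": [9, 2, 2],
})

# Rank only the score columns, so Name/Department never enter the check.
scores = df.filter(like="_score")

# method='min' gives tied scores the same (best) rank, so ranks are whole numbers.
ranks = scores.rank(axis=1, ascending=False, method="min").astype(int)

# Problem 1: test the rank values only, never the raw scores.
in_top_two = ranks.isin([1, 2])

# Problem 2: store the labels as a new column on df itself.
df["top_two"] = in_top_two.apply(
    lambda flags: ",".join(flags.index[flags.to_numpy()]), axis=1
)

print(df[["Name", "Department", "top_two"]])
```

With the scores exactly as typed in the question, the Sales row has a three-way tie for second place (A, B and D all score 2), so under this tie rule all four columns are listed for that row.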


