Return the column name of the first matching value in each row as a new column

Question

I have a dataframe where the first column is an ID and every other column is a date. Each ID may show the same value in several columns, may have some leading NaN columns, or may be NaN in every column. I'd like to create a new column holding the name of the column where a specific entry first appears.

Sample df:

| id_report | req id | 1-Jan | 2-Jan | 3-Jan | 4-Jan |
| --------- | -------------- | ----- | ----- | ----- | ----- |
| 0   | 12345 | NaN | Pend | Pend | Appr |
| 1   | 12346  | NaN | NaN | NaN | NaN |
| 2   | 12347 | NaN | NaN | Pend | Pend |
| 3   | 12348  | NaN | NaN | NaN | Appr |

I've searched and come up with:

id_report["Pend"] = id_report.apply(lambda x: x == "Pend", axis = 1).idxmax(axis = 1)

But this returns "req id" for every row where "Pend" doesn't appear, and I'd like to keep those positions empty.
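For context, here is a minimal sketch of why the fallback is "req id". It assumes a small hypothetical frame (`df`, two rows) that mirrors the sample columns. On a boolean row that is entirely `False`, `idxmax` still has to return a label, and it returns the first one, which here is the first column.

```python
import numpy as np
import pandas as pd

# Hypothetical two-row frame mirroring the sample columns (not the real data).
df = pd.DataFrame(
    {
        "req id": [12345, 12346],
        "1-Jan": [np.nan, np.nan],
        "2-Jan": ["Pend", np.nan],
    }
)

mask = df.eq("Pend")         # boolean frame, True where a cell equals "Pend"
first = mask.idxmax(axis=1)  # label of the first maximum in each row

# Row 1 has no True at all, so its "first maximum" is the first column.
print(first.tolist())
```

So the "req id" values are not matches; they are what `idxmax` reports when a row has nothing to find.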

Desired output:

| id_report | req id | 1-Jan | 2-Jan | 3-Jan | 4-Jan | Pend |
| --------- | ------ | ----- | ----- | ----- | ----- | ----- |
| 0   | 12345 | NaN | Pend | Pend | Appr | 2-Jan |
| 1   | 12346 | NaN | NaN | NaN | NaN | NaN |
| 2   | 12347 | NaN | NaN | Pend | Pend | 3-Jan |
| 3   | 12348 | NaN | NaN | NaN | Appr | NaN |

Answer 1

You could chain a replace to your current code:

import numpy as np
id_report['Pend'] = (id_report
   .apply(lambda x: x == 'Pend', axis = 1)
   .idxmax(axis = 1)
   .replace('req id', np.nan)
)

|     | req id | 1-Jan | 2-Jan | 3-Jan | 4-Jan | Pend |
| --- | ------ | ----- | ----- | ----- | ----- | ----- |
| 0   | 12345 | NaN | Pend | Pend | Appr | 2-Jan |
| 1   | 12346 | NaN | NaN | NaN | NaN | NaN |
| 2   | 12347 | NaN | NaN | Pend | Pend | 3-Jan |
| 3   | 12348 | NaN | NaN | NaN | Appr | NaN |
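As a possible alternative (a sketch, not part of the answer above), the no-match rows can be blanked with a boolean mask instead of replacing the "req id" label afterwards. The frame below is rebuilt from the sample table, and the names `date_cols` and `mask` are introduced here for illustration only.

```python
import numpy as np
import pandas as pd

# Sample data rebuilt from the table in the question.
id_report = pd.DataFrame(
    {
        "req id": [12345, 12346, 12347, 12348],
        "1-Jan": [np.nan, np.nan, np.nan, np.nan],
        "2-Jan": ["Pend", np.nan, np.nan, np.nan],
        "3-Jan": [np.nan, np.nan, "Pend", np.nan],
        "4-Jan": ["Appr", np.nan, "Pend", "Appr"],
    }
)
# Row 0 is "Pend" on 2-Jan and 3-Jan in the sample.
id_report.loc[0, "3-Jan"] = "Pend"

# Only search the date columns, so the ID column can never be returned.
date_cols = id_report.columns.drop("req id")
mask = id_report[date_cols].eq("Pend")

# idxmax gives the first True per row; where() blanks rows with no True.
id_report["Pend"] = mask.idxmax(axis=1).where(mask.any(axis=1))

print(id_report)
```

This version does not depend on the name of the first column and avoids the row-wise `apply`, at the cost of one extra intermediate frame.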

Answer 1
0 2021-03-08 21:15:02
