Finding the index of a previously matching cell in pandas

Question

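I haven't used Python in a while, so apologies if this is a silly question!

I have a large panel dataset, i.e. multiple IDs observed over multiple days. Let's call it data1.

I also have data2, which is a list of the IDs that fall in a certain category.

I want to:

For each ID in data2, get the last day it was observed in data1
Get the value in the code column of data1 for that row

What I have so far is: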

for i in data2.id.unique():
    rows = data1[data1["ID"] == i]
    # latest date observed for this ID
    last_day = rows["datestamp"].max()
    # code value(s) recorded on that last day
    code = rows[rows["datestamp"] == last_day]["code"]
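For reference, a minimal sketch of one way to do the two steps listed above without an explicit loop. The sample rows are hypothetical, and the column names (ID, datestamp, code in data1, id in data2) are assumed from the snippet above; it also assumes datestamp is a sortable date column.

```python
import pandas as pd

# Hypothetical sample data; column names follow the question.
data1 = pd.DataFrame({
    "ID": ["01", "01", "02", "02", "03"],
    "datestamp": pd.to_datetime(
        ["2021-01-01", "2021-01-05", "2021-01-02", "2021-01-03", "2021-01-04"]
    ),
    "code": ["AAA", "BBB", "CCC", "AAA", "BBB"],
})
data2 = pd.DataFrame({"id": ["01", "02"]})

# Keep only the rows of data1 whose ID is listed in data2.
subset = data1[data1["ID"].isin(data2["id"])]

# For each ID, idxmax gives the row label of the latest datestamp;
# .loc then pulls those rows, including their code value.
last_rows = subset.loc[subset.groupby("ID")["datestamp"].idxmax()]
result = last_rows[["ID", "datestamp", "code"]].reset_index(drop=True)
print(result)
```

If an ID has several rows on its last day, idxmax keeps only the first of them, so this may need adjusting for data with duplicate dates.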

Edit: I worked out code to merge the two, so the new dataset now looks like this:

ID | length | code | payments
01 | 230    | AAA  | 1
02 | 106    | BBB  | 4
03 | 128    | CCC  | 2
04 | 96     | AAA  | 3
05 | 205    | CCC  | 5

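where length is the number of days the customer has been with the company.

Basically, I want a new column, new, that takes the value of length when code is AAA or CCC, and 0 otherwise.

I tried this: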

df['new']=[df['length'] for x in df['code'] if x in ["AAA","CCC"]]

But that didn't work. Then I tried this:

hello=[df['length'] for x in df['code'] if x in ["AAA","CCC"]]

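That runs, but every time the condition is met it returns the entire Series df["length"]. I'm not sure how to write it so that, when the condition is met, only that row's length value is used.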

Answer 1

I think you want to copy length into new depending on the value in code.

IIUC, this is what you want.

import pandas as pd
c = ['ID','length','code']
d = [['01',230,'AAA'],
['02',106,'BBB'],
['03',128,'CCC'],
['04',96,'AAA'],
['05',205,'CCC']]

df = pd.DataFrame(d,columns=c)
print (df)

df['new'] = df.apply(lambda x: x['length'] if x['code'] in ['AAA','CCC'] else 0, axis=1)
print (df)

With axis=1, the lambda is applied row by row, and the result is assigned to df['new'].

Original DataFrame:

   ID  length code
0  01     230  AAA
1  02     106  BBB
2  03     128  CCC
3  04      96  AAA
4  05     205  CCC

Updated DataFrame:

   ID  length code  new
0  01     230  AAA  230
1  02     106  BBB    0
2  03     128  CCC  128
3  04      96  AAA   96
4  05     205  CCC  205
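As a possible alternative to apply, the same column can be built with vectorized operations, which tends to be faster on large frames. The sketch below is not part of the answer's original code; it rebuilds the sample frame and uses Series.isin together with Series.where.

```python
import pandas as pd

# Same sample data as in the answer above.
df = pd.DataFrame({
    "ID": ["01", "02", "03", "04", "05"],
    "length": [230, 106, 128, 96, 205],
    "code": ["AAA", "BBB", "CCC", "AAA", "CCC"],
})

# Boolean mask: True where code is AAA or CCC.
mask = df["code"].isin(["AAA", "CCC"])

# Series.where keeps length where the mask is True and uses 0 elsewhere.
df["new"] = df["length"].where(mask, 0)
print(df)
```

This also shows why the list comprehension in the question misbehaved: it iterated over code but emitted the whole df['length'] Series each time, whereas the mask here selects values element by element.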

Solution 1
0 Accepted 2021-04-12 23:01:20