Update specific values in a pandas dataframe from another dataframe

Question

I have a dataframe containing chat logs:

id     time        author          text
a1    06:15:19     system        aaaaa
a1    13:57:50     Agent(Human)  ssfsd
a1    14:00:05     customer      ddg
a1    14:06:08     Agent(Human)  sdfg
a1    14:08:54     customer      sdfg
a1    15:58:48     Agent(Human)  jfghdfg
a1    16:18:41     customer      urtr
a1    16:51:38     Agent(Human)  erweg

I also have another dataframe of agents, which holds the time at which each agent started chatting. For example, df2:

id    agent_id    agent_time
a1     D01        13:57:50
a1     D02        15:58:48

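Now I want to update the values in the 'author' column with the values from 'agent_id', matched on that specific time, and fill the following author values that contain 'Agent(Human)' with the respective agent name.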

Desired final output:

id     time        author          text
a1    06:15:19     system        aaaaa
a1    13:57:50     D01           ssfsd
a1    14:00:05     customer      ddg
a1    14:06:08     D01           sdfg
a1    14:08:54     customer      sdfg
a1    15:58:48     D02           jfghdfg
a1    16:18:41     customer      urtr
a1    16:51:38     D02           erweg

I tried to do this with the .map() operation:

df1['author'] = df1['time'].map(df2.set_index('agent_time')['agent_id'])

But I got the wrong output:

id     time        author          text
a1    06:15:19     NaN           aaaaa
a1    13:57:50     D01           ssfsd
a1    14:00:05     NaN           ddg
a1    14:06:08     NaN           sdfg
a1    14:08:54     NaN           sdfg
a1    15:58:48     D02           jfghdfg
a1    16:18:41     NaN           urtr
a1    16:51:38     NaN           erweg

I also tried using the .loc method, but that did not work either.

Can anyone guide me on how to get the desired output? Any pointers would be helpful.

Answer 1

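I think your solution needs two additions: GroupBy.ffill to forward-fill the missing values within each id, and Series.where to keep the filled value only on the Agent(Human) rows and restore the original author value on every other row: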

m = df1['author'].eq('Agent(Human)')

df1['author'] = (df1['time'].map(df2.set_index('agent_time')['agent_id'])
                            .groupby(df1['id'])
                            .ffill()
                            .where(m, df1['author']))

print(df1)
   id      time    author     text
0  a1  06:15:19    system    aaaaa
1  a1  13:57:50       D01    ssfsd
2  a1  14:00:05  customer      ddg
3  a1  14:06:08       D01     sdfg
4  a1  14:08:54  customer     sdfg
5  a1  15:58:48       D02  jfghdfg
6  a1  16:18:41  customer     urtr
7  a1  16:51:38       D02    erweg
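
The same steps can be run end to end as a self-contained sketch. The way the sample frames are built here is an assumption (times stored as plain strings, matching the tables in the question), and the intermediate names lookup, mapped, filled, is_agent and result are only illustrative:

```python
import pandas as pd

# Sample data rebuilt by hand from the question (times kept as strings: an assumption).
df1 = pd.DataFrame({
    "id": ["a1"] * 8,
    "time": ["06:15:19", "13:57:50", "14:00:05", "14:06:08",
             "14:08:54", "15:58:48", "16:18:41", "16:51:38"],
    "author": ["system", "Agent(Human)", "customer", "Agent(Human)",
               "customer", "Agent(Human)", "customer", "Agent(Human)"],
    "text": ["aaaaa", "ssfsd", "ddg", "sdfg", "sdfg", "jfghdfg", "urtr", "erweg"],
})
df2 = pd.DataFrame({
    "id": ["a1", "a1"],
    "agent_id": ["D01", "D02"],
    "agent_time": ["13:57:50", "15:58:48"],
})

# Step 1: exact-match lookup. Times that are absent from df2 become NaN,
# which is why .map() alone leaves most rows empty.
lookup = df2.set_index("agent_time")["agent_id"]
mapped = df1["time"].map(lookup)

# Step 2: carry the last seen agent_id forward, separately for each id.
filled = mapped.groupby(df1["id"]).ffill()

# Step 3: keep the filled value only on Agent(Human) rows,
# and fall back to the original author everywhere else.
is_agent = df1["author"].eq("Agent(Human)")
result = df1.assign(author=filled.where(is_agent, df1["author"]))

print(result)
```

One caveat, as far as I can tell: the lookup is keyed on the time alone, so it relies on every agent_time being unique in df2. If the same time could occur under more than one id, the index built by set_index would contain duplicates and the map step would likely fail, so matching on both id and time (for example with a merge) may be the safer route in that case.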

Solution 1
1 Accepted 2021-02-05 07:31:46
