How to update a data frame based on conditions in another data frame in pandas

Question

I have two data frames, and I want to update one column of df_source based on a condition that involves both data frames:

import pandas as pd

df_source = pd.DataFrame({'Sentiment': ['neg', 'neg', 'pos'], 'text': ['hello ', '12where', 'here [null]'], 'pred': ['neu', 'neg', 'pos']})

df2 = pd.DataFrame({'Sentiment': ['pos', 'neg', 'pos', 'neu'], 'text': ['hello ', '12 where I', 'hello g* ', 'here [null]'], 'pred': ['neu', 'neg', 'neu', 'neu']})

I want to update the Sentiment column in df_source based on this condition: if the text in both data frames is exactly the same and the pred column is also the same, replace the Sentiment in df_source with the Sentiment from df2.

The output should look like this, since only one row ("hello ") meets both conditions:

Sentiment    text          pred
pos          hello         neu
neg          12where       neg
pos          here [null]   pos

What I have tried:

df_source['Sentiment'] = df2.where(((df2['text'] == df_source['text']) & (df2['pred'] == df_source['pred'])), df2['Sentiment'])

I expected this to work, but it raises an error (ValueError: Can only compare identically-labeled Series objects).

Answer 1

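First, left-merge on the two key columns (text and pred), using suffixes so that the original Sentiment column of df_source is renamed to Sentiment_x and the Sentiment column from df2 keeps its name.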

df_source = df_source.merge(df2, how='left', on=['text', 'pred'], suffixes=('_x', ''))
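
To see what the merge step produces before any NaNs are filled, here is a minimal sketch that rebuilds the two frames from the question and inspects the intermediate result. The variable name merged is only for illustration and is not part of the original answer.

```python
import pandas as pd

df_source = pd.DataFrame({
    'Sentiment': ['neg', 'neg', 'pos'],
    'text': ['hello ', '12where', 'here [null]'],
    'pred': ['neu', 'neg', 'pos'],
})
df2 = pd.DataFrame({
    'Sentiment': ['pos', 'neg', 'pos', 'neu'],
    'text': ['hello ', '12 where I', 'hello g* ', 'here [null]'],
    'pred': ['neu', 'neg', 'neu', 'neu'],
})

# Left merge keeps every row of df_source. The overlapping Sentiment column
# from df_source becomes Sentiment_x; the one from df2 keeps its name.
merged = df_source.merge(df2, how='left', on=['text', 'pred'], suffixes=('_x', ''))

# Rows with no (text, pred) match in df2 have NaN in Sentiment.
print(merged)
```

Only the "hello " row finds a partner in df2, so the other two rows carry NaN in the new Sentiment column until the next step fills them.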

Then use combine_first to fill the NaNs (rows with no match in df2) with the original values from Sentiment_x, and drop the extra merge column:

df_source = df_source.assign(Sentiment=df_source['Sentiment'].combine_first(df_source['Sentiment_x'])).drop(columns='Sentiment_x')

 

          text pred Sentiment
0       hello   neu       pos
1      12where  neg       neg
2  here [null]  pos       pos
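
The two steps above can be combined into one self-contained sketch. The helper name update_sentiment and its keys parameter are hypothetical, not from the original answer. It also assumes that dropping duplicate (text, pred) pairs from df2 is acceptable, because a left merge against duplicated keys would otherwise add rows to the result.

```python
import pandas as pd


def update_sentiment(source, other, keys=('text', 'pred')):
    """Hypothetical helper: overwrite source['Sentiment'] where keys match in other."""
    keys = list(keys)
    # Assumption: keep only the first row of `other` per key combination,
    # so the left merge cannot duplicate rows of `source`.
    lookup = other.drop_duplicates(subset=keys)[keys + ['Sentiment']]
    merged = source.merge(lookup, how='left', on=keys, suffixes=('_x', ''))
    # Fall back to the original sentiment where no match was found.
    merged['Sentiment'] = merged['Sentiment'].combine_first(merged['Sentiment_x'])
    # Drop the helper column and restore the original column order.
    return merged.drop(columns='Sentiment_x')[list(source.columns)]


df_source = pd.DataFrame({
    'Sentiment': ['neg', 'neg', 'pos'],
    'text': ['hello ', '12where', 'here [null]'],
    'pred': ['neu', 'neg', 'pos'],
})
df2 = pd.DataFrame({
    'Sentiment': ['pos', 'neg', 'pos', 'neu'],
    'text': ['hello ', '12 where I', 'hello g* ', 'here [null]'],
    'pred': ['neu', 'neg', 'neu', 'neu'],
})

result = update_sentiment(df_source, df2)
print(result)
```

Unlike the element-wise comparison in the question, the merge matches rows by the values in text and pred rather than by index labels, so it does not matter that the two frames have different lengths.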


Answer 1
2, accepted, 2021-10-13 22:52:23
