How to update a dataframe based on conditions in another dataframe in pandas

Question

I have two dataframes, and I want to update one column of df_source based on conditions that involve both dataframes:

import pandas as pd

df_source = pd.DataFrame({'Sentiment': ['neg', 'neg', 'pos'], 'text': ['hello ', '12where', 'here [null]'], 'pred': ['neu', 'neg', 'pos']})

df2 = pd.DataFrame({'Sentiment': ['pos', 'neg', 'pos', 'neu'], 'text': ['hello ', '12 where I', 'hello g* ', 'here [null]'], 'pred': ['neu', 'neg', 'neu', 'neu']})

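I want to update the Sentiment column of df_source under this condition: if text is exactly the same in both dataframes and pred is also the same, replace the Sentiment in df_source with the Sentiment from df2.

So the output would look like this (only one sample, "hello", satisfies both conditions):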

Sentiment    text          pred
pos          hello         neu
neg          12where       neg
pos          here [null]   pos

What I tried:

df_source['Sentiment'] = df2.where(((df2['text'] == df_source['text']) & (df2['pred'] == df_source['pred'])), df2['Sentiment'])

I expected this to work, but it raises an error (ValueError: Can only compare identically-labeled Series objects).
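For context, pandas raises this error when two Series whose index labels differ are compared element-wise with ==. Here df_source has 3 rows and df2 has 4, so their columns cannot be lined up. A minimal sketch reproducing it, where the names a, b, and message are made up for illustration:

```python
import pandas as pd

# Hypothetical stand-ins for df_source['text'] and df2['text']
a = pd.Series(['hello ', '12where', 'here [null]'])                   # index 0..2
b = pd.Series(['hello ', '12 where I', 'hello g* ', 'here [null]'])   # index 0..3

message = None
try:
    a == b  # element-wise comparison requires identical index labels
except ValueError as exc:
    message = str(exc)

print(message)
```

Because the row-by-row comparison is not defined for frames of different shapes, matching on key columns (as with a merge) is the more natural fit for this task.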

Answer 1

First, merge on the two columns and specify suffixes.

df_source = df_source.merge(df2, how='left', on=['text', 'pred'], suffixes=('_x', ''))

Use combine_first to fill the NaN values left by unmatched rows, then drop the extra merged column.

df_source = df_source.assign(Sentiment=df_source['Sentiment'].combine_first(df_source['Sentiment_x'])).drop(columns='Sentiment_x')

 

          text pred Sentiment
0       hello   neu       pos
1      12where  neg       neg
2  here [null]  pos       pos
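
The two steps above can be put together into one runnable sketch. It reuses the sample data from the question; the intermediate names merged and result are introduced here only for illustration, and the drop call uses the columns= keyword, which is an assumption about running on a recent pandas version.

```python
import pandas as pd

df_source = pd.DataFrame({
    'Sentiment': ['neg', 'neg', 'pos'],
    'text': ['hello ', '12where', 'here [null]'],
    'pred': ['neu', 'neg', 'pos'],
})
df2 = pd.DataFrame({
    'Sentiment': ['pos', 'neg', 'pos', 'neu'],
    'text': ['hello ', '12 where I', 'hello g* ', 'here [null]'],
    'pred': ['neu', 'neg', 'neu', 'neu'],
})

# Left merge keeps every row of df_source. Its own Sentiment column is
# renamed to Sentiment_x; the one coming from df2 keeps the plain name
# and is NaN wherever (text, pred) has no match in df2.
merged = df_source.merge(df2, how='left', on=['text', 'pred'], suffixes=('_x', ''))

# Prefer df2's Sentiment, fall back to the original value when it is NaN,
# then drop the helper column.
result = (
    merged.assign(Sentiment=merged['Sentiment'].combine_first(merged['Sentiment_x']))
          .drop(columns='Sentiment_x')
)

print(result)
```

One caveat: if df2 contains the same (text, pred) pair more than once, a left merge returns one row per match, so df_source can grow. Deduplicating df2 on those two columns first, for example with drop_duplicates(subset=['text', 'pred']), avoids that.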

Answer 1: 2, accepted, 2021-10-13 22:52:23