
How to update a data frame based on a condition in another data frame in pandas

I have two data frames, and I want to update one column of df_source based on a condition involving both data frames:

df_source = pd.DataFrame({'Sentiment': ['neg', 'neg', 'pos'], 'text': ['hello ', '12where', 'here [null]'], 'pred': ['neu', 'neg', 'pos']})

df2 = pd.DataFrame({'Sentiment': ['pos', 'neg', 'pos', 'neu'], 'text': ['hello ', '12 where I', 'hello g* ', 'here [null]'], 'pred': ['neu', 'neg', 'neu', 'neu']})

I want to update the Sentiment column in df_source based on this condition: if the text in both data frames is exactly the same and the pred column is also the same, replace the sentiment in df_source with the sentiment in df2.

So the output would look like this (only one sample, "hello ", meets both conditions):

Sentiment    text           pred
pos          hello          neu
neg          12where        neg
pos          here [null]    pos

What I have done:

df_source['Sentiment'] = df2.where((df2['text'] == df_source['text']) & (df2['pred'] == df_source['pred']), df2['Sentiment'])

I expected this to work, but it raises an error (ValueError: Can only compare identically-labeled Series objects).

First, merge on the two columns, using suffixes to tell the two Sentiment columns apart.

df_source = df_source.merge(df2, how='left', on=['text', 'pred'], suffixes=('_x', ''))
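To see what the merge leaves behind, here is a minimal sketch using the sample data from the question. The variable name merged is only an illustration, chosen so that df_source is not overwritten. Rows of df_source that have no exact (text, pred) match in df2 end up with NaN in the new Sentiment column, while the original values are kept in Sentiment_x.

```python
import pandas as pd

df_source = pd.DataFrame({
    'Sentiment': ['neg', 'neg', 'pos'],
    'text': ['hello ', '12where', 'here [null]'],
    'pred': ['neu', 'neg', 'pos'],
})
df2 = pd.DataFrame({
    'Sentiment': ['pos', 'neg', 'pos', 'neu'],
    'text': ['hello ', '12 where I', 'hello g* ', 'here [null]'],
    'pred': ['neu', 'neg', 'neu', 'neu'],
})

# Left merge keeps every row of df_source; the empty right-hand suffix means
# df2's column keeps the name 'Sentiment' and df_source's becomes 'Sentiment_x'.
merged = df_source.merge(df2, how='left', on=['text', 'pred'], suffixes=('_x', ''))

print(list(merged.columns))                 # ['Sentiment_x', 'text', 'pred', 'Sentiment']
print(merged['Sentiment'].isna().tolist())  # [False, True, True]
```

Only the first row ("hello ", "neu") finds a partner in df2, so it is the only one with a non-null Sentiment after the merge.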

Then use combine_first to fill the NaNs left where there was no match, and drop the extra merge column:

df_source = df_source.assign(Sentiment=df_source['Sentiment'].combine_first(df_source['Sentiment_x'])).drop(columns='Sentiment_x')

 

          text pred Sentiment
0       hello   neu       pos
1      12where  neg       neg
2  here [null]  pos       pos
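For reference, the two steps can be put together into one runnable sketch. The helper name update_sentiment and its parameters are hypothetical, not part of pandas. The drop_duplicates call is an added assumption: it keeps the first row per key combination so that repeated (text, pred) pairs in df2 cannot multiply rows in the result.

```python
import pandas as pd

def update_sentiment(source, other, keys=('text', 'pred'), col='Sentiment'):
    """Return a copy of `source` where `col` is taken from `other` wherever
    all `keys` match exactly; unmatched rows keep their original value."""
    keys = list(keys)
    # Assumption: one row per key combination is enough, so duplicates are dropped.
    lookup = other[keys + [col]].drop_duplicates(subset=keys)
    merged = source.merge(lookup, how='left', on=keys, suffixes=('_x', ''))
    # Fall back to the original value where the merge found no match (NaN).
    merged[col] = merged[col].combine_first(merged[col + '_x'])
    return merged.drop(columns=col + '_x')

df_source = pd.DataFrame({
    'Sentiment': ['neg', 'neg', 'pos'],
    'text': ['hello ', '12where', 'here [null]'],
    'pred': ['neu', 'neg', 'pos'],
})
df2 = pd.DataFrame({
    'Sentiment': ['pos', 'neg', 'pos', 'neu'],
    'text': ['hello ', '12 where I', 'hello g* ', 'here [null]'],
    'pred': ['neu', 'neg', 'neu', 'neu'],
})

result = update_sentiment(df_source, df2)
print(result['Sentiment'].tolist())  # ['pos', 'neg', 'pos']
```

Note that the merge moves Sentiment to the last column, as in the output above; reorder with result[['Sentiment', 'text', 'pred']] if the original column order matters.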


