
pandas: replace values in a column based on a condition in another dataframe if that value is in the second dataframe

I have two dataframes as follows:

import pandas as pd
df = pd.DataFrame({'text':['I go to school','open the green door', 'go out and play'],
               'pos':[['PRON','VERB','ADP','NOUN'],['VERB','DET','ADJ','NOUN'],['VERB','ADP','CCONJ','VERB']]})

df2 = pd.DataFrame({'verbs':['go','open','close','share','divide'],
                   'new_verbs':['went','opened','closed','shared','divided']})

I would like to replace the verbs in df.text with their past forms from df2.new_verbs whenever the verb appears in df2.verbs. So far I have done the following:

df['text'] = df['text'].str.split()
new_df = df.apply(pd.Series.explode)
new_df = new_df.assign(new=lambda d: d['pos'].mask(d['pos'] == 'VERB', d['text']))
new_df.text[new_df.new.isin(df2.verbs)] = df2.new_verbs

But when I print the result, not all verbs are replaced correctly. My desired output would be:

       text    pos    new
0       I   PRON   PRON
0    went   VERB     go
0      to    ADP    ADP
0  school   NOUN   NOUN
1  opened   VERB   open
1     the    DET    DET
1   green    ADJ    ADJ
1    door   NOUN   NOUN
2    went   VERB     go
2     out    ADP    ADP
2     and  CCONJ  CCONJ
2    play   VERB   play

You can use a regex for that:

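import re
# Alternation of all escaped verbs, e.g. 'go|open|close|share|divide'
regex = '|'.join(map(re.escape, df2['verbs']))
# Lookup Series: verb -> past form
s = df2.set_index('verbs')['new_verbs']

# The callable receives a match object and must return a string
df['text'] = df['text'].str.replace(regex, lambda m: s.get(m.group(), m.group()),
                                    regex=True)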

Output (shown here as a separate column, text2, for clarity):

                  text                       pos                  text2
0       I go to school   [PRON, VERB, ADP, NOUN]       I went to school
1  open the green door    [VERB, DET, ADJ, NOUN]  opened the green door
2      go out and play  [VERB, ADP, CCONJ, VERB]      went out and play
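
One caveat with a plain alternation such as go|open|... is that it also matches inside longer words. Below is a minimal sketch of a whole-word variant using \b word boundaries. The sample sentences in texts and the names pattern, lookup, naive and result are made up for illustration and are not part of the original answer.

```python
import re
import pandas as pd

df2 = pd.DataFrame({
    'verbs': ['go', 'open', 'close', 'share', 'divide'],
    'new_verbs': ['went', 'opened', 'closed', 'shared', 'divided'],
})

# Hypothetical sample text; 'good' and 'goal' both contain the substring 'go'
texts = pd.Series(['I go to school', 'a good goal', 'open the door'])

alternation = '|'.join(map(re.escape, df2['verbs']))
lookup = df2.set_index('verbs')['new_verbs']

def to_past(match):
    # Return the past form, or the matched text unchanged if it is not in the lookup
    return lookup.get(match.group(), match.group())

# Without boundaries, 'go' is also replaced inside 'good' and 'goal'
naive = texts.str.replace(alternation, to_past, regex=True)

# With \b on both sides, only whole words are replaced
pattern = r'\b(?:' + alternation + r')\b'
result = texts.str.replace(pattern, to_past, regex=True)

print(naive.tolist())
print(result.tolist())
```

Whether whole-word matching is what you want depends on your data; it still replaces a matching word regardless of its POS tag.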

For smaller lists, you can use pandas replace with a dictionary like this:

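# Map each verb to its past form
verbs_map = dict(zip(df2.verbs, df2.new_verbs))
# replace returns a new Series, so assign it back
new_df['text'] = new_df['text'].replace(verbs_map)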

Basically, dict(zip(df2.verbs, df2.new_verbs)) creates a dictionary that maps each verb to its past-tense form, e.g. {'go': 'went', 'close': 'closed', ...}.
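
Putting the pieces together, here is one possible sketch that builds the exploded frame from the question and replaces only tokens tagged VERB. The attempt in the question probably misfires because assigning df2.new_verbs aligns on index labels rather than on the verb values; mapping by value avoids that. The name is_verb is introduced here for illustration, and passing a list of columns to explode assumes pandas 1.3 or later.

```python
import pandas as pd

df = pd.DataFrame({
    'text': ['I go to school', 'open the green door', 'go out and play'],
    'pos': [['PRON', 'VERB', 'ADP', 'NOUN'],
            ['VERB', 'DET', 'ADJ', 'NOUN'],
            ['VERB', 'ADP', 'CCONJ', 'VERB']],
})
df2 = pd.DataFrame({
    'verbs': ['go', 'open', 'close', 'share', 'divide'],
    'new_verbs': ['went', 'opened', 'closed', 'shared', 'divided'],
})

# One row per token: split each sentence, then explode both list columns together
df['text'] = df['text'].str.split()
new_df = df.explode(['text', 'pos'])

verbs_map = dict(zip(df2['verbs'], df2['new_verbs']))
is_verb = new_df['pos'] == 'VERB'

# 'new' keeps the original token for VERB rows and the POS tag otherwise
new_df['new'] = new_df['pos'].mask(is_verb, new_df['text'])

# Replace by value, only on VERB rows; verbs missing from the map (e.g. 'play') stay as-is
new_df['text'] = new_df['text'].mask(is_verb, new_df['text'].replace(verbs_map))

print(new_df[['text', 'pos', 'new']])
```

Because the lookup is by value, the order and index of df2 no longer matter.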


