
How to replace a specific last word in a DataFrame column

I am trying to replace the last word of each value in a DataFrame column when it matches specific text. Below is my code:

import pandas as pd

lst = ['Main Close', 'Jon cl', 'Boon lose', 'Saint Cls', 'Brook CL','Smith clo', 'Petes Cl', 'Klein Cl.', 'Chuks Close']
df = pd.DataFrame(lst, columns = ['address'])

replace_values = {'Cl$' : 'Close', 'lose$' : 'Close', 'close$' : 'Close', 'cl$' : 'Close', 'Cl.$' : 'Close', 'CL$' : 'Close', 'clo$' : 'Close', 'CI$' : 'Close'}

for key, value in replace_values.items():
    df.address = df['address'].str.replace(key, value)

I used a dictionary to store the search values and their replacements. The problem is that the patterns do not match only the intended text, i.e.:

Main Close is modified to Main CClose, but it should be left unchanged.

Petes Cl is modified to Petes CClose, but it should be Petes Close.

What could I be missing? I have tried many solutions from other questions but could not figure it out.

Try a regular expression along with apply on the column:

>>> import re
>>> # Whole address: first word, whitespace, then a known variant of "Close".
>>> p = re.compile(r'^(\S+)\s+(cl|cl\.|Cl|Cl\.|CL|Cls|clo|lose|Close)$')
>>> def f1(s):
...     return p.sub(r'\1 Close', s)
...
>>> df['address'] = df['address'].apply(f1)

>>> print(df)
       address
0   Main Close
1    Jon Close
2   Boon Close
3  Saint Close
4  Brook Close
5  Smith Close
6  Petes Close
7  Klein Close
8  Chuks Close
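
As for why the original loop produces CClose: when the keys are treated as regular expressions, each pattern is applied in turn to the result of the previous one, and lose$ also matches the tail of an existing Close, so Main Close becomes Main CClose, and Petes Cl first becomes Petes Close and then Petes CClose. A minimal sketch of a vectorized alternative is below. It assumes the list of variants shown here is the complete set you want to normalize, and it requires whitespace before the last word so that only a whole trailing word is replaced. The name pattern is just illustrative. Note that regex=True is passed explicitly, since newer pandas versions treat the pattern as a literal string by default.

```python
import pandas as pd

lst = ['Main Close', 'Jon cl', 'Boon lose', 'Saint Cls', 'Brook CL',
       'Smith clo', 'Petes Cl', 'Klein Cl.', 'Chuks Close']
df = pd.DataFrame(lst, columns=['address'])

# A single anchored pattern: whitespace followed by a whole last word that is
# one of the known variants (assumed list). The dot in "Cl." / "cl." is escaped
# and optional, so it only matches a literal period.
pattern = r'\s+(?:Close|close|lose|clo|Cls|CL|Cl\.?|cl\.?|CI)$'

# One pass over the column, so a replacement is never re-matched by a later pattern.
df['address'] = df['address'].str.replace(pattern, ' Close', regex=True)

print(df)
```

Because the pattern only anchors on the last word, this sketch should also handle addresses with more than two words, which the ^(\S+)\s+ pattern above does not match.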
