
Regex Search and Replace in Python

I am looking to do a conditional regex search and replace.

If a carriage return (\r) is followed by an uppercase letter, I want to replace it with a space (' '). If it is followed by anything else, such as a lowercase letter, I just want to remove it. Is there a way to do that with a regex in Python?

Sample Input:

BCP-\rEngin\reerin\rg\rSyste\rms\rSupp\rort

Output:

BCP- Engineering Systems Support

The data is in a pandas DataFrame. I am currently using df.replace() to replace every "\r" with a space (" "), but I would like the replacement to be conditional.

Below is my code:

df_replace = df.replace(to_replace=r"\r", value=" ", regex=True)

I am not familiar with Python, but the regexes you need are as follows (perhaps someone with Python experience can edit this to customize the code):

This matches every \r that is not followed by an uppercase letter (a negative lookahead), so replace it with an empty string:

\r(?![A-Z])

After that, this matches every remaining \r that is not followed by a lowercase letter, so replace it with a space:

\r(?![a-z])
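To see what the two negative lookaheads actually pick out, here is a small sketch using only the standard `re` module. The sample string is the one from the question; the variable names are made up for illustration. It lists the character that follows each matched \r.

```python
import re

sample = "BCP-\rEngin\reerin\rg\rSyste\rms\rSupp\rort"

# (?![A-Z]) is a negative lookahead: \r matches only when the next
# character is NOT an uppercase letter. The lookahead consumes nothing.
after_first = [sample[m.end()] for m in re.finditer(r"\r(?![A-Z])", sample)]

# (?![a-z]) matches \r only when the next character is NOT a lowercase letter.
after_second = [sample[m.end()] for m in re.finditer(r"\r(?![a-z])", sample)]

print(after_first)   # ['e', 'g', 'm', 'o']  -> these \r get removed
print(after_second)  # ['E', 'S', 'S']       -> these \r become spaces
```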

EDIT

Okay, here's one solution in Python I was able to put together for you:

import re

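myString = "BCP-\rEngin\reerin\rg\rSyste\rms\rSupp\rort"

# Drop every \r that is not followed by an uppercase letter
myString = re.sub(r"\r(?![A-Z])", "", myString)
# The remaining \r are the word breaks, so a plain string replace is enough
myString = myString.replace("\r", " ")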
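If you would rather do it in a single pass, `re.sub` also accepts a function as the replacement, so each match can decide what it turns into. This is an alternative sketch rather than the approach in the answer above, and `join_wrapped` is a hypothetical helper name.

```python
import re

def join_wrapped(text):
    """Turn \\r before an uppercase letter into a space; drop every other \\r."""
    def pick(match):
        # Group 1 holds the uppercase letter right after the \r, or None.
        upper = match.group(1)
        return " " + upper if upper else ""
    # The optional group lets one pattern cover both cases.
    return re.sub(r"\r([A-Z])?", pick, text)

result = join_wrapped("BCP-\rEngin\reerin\rg\rSyste\rms\rSupp\rort")
print(result)  # BCP- Engineering Systems Support
```

The callback keeps the rule in one place, which may be easier to extend if more cases (digits, punctuation) turn up later.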

I was able to get a solution for this:

df_replace2 = df.replace(to_replace=r"(\r)(?![A-Z])", value="", regex=True)
df_replace3 = df_replace2.replace(to_replace=r"(\r)(?![a-z])", value=" ", regex=True)
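For anyone who wants to try this end to end, here is a minimal sketch on a one-column DataFrame. The frame and the column name `team` are invented for the example. It is also a slight variation on the code above: it uses a positive lookahead `(?=[A-Z])` for the space case and then removes whatever \r is left, so a \r before a digit or punctuation gets dropped rather than turned into a space.

```python
import pandas as pd

# Hypothetical one-column frame holding the sample string from the question
df = pd.DataFrame({"team": ["BCP-\rEngin\reerin\rg\rSyste\rms\rSupp\rort"]})

# Step 1: a \r directly before an uppercase letter becomes a space
step1 = df.replace(to_replace=r"\r(?=[A-Z])", value=" ", regex=True)
# Step 2: every remaining \r is removed
cleaned = step1.replace(to_replace=r"\r", value="", regex=True)

print(cleaned.loc[0, "team"])  # BCP- Engineering Systems Support
```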

Thanks @Brigadeiro for guiding me to the solution.
