
How to remove a string with specific pattern without using REGEX?

I am having a difficult time with these sets of inputs and outputs:

input: so sh [/] she had a [^ wheee] .
output: so sh [/] she had a .

input: aah [!] [^ makes sound effects] .
output: aah.

input: and she say (.) I got it [^ repeats 2 times] .
output: and she say (.) I got it .

input: oh no[x 3] .
output: oh  no.


input: xxx [^ /bosolasafiso/]
output: xxx

input: hi [* med]
output: hi [* med]

I have tried regular expressions without success. I need exact conditions that satisfy all of these cases and return the resulting output.

All the inputs are read from a file, so please note that if I use "split()", a fragment like [^ wheee] will be treated as two separate words.

I need a condition under which only fragments that start with [/ or [* are retained. Any other fragment that starts with "[" should be replaced with an empty string.

The following solution works, assuming that there are no curly braces in your original text. Otherwise, use some other pair of delimiters (e.g., << and >>).

s1 = 'so sh [/] [* med] she had a [^ wheee] .' 

First, replace the [ and ] of each [/ X] or [* X] fragment with { and }, respectively, to protect them from removal. Then remove all surviving fragments in square brackets. Finally, convert the curly braces back to square brackets:

import re

result = (
    re.sub(r"\[[^]]*]", "",  # Remove the remaining [Y] blocks
           re.sub(r"\[([/*][^]]*)]", r"{\1}", s1))  # Protect [/ X] and [* X] as {X}
    .replace("{", "[")  # Restore the original brackets
    .replace("}", "]")
)
# result == 'so sh [/] [* med] she had a  .'
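Since the question asks for an approach without regular expressions, here is a minimal sketch of a plain character scan. The function name strip_bracket_tags and its keep_prefixes parameter are made up for illustration. It assumes that brackets are never nested and that a fragment is kept only when the character right after "[" is "/" or "*".

```python
def strip_bracket_tags(text, keep_prefixes=("/", "*")):
    """Drop every [...] fragment unless the character after '[' is in keep_prefixes."""
    out = []
    i = 0
    while i < len(text):
        if text[i] == "[":
            end = text.find("]", i)
            if end == -1:
                # Unclosed bracket: keep the rest of the text unchanged.
                out.append(text[i:])
                break
            fragment = text[i:end + 1]
            if fragment[1] in keep_prefixes:
                out.append(fragment)  # Protected fragment such as [/] or [* med]
            i = end + 1  # Continue scanning after the closing bracket
        else:
            out.append(text[i])
            i += 1
    return "".join(out)


line = "so sh [/] she had a [^ wheee] ."
cleaned = strip_bracket_tags(line)
# Optional: collapse the double space left behind by a removed fragment.
normalized = " ".join(cleaned.split())
print(normalized)
```

The scan itself leaves the surrounding whitespace untouched, so a removed fragment can leave two adjacent spaces. The sample outputs in the question are not fully consistent about spacing, so the whitespace normalization is kept as a separate, optional step.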


 