I am having a difficult time with these sets of inputs and outputs:
input: so sh [/] she had a [^ wheee] .
output: so sh [/] she had a .
input: aah [!] [^ makes sound effects] .
output: aah.
input: and she say (.) I got it [^ repeats 2 times] .
output: and she say (.) I got it .
input: oh no[x 3] .
output: oh no.
input: xxx [^ /bosolasafiso/]
output: xxx
input: hi [* med]
output: hi [* med]
I have tried regular expressions without success. I need a rule that satisfies all of these cases and returns the resulting output.
All the inputs are read from a file, so please note that if I use split(), a token such as [^ whee] will be treated as two separate words.
I need a condition under which only bracketed tokens that begin with [/ or [* are retained. Any other token that starts with "[" should be replaced with an empty string.
The following solution works, assuming that there are no curly braces in your original text. Otherwise, use some other pair of delimiters (e.g., << and >>).
s1 = 'so sh [/] [* med] she had a [^ wheee] .'
First, replace [ and ] in each [/ X] or [* X] fragment with { and }, respectively, to protect them from elimination. Then eliminate all surviving fragments in square brackets. Finally, replace all curly braces back to square brackets:
import re
result = (re.sub(r"\[[^]]*]", "",                        # Remove the remaining [Y] blocks
                 re.sub(r"\[([/*][^]]*)]", r"{\1}", s1))  # Rename [X] to {X}
          .replace("{", "[")                             # Restore the original brackets
          .replace("}", "]"))
print(result)
# so sh [/] [* med] she had a  .   (the removed block leaves a doubled space)
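As an alternative sketch, the same filtering can be done in a single pass with a negative lookahead, which avoids the temporary curly braces. The function name strip_annotations is hypothetical. Swallowing the whitespace in front of each removed block is an assumption about the desired spacing, and it does not produce the "aah." and "oh no." forms from the question, which drop the space before the period.

```python
import re

# Match optional leading whitespace, then a [...] block whose first
# character after "[" is neither "/" nor "*".
_UNWANTED = re.compile(r"\s*\[(?![/*])[^\]]*\]")

def strip_annotations(line: str) -> str:
    """Remove bracketed blocks except those starting with [/ or [*."""
    return _UNWANTED.sub("", line)

for sample in [
    "so sh [/] she had a [^ wheee] .",
    "and she say (.) I got it [^ repeats 2 times] .",
    "xxx [^ /bosolasafiso/]",
    "hi [* med]",
]:
    print(repr(strip_annotations(sample)))
```

Since the inputs come from a file, the function can be applied to each line as it is read, for example with a loop over the open file object.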