
How do I remove special characters from the end of every word in a string?

I want it to match only at the end of every word.

Example:

"i am test-ing., i am test.ing-, i am_, test_ing," 

The output should be:

"i am test-ing i am test.ing i am test_ing"

>>> import re
>>> test = "i am test-ing., i am test.ing-, i am_, test_ing,"
>>> re.sub(r'([^\w\s]|_)+(?=\s|$)', '', test)
'i am test-ing i am test.ing i am test_ing'

This matches one or more punctuation characters or underscores ( ([^\w\s]|_)+ ) that are followed by either whitespace ( \s ) or the end of the string ( $ ). The (?= ) construct is a lookahead assertion: it checks that whitespace follows without including it in the match, so the whitespace doesn't get replaced; only the ([^\w\s]|_)+ part gets replaced.
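To see what the lookahead buys you, here is a small sketch that compares the pattern above with a variant that consumes the whitespace instead of only asserting it. The variable names are just for illustration.

```python
import re

test = "i am test-ing., i am test.ing-, i am_, test_ing,"

# With the lookahead: the whitespace is only checked, never consumed,
# so the spaces between words survive the substitution.
with_lookahead = re.sub(r'([^\w\s]|_)+(?=\s|$)', '', test)

# Without the lookahead: the trailing whitespace becomes part of the match
# and is deleted together with the punctuation, gluing words together.
without_lookahead = re.sub(r'([^\w\s]|_)+(\s|$)', '', test)

print(with_lookahead)     # i am test-ing i am test.ing i am test_ing
print(without_lookahead)  # i am test-ingi am test.ingi amtest_ing
```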

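Okay, but why [^\w\s]|_ , you ask? The first part, [^\w\s] , matches any character that is neither a word character ( \w : letters, digits and the underscore) nor whitespace ( \s ), i.e. punctuation characters. Because \w counts the underscore as a word character, that class alone would leave underscores in place. We do want to eliminate them, so we include them explicitly with |_ .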
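A quick way to check the underscore point is to run the pattern with and without the |_ alternative. This is only a sketch: strip_trailing_punct is a hypothetical helper name, and it uses a non-capturing group (?: ) in place of the capturing one, which doesn't change what gets matched.

```python
import re

def strip_trailing_punct(text):
    """Remove punctuation and underscores at the end of each word (hypothetical helper)."""
    return re.sub(r'(?:[^\w\s]|_)+(?=\s|$)', '', text)

sample = "i am_, test_ing,"

# Without |_ the underscore is treated as a word character, so only the
# commas are removed and the trailing underscore in "am_" is left behind.
without_underscore = re.sub(r'[^\w\s]+(?=\s|$)', '', sample)

# With |_ the trailing underscore goes too; the one inside "test_ing" stays
# because it is not followed by whitespace or the end of the string.
with_underscore = strip_trailing_punct(sample)

print(without_underscore)  # i am_ test_ing
print(with_underscore)     # i am test_ing
```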
