
Cleaning string by removing symbols

I have to clean a string by removing the special symbols /#$%^&*@0123456789, but only where a run of them is surrounded on both sides by letters or by other characters that are not in the list. Examples:

H8e%&l6&%l@8095o a@/9^65$n228d w%e60$$#&9l3@/c6o5m3e --> Hello and welcome
I1^/0^^@9t #$%% i/@4#s 11P17/9$M 5^&* a^$45$5$0n&##^4d 6^&&* I $%^$%^ a8@@94%3*m t3120i36&^1r2&^##0e&^d --> It #$%% is 11PM 5^&* and 6^&&* I $%^$%^ am tired
,. a3%2%1/3$s*0. d8^! --> ,. as. d!
##%12Symbols on the left must remain untouched --> ##%12Symbols on the left must remain untouched

I figured out that this can be done with re.sub:

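import re

def _correct_message(message):
    # Raw string avoids invalid-escape warnings; inside [...] these symbols need no backslash.
    # Remove a run of listed symbols only when a letter, "." or "!" sits on both sides.
    pattern = r"(?<=[a-zA-Z.!])[/#$%^&*@0-9]+(?=[a-zA-Z.!])"
    return re.sub(pattern, '', message)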

But I don't like having to add the characters that are not in the list (such as . ! ?) to the lookarounds manually. Is it possible to do this without a regex?
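
One way to avoid listing the allowed neighbours by hand is to negate the removal list inside the lookarounds: a neighbour is then "anything that is neither whitespace nor a listed symbol". The following is only a sketch under that reading of the examples; the names JUNK, PATTERN and correct_message are made up for illustration.

```python
import re

# Characters to remove, written as the inside of a character class.
JUNK = r"/#$%^&*@0-9"

# A run of listed symbols whose immediate neighbours on both sides are
# neither whitespace nor listed symbols. A run at the very start or end
# of the string has no such neighbour, so it is left alone.
PATTERN = re.compile(rf"(?<=[^\s{JUNK}])[{JUNK}]+(?=[^\s{JUNK}])")

def correct_message(message):
    return PATTERN.sub("", message)

if __name__ == "__main__":
    print(correct_message(",. a3%2%1/3$s*0. d8^!"))
```

With this pattern, characters such as . ! ? count as valid neighbours automatically because they are simply not in the list.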

This is the closest I could get:

 (\W+|\d{1,}(?!\d\[A-Za-z]))(?![A-Za-z]{2,})

Replace every match with a space.
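
As for doing it without a regex at all, here is a rough sketch of one possibility. It assumes the rule can be restated per whitespace-separated token: listed symbols at the start or end of a token stay, and listed symbols between the first and last unlisted character are dropped. The names JUNK, clean_token and correct_message_no_regex are hypothetical.

```python
JUNK = set("/#$%^&*@0123456789")

def clean_token(token):
    # Positions of characters that are not in the removal list.
    keep = [i for i, ch in enumerate(token) if ch not in JUNK]
    if not keep:
        return token  # token consists only of listed symbols: leave it alone
    first, last = keep[0], keep[-1]
    # Drop listed symbols strictly between the first and last unlisted character.
    middle = "".join(ch for ch in token[first:last + 1] if ch not in JUNK)
    return token[:first] + middle + token[last + 1:]

def correct_message_no_regex(message):
    out, token = [], []
    for ch in message:
        if ch.isspace():
            # A whitespace character ends the current token; keep it as-is.
            out.append(clean_token("".join(token)))
            out.append(ch)
            token = []
        else:
            token.append(ch)
    out.append(clean_token("".join(token)))
    return "".join(out)

if __name__ == "__main__":
    print(correct_message_no_regex("##%12Symbols on the left must remain untouched"))
```

Walking the string character by character keeps the original whitespace intact, which str.split() followed by " ".join() would not.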
