How to use re.sub() to leave only letters a-z, A-Z, numbers 0-9 and spaces but not divide numbers?
message = 'Hello(/ how{can} wan\';t //opperate+32.5 u&# kj|'
I need to keep only letters a-z, A-Z, digits 0-9 and spaces, so the result should be 'Hello how can wan t opperate 325 u kj'
but when I use re.sub('[^\w\d]+', ' ', message)
or re.sub('[^A-Za-z0-9]+', ' ', message)
I get 'Hello how can wan t opperate 32 5 u kj'
How can I get 325 as a number?
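For reference, the behaviour described in the question can be reproduced with a short script. This is only a minimal sketch; the variable name `result` is introduced here for illustration.

```python
import re

message = 'Hello(/ how{can} wan\';t //opperate+32.5 u&# kj|'

# Every run of characters outside A-Z, a-z and 0-9 becomes one space,
# so the "." between "32" and "5" is replaced like any other symbol.
result = re.sub(r'[^A-Za-z0-9]+', ' ', message)
print(result.strip())  # Hello how can wan t opperate 32 5 u kj
```

The number is split because the character class has no notion of a separator that sits between two digit runs.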
You can use
re.sub(r'(\d+(?:[,.]\d+)+)|[\W_]+', lambda x: x.group(1) if x.group(1) else ' ', message).strip()
See the Python demo online.
Details:
- (\d+(?:[,.]\d+)+) - Capturing group 1: one or more digits followed by one or more occurrences of a . or , and one or more digits
- | - or
- [\W_]+ - one or more non-alphanumeric characters.
If Group 1 matches, the replacement is the Group 1 value; otherwise, the replacement is a space. A match at the start or end of the string may leave a stray space, hence the call to strip().
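The replacement logic described above can be sketched as a small runnable script. The names `PATTERN`, `clean_keep_numbers` and `clean_join_digits` are hypothetical and not part of the original answer. Note that the pattern as given keeps a matched number unchanged (32.5 stays 32.5). The second function is an assumed variant that also drops the separators, in case the literal 325 from the question is what is wanted.

```python
import re

# Group 1 captures a number with "." or "," separators; the second
# alternative matches any run of non-word characters or underscores.
PATTERN = re.compile(r'(\d+(?:[,.]\d+)+)|[\W_]+')

def clean_keep_numbers(text):
    # Keep a matched number unchanged, replace everything else with a space.
    return PATTERN.sub(lambda m: m.group(1) if m.group(1) else ' ', text).strip()

def clean_join_digits(text):
    # Assumed variant: remove the separators inside a matched number,
    # so that "32.5" becomes "325".
    def repl(m):
        if m.group(1):
            return re.sub(r'[,.]', '', m.group(1))
        return ' '
    return PATTERN.sub(repl, text).strip()

message = 'Hello(/ how{can} wan\';t //opperate+32.5 u&# kj|'
print(clean_keep_numbers(message))  # Hello how can wan t opperate 32.5 u kj
print(clean_join_digits(message))   # Hello how can wan t opperate 325 u kj
```

In Python 3, \W is Unicode-aware for str patterns, so accented letters count as word characters and are kept; passing re.ASCII to re.compile restricts the match to a-z, A-Z, 0-9 and the underscore.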