RegEx to get everything except letters, spaces, ', and -
Is there a simpler regular expression to match anything that is not a letter, hyphen, space, or apostrophe?
This is the regex I was using:
[^\w\s'-]|\d|_|\xa0
It's working; I was just curious whether there is a simpler expression.
[^a-zA-Z-' ]
This matches every character except the letters a-z and A-Z, a hyphen, a space, and an apostrophe.
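As a quick illustration of the character class above, here is a minimal sketch that lists the characters it rejects. The sample string and the names `NOT_ALLOWED`, `sample`, and `rejected` are made up for the example.

```python
import re

# Anything that is not a-z, A-Z, a hyphen, an apostrophe, or a space.
NOT_ALLOWED = re.compile(r"[^a-zA-Z-' ]")

sample = "Mary-Ann O'Neil_42!"  # hypothetical input
rejected = NOT_ALLOWED.findall(sample)
print(rejected)  # ['_', '4', '2', '!']
```

The hyphen directly after the `A-Z` range is read as a literal hyphen here, which is why it needs no escaping.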
\w already includes \d and _. So the simplest regex will be:
[^\w\s\-']
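One thing worth checking before swapping patterns: because \w covers digits and the underscore, a negated class that contains \w does not match them, whereas the pattern in the question does through its extra alternatives. The sketch below compares the two on a made-up sample string; the variable names are only for illustration.

```python
import re

original = re.compile(r"[^\w\s'-]|\d|_|\xa0")  # pattern from the question
shortened = re.compile(r"[^\w\s\-']")          # shortened pattern

sample = "a1_b c-d'e!"  # hypothetical input
orig_hits = original.findall(sample)
short_hits = shortened.findall(sample)

print(orig_hits)   # ['1', '_', '!']
print(short_hits)  # ['!']
```

If digits and underscores should be flagged as well, an explicit letter range such as a-zA-Z may be a better fit than \w.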
The following pattern...
[^a-z- ']
...is simpler and should do what you want when case-insensitive matching is enabled:
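import re

# With re.IGNORECASE, a-z also covers A-Z; the hyphen after the range is literal.
# (The ur'' prefix from Python 2 is a syntax error in Python 3, so a plain raw string is used.)
p = re.compile(r"[^a-z- ']", re.IGNORECASE)
test_str1 = "9"
test_str2 = "["
test_str3 = "_"
# Each search returns a match object, because none of these characters is allowed.
print(p.search(test_str1))
print(p.search(test_str2))
print(p.search(test_str3))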
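For completeness, a small sketch of how such a pattern might be wrapped in a helper that reports whether a string contains any disallowed character. The function name `has_invalid_chars` and the sample inputs are hypothetical.

```python
import re

# Same idea as above: letters (case-insensitive), hyphen, space, apostrophe are allowed.
DISALLOWED = re.compile(r"[^a-z- ']", re.IGNORECASE)


def has_invalid_chars(text):
    """Return True if text contains any character outside the allowed set."""
    return DISALLOWED.search(text) is not None


print(has_invalid_chars("Jean-Luc O'Brien"))  # False
print(has_invalid_chars("R2-D2"))             # True
```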
Mirroring Maroun Maroun's comment, \w matches _; it also matches 0-9. So saying "not a-z or A-Z or 0-9 or _" with [^\w...] and then saying "0-9 or _" with |\d|_ is confusing and needlessly complicated.
The same goes for \s, as it matches more than a space (it also matches a carriage return, new line, tab, or form feed), which does not jibe with wanting to match "anything that is not... a space ...". Given your description, use a literal space character rather than the \s character class.
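To make the difference concrete, here is a minimal sketch comparing \s with a literal space inside the negated class. The two patterns and the sample string are assumptions chosen for illustration, not taken from the question.

```python
import re

with_class = re.compile(r"[^a-zA-Z\s'-]")  # \s: any whitespace is allowed
with_space = re.compile(r"[^a-zA-Z '-]")   # only a literal space is allowed

sample = "two\twords\nhere now"  # hypothetical input with a tab and a newline
class_hits = with_class.findall(sample)
space_hits = with_space.findall(sample)

print(class_hits)  # []
print(space_hits)  # ['\t', '\n']
```

In Python 3, \s on str patterns also covers Unicode whitespace such as the non-breaking space \xa0, so the \s version lets that through while the literal-space version flags it.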