RegEx to get everything except letters, spaces, ', and -

Is there a simpler regular expression to match anything that is not a letter, hyphen, space, or apostrophe?

This is the regex I was using:

[^\w\s'-]|\d|_|\xa0

It works; I was just curious whether there is a simpler expression.

[^a-zA-Z-' ]

This matches every character except the letters a-z and A-Z, the hyphen, the space, and the apostrophe.
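As a rough illustration of how that class behaves, here is a minimal sketch in Python (an assumption; the pattern itself is not tied to a language). The sample string and the variable names are made up for the example.

```python
import re

# Pattern from the answer above: anything that is not a letter,
# a hyphen, an apostrophe, or a space.
disallowed = re.compile(r"[^a-zA-Z-' ]")

# Hypothetical input, chosen to contain digits, punctuation, and an underscore.
sample = "O'Neil-Smith 42 wrote: hello_world!"

# Collect every character the pattern flags.
flagged = disallowed.findall(sample)

# Remove the flagged characters from the input.
cleaned = disallowed.sub("", sample)

print(flagged)
print(cleaned)
```

Note that removing characters outright can join neighbouring words (the underscore in the sample), so replacing with a space may be preferable depending on the use case.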

\w already includes \d and _. So the simplest regex will be:

[^\w\s\-']
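Because \w covers digits and the underscore, a class that negates \w will not match them either. The sketch below, assuming Python's re module, compares this pattern with the one from the question on a few single characters; the variable names are illustrative only.

```python
import re

original = re.compile(r"[^\w\s'-]|\d|_|\xa0")  # pattern from the question
negated_w = re.compile(r"[^\w\s\-']")          # pattern from this answer

# For each test character, record (matched by original, matched by negated_w).
results = {
    ch: (bool(original.search(ch)), bool(negated_w.search(ch)))
    for ch in ["9", "_", "[", "a"]
}

for ch, (in_original, in_negated_w) in results.items():
    print(repr(ch), in_original, in_negated_w)
```

If digits and underscores should still be matched, as in the question's pattern, a class that lists the letters explicitly (such as [^a-zA-Z-' ]) does that, whereas the \w-based class leaves them unmatched.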

The following pattern...

[^a-z- ']

...is simpler and should do what you want with case-insensitivity set:

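import re

# Case-insensitive: match anything that is not a letter, hyphen, space, or apostrophe.
# (The original ur'' prefix is Python 2 only; a plain raw string works in both versions.)
p = re.compile(r"[^a-z- ']", re.IGNORECASE)
test_str1 = "9"
test_str2 = "["
test_str3 = "_"

# Each search returns a match object, because none of these characters is in the allowed set.
print(p.search(test_str1))
print(p.search(test_str2))
print(p.search(test_str3))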

Mirroring Maroun Maroun's comment, \w matches _; it also matches 0-9. So saying "not a-z or A-Z or 0-9 or _" with [^\w ... ] and then saying "0-9 or _" with |\d|_ is a bit confusing and needlessly complicated.

The same goes for \s, as it matches more than a space (it also matches a carriage return, new line, tab, or form feed), which does not jibe with wanting to match "anything that is not... a space...". Given your description, then, use a literal space rather than the \s character class.
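To make the difference concrete, here is a small sketch, assuming Python 3, where \s in a str pattern also covers Unicode whitespace such as the no-break space \xa0 from the question. The two patterns and the sample names are illustrative, not taken from the original post.

```python
import re

# Hypothetical set of whitespace characters to test, keyed by a readable name.
samples = {
    "space": " ",
    "tab": "\t",
    "newline": "\n",
    "carriage return": "\r",
    "form feed": "\f",
    "no-break space": "\xa0",
}

with_s = re.compile(r"[^a-z\s'-]", re.IGNORECASE)      # allows every whitespace character
with_space = re.compile(r"[^a-z '-]", re.IGNORECASE)   # allows only the literal space

# Names of the characters each pattern treats as "not allowed".
flagged_with_s = [name for name, ch in samples.items() if with_s.search(ch)]
flagged_with_space = [name for name, ch in samples.items() if with_space.search(ch)]

print(flagged_with_s)
print(flagged_with_space)
```

With the literal space, tabs, line breaks, and the no-break space are all reported as disallowed, which matches the "anything that is not a space" wording; with \s they all pass silently.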

