Regular expression: character class for special characters

Question

I need to write a regular expression in Python that captures text which may include any special character (like !@#$%^). Is there a character class similar to [\w] or [\d] that will capture any special character?

I could write out all the special characters in my regex, but it would end up unreadable. Any help appreciated.

Answer 1

Special letter characters

Python 3

If you're using Python 3, you might not have to do anything. \w already includes many "special characters":

>>> import re
>>> re.findall(r'\w', 'üäößéÅßêèiìí')
['ü', 'ä', 'ö', 'ß', 'é', 'Å', 'ß', 'ê', 'è', 'i', 'ì', 'í']
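For comparison, here is a minimal sketch of how the same pattern behaves in Python 3 when Unicode matching is switched off with the standard re.ASCII flag. The variable names are only illustrative.

```python
import re

text = 'üäößéÅßêèiìí'

# Default for str patterns in Python 3: \w is Unicode-aware.
unicode_matches = re.findall(r'\w', text)

# With re.ASCII, \w is restricted to [a-zA-Z0-9_].
ascii_matches = re.findall(r'\w', text, flags=re.ASCII)

print(unicode_matches)
print(ascii_matches)
```

With re.ASCII, \w behaves like the Python 2.7 default, so the accented letters are no longer matched.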

Python 2.7

In Python 2.7, only i is matched by \w by default:

>>> import re
>>> re.findall(r'\w', 'üäößéÅßêèiìí')
['i']

You could use re.UNICODE:

# encoding: utf-8
import re
any_char = re.compile(r'\w', re.UNICODE)
re.findall(any_char, u'üäößéÅßêèiìí')
# [u'\xfc', u'\xe4', u'\xf6', u'\xdf', u'\xe9', u'\xc5', u'\xdf', u'\xea', u'\xe8', u'i', u'\xec', u'\xed']
for x in re.findall(any_char, u'üäößéÅßêèiìí'):
    print x
#   ü
#   ä
#   ö
#   ß
#   é
#   Å
#   ß
#   ê
#   è
#   i
#   ì
#   í

Any special character

Specifying Unicode ranges might simplify your regex. As an example, this regex matches any Unicode arrow:

>>> import re
>>> arrows = re.compile(r'[\u2190-\u21FF]')
>>> re.findall(arrows, "a⇸b⇙c↺d↣e↝f")
['⇸', '⇙', '↺', '↣', '↝']

For Python 2, you'd need to specify a unicode string and regex:

>>> import re
>>> arrows = re.compile(ur'[\u2190-\u21FF]')
>>> re.findall(arrows, u"a⇸b⇙c↺d↣e↝f")
[u'\u21f8', u'\u21d9', u'\u21ba', u'\u21a3', u'\u219d']
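The same idea of defining the set once and reusing it can cover the ASCII characters from the question. The following Python 3 sketch assumes that "special character" means ASCII punctuation as listed in string.punctuation. That is an assumption, not something the question states. The names special and sample are illustrative.

```python
import re
import string

# Assumption: "special" means the ASCII punctuation in string.punctuation,
# i.e. !"#$%&'()*+,-./:;<=>?@[\]^_`{|}~
# re.escape makes every character safe to place inside [...].
special = re.compile('[' + re.escape(string.punctuation) + ']')

sample = 'pa$$w0rd! 100% ^_^'
found = special.findall(sample)
print(found)
```

Note that string.punctuation includes the underscore, which \w also matches.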

Answer 2

You could try the negated versions (\W, \D), which match any non-word or non-digit character.
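A short Python 3 sketch of the negated classes. Because \W also matches whitespace, the second pattern uses the negated set [^\w\s] to keep only non-word, non-space characters. The sample string is made up for illustration.

```python
import re

sample = 'price: $42.50 (50% off!)'

# \W matches anything that is not a word character, including whitespace.
non_word = re.findall(r'\W', sample)

# [^\w\s] excludes both word characters and whitespace.
special_only = re.findall(r'[^\w\s]', sample)

print(non_word)
print(special_only)
```

Since the underscore counts as a word character, neither pattern matches it.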

Answer 1: 2017-03-22 13:13:56
Answer 2: 2017-03-22 13:14:03
