Python Matching Complex String with Regex
I have the following test string:
test_str = "`It isn't directed at all,' said the White Rabbit;"
My current regular expression uses re.sub to filter out the punctuation so that I can do my own operations.
My current regex is:
re.sub(r"[^A-Za-z0-9'\s]", '', test_str)
The output from above is:
['It', "isn't", 'directed', 'at', "all'", 'said', 'the', 'White', 'Rabbit']
The error can be seen at all', which is supposed to be stored as all only.
How do you store words with 's and also ignore a ' that comes after punctuation? In this case, all,'.
Try the following:
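import re
test_str = "`It isn't directed at all,' said the White Rabbit;"
# Keep only letters, digits, apostrophes and whitespace.
a = re.sub(r"[^A-Za-z0-9'\s]", '', test_str)
# Replace an apostrophe followed by a space (a closing quote) with a space.
a = re.sub(r"'[ ]", ' ', a)
print(a)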
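To get the word list shown in the question, the cleaned string can then be split on whitespace. This is a minimal sketch that assumes the two substitutions above; the helper name clean_words is made up for illustration and is not part of the original answer.

```python
import re

def clean_words(text):
    # Hypothetical helper, not from the original answer.
    # Step 1: drop everything except letters, digits, apostrophes and whitespace.
    stripped = re.sub(r"[^A-Za-z0-9'\s]", "", text)
    # Step 2: replace an apostrophe followed by a space (a closing quote) with a space.
    stripped = re.sub(r"'[ ]", " ", stripped)
    # Split on runs of whitespace to get the individual words.
    return stripped.split()

test_str = "`It isn't directed at all,' said the White Rabbit;"
print(clean_words(test_str))
# ['It', "isn't", 'directed', 'at', 'all', 'said', 'the', 'White', 'Rabbit']
```

Note that step 2 only matches an apostrophe followed by a space, so a closing quote at the very end of the input would be left in place.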
Try using this regular expression:
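# Punctuation other than ' is always removed; ' is removed only when no word character follows it.
print(re.sub(r"""[!"#$%&()*+,\-./:;<=>?@\[\\\]^_`{|}~]|'(?!\w)""", '', test_str))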
Output:
It isn't directed at all said the White Rabbit
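The same word list can also be obtained by matching words instead of deleting punctuation. This is a different technique from the substitutions in the answers, sketched here under the assumption that a word is a run of ASCII letters or digits, optionally joined by internal apostrophes; the name WORD_RE is made up for illustration.

```python
import re

# Assumed definition of a word: alphanumeric runs, optionally joined by
# single apostrophes (as in "isn't"). Leading or trailing quotes never match.
WORD_RE = re.compile(r"[A-Za-z0-9]+(?:'[A-Za-z0-9]+)*")

test_str = "`It isn't directed at all,' said the White Rabbit;"
words = WORD_RE.findall(test_str)
print(words)
# ['It', "isn't", 'directed', 'at', 'all', 'said', 'the', 'White', 'Rabbit']
```

Because the apostrophe is only accepted between two alphanumeric runs, the quote after all, is skipped without a separate clean-up pass.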
Here are other solutions:
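re.sub(r"'[^\w]", ' ', test_str)   # ' plus one non-word character -> space
re.sub(r"'[\s]", ' ', test_str)    # ' plus one whitespace character -> space
re.sub(r"'(?!\w)", '', test_str)   # ' not followed by a word character
re.sub(r"'(?=\s)", '', test_str)   # ' followed by whitespace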
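These four patterns are not interchangeable. The sketch below runs each one on a sample string that ends with a closing quote; the sample string and the names variants and results are chosen for illustration only.

```python
import re

# Illustrative sample: an internal apostrophe, a closing quote before a
# space, and a closing quote at the very end of the string.
sample = "isn't all' said the Rabbit'"

variants = [
    (r"'[^\w]", " "),   # apostrophe + one non-word character, replaced by a space
    (r"'[\s]", " "),    # apostrophe + one whitespace character, replaced by a space
    (r"'(?!\w)", ""),   # apostrophe not followed by a word character
    (r"'(?=\s)", ""),   # apostrophe followed by whitespace
]

results = [re.sub(pattern, repl, sample) for pattern, repl in variants]
for (pattern, _), result in zip(variants, results):
    print(f"{pattern:10} -> {result!r}")
```

On this sample, only the negative lookahead `'(?!\w)` removes the final apostrophe. The other three patterns require a character after the apostrophe, so they leave a quote at the end of the string untouched.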