Regex matching non-alphanumeric characters
I'm using Python to parse some strings in a list. Some of the strings may contain only non-alphanumeric characters, which I'd like to ignore, like this:
import re
list = ['()', 'desk', 'apple', ':desk', '(house', ')', '(:', ')(', '(', ':(', '))']
for item in list:
    if re.search(r'\W+', item):
        list.remove(item)
# Ideal output
list = ['desk', 'apple', ':desk', '(house']
# Actual output
list = ['desk', 'apple', '(:', '(', '))']
That's my first attempt at the regex for this problem, but it's not really having the desired effect. How would I write a regex to ignore any strings that consist only of non-alphanumeric characters?
BTW, your regex does match non-alphanumeric characters. However, it isn't advisable to remove items from a list you're currently iterating over, and that is the cause of this error. To overcome it, create a new list and append to it the elements which don't match. Anchoring the pattern as ^\W+$ also restricts the match to strings made up entirely of non-alphanumeric characters, so strings like ':desk' are kept.
Demo:
import re
list = ['()', 'desk', 'apple', ':desk', '(house', ')', '(:', ')(', '(', ':(', '))']
new_list = []
for item in list:
    # Keep the item unless it consists entirely of non-word characters.
    if not re.search(r'^\W+$', item):
        new_list.append(item)
print(new_list)
Produces:
['desk', 'apple', ':desk', '(house']
As far as I have tested, this works in nearly all scenarios.
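A few edge cases are worth checking against the same idea. The following is only a sketch, assuming Python 3.4+ (for `fullmatch`); the names `PUNCT_ONLY` and `drop_punctuation_only` are made up for illustration. It shows that an underscore counts as a word character, that whitespace-only strings count as non-word, and that an empty string is kept because `\W+` needs at least one character.

```python
import re

# Compiled once; fullmatch succeeds only if the whole string is non-word characters.
PUNCT_ONLY = re.compile(r'\W+')

def drop_punctuation_only(items):
    """Return a new list without the strings made up solely of non-word characters."""
    return [item for item in items if not PUNCT_ONLY.fullmatch(item)]

samples = ['()', 'desk', ':desk', '_', '', 'a b', '  ']
print(drop_punctuation_only(samples))
# → ['desk', ':desk', '_', '', 'a b']
```

If empty strings should be dropped as well, changing the pattern to `\W*` would probably be the smallest adjustment.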
What about a list comprehension with re.match(pattern, string):
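import re
items = ['()', 'desk', 'apple', ':desk', '(house', ')', '(:', ')(', '(', ':(', '))']
# Keep strings that start with at most one non-word character followed by word characters.
cleaned_items = [item for item in items if re.match(r'\W?\w+', item)]
print(cleaned_items)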
This prints:
['desk', 'apple', ':desk', '(house']
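One thing to keep in mind with this pattern: `re.match` only looks at the start of the string, so `\W?\w+` tolerates a single leading non-word character. The sketch below compares it with a more lenient `re.search(r'\w', ...)` check; the extra sample `'((house'` and the names `strict` and `lenient` are added here purely for illustration.

```python
import re

items = ['()', 'desk', 'apple', ':desk', '(house', ')', '((house', '(:']

# re.match anchors at the start, so at most one leading non-word character is tolerated.
strict = [item for item in items if re.match(r'\W?\w+', item)]

# re.search scans the whole string, so one word character anywhere is enough.
lenient = [item for item in items if re.search(r'\w', item)]

print(strict)   # → ['desk', 'apple', ':desk', '(house']
print(lenient)  # → ['desk', 'apple', ':desk', '(house', '((house']
```

Which of the two is preferable depends on whether strings with several leading punctuation characters should survive the filter.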
The problem is not with your regex. You are iterating over a list which you are then modifying, which causes weirdness (see Modifying list while iterating). Each removal shifts the later items down one position, so the loop skips the element that follows it. You can use a list comprehension like Jon posted, or you can iterate over a copy of the list:
for item in list[:]:
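Spelled out in full, the copy-based loop might look like the sketch below. Two things here are assumptions rather than part of the original snippet: the list is renamed to `words` so it does not shadow the built-in `list`, and the pattern is anchored as `^\W+$` (as in the earlier demo), since the unanchored `\W+` would also remove ':desk' and '(house'.

```python
import re

words = ['()', 'desk', 'apple', ':desk', '(house', ')', '(:', ')(', '(', ':(', '))']

# words[:] is a shallow copy, so removing from words does not disturb the loop.
for item in words[:]:
    # Anchored pattern: remove only strings made up entirely of non-word characters.
    if re.search(r'^\W+$', item):
        words.remove(item)

print(words)
# → ['desk', 'apple', ':desk', '(house']
```

This mutates the list in place, which can matter if other code holds a reference to the same list; the list comprehension builds a new list instead.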