Python For-Loop with Regex
The Issue
Your test to see whether re.findall has returned the value you want is bugged. Your check:
if test != None:
will always be true, so you'll always append whatever value word holds to wkar. From the re docs (assuming Python 3, but the behavior doesn't change):
re.findall(pattern, string, flags=0)
Return all non-overlapping matches of pattern in string, as a list of strings... Empty matches are included in the result.
(emphasis mine)
When nothing matches, re.findall returns an empty list, and an empty list is not None. That's why wkar holds all the values in your sentence. (Interestingly, this is the exact opposite of the behavior you mentioned at the beginning of your question.)
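A minimal sketch of this behavior. The loop from the question is not reproduced here, so the sample sentence and the per-word loop below are assumptions for illustration only:

```python
import re

# Hypothetical sample input; the question's actual sentence may differ.
sentence = "some Text and more text"

for word in sentence.split(" "):
    test = re.findall(r"text", word)
    # findall returns a list, never None, so `test != None` is always True.
    assert test is not None
    print(repr(word), test, test != None, bool(test))

# Checking the truthiness of the list is what separates a match from no match.
wkar = [word for word in sentence.split(" ") if re.findall(r"text", word)]
print(wkar)  # ['text']
```

An empty list is falsy, so `if test:` would be one way to write the check the question was aiming for.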
The Solution
Don't use a regex; it's the wrong tool for this job. This can be solved using built-in functions. Additionally, you're taking a performance hit for something that can be done with a plain if statement:
# use the built-in split method to split sentence on spaces
sentence = sentence.split(" ")
wkar = []
# iterate over each word...
for word in sentence:
    # ...and see if it matches the test word
    if word == 'text':
        wkar.append(word)
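Note that splitting on spaces leaves punctuation attached to words (for example 'Text,'), and == is case-sensitive. If those variants should count as well, one possible sketch is to strip punctuation and lowercase each word before comparing. The sample sentence and the name matches are assumptions for illustration:

```python
import string

# Hypothetical sample input with punctuation attached to the target word.
sentence = "This is some Text, then some more text, the end."

# Strip leading/trailing punctuation and lowercase each word before comparing.
matches = [
    word
    for word in sentence.split()
    if word.strip(string.punctuation).lower() == "text"
]
print(matches)  # ['Text,', 'text,']
```

The original words are kept in the result, punctuation included; only the comparison uses the normalized form.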
The re.findall() function already loops over the entire sentence, so you don't have to do that part yourself. All you have to do to get the output you want is the following:
import re
sentence = 'This is some Text, then some more text with some Numbers 1357, and even more text 357, the end.'
wkar = re.findall(r'text', sentence)
That will result in:
['text', 'text']
If you want re.findall() to be case-insensitive, use:
wkar = re.findall(r'text', sentence, flags=re.IGNORECASE)
That will give:
['Text', 'text', 'text']
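One caveat with a bare pattern like r'text' is that it also matches inside longer words. If only whole words should count, a word-boundary anchor (\b) is one option. The sentence below is a hypothetical input chosen to show the difference:

```python
import re

# Hypothetical input: "context" contains "text" as a substring.
sentence = "The context of this text is plain Text."

# The bare pattern also matches the "text" inside "context".
substring_hits = re.findall(r"text", sentence)
print(substring_hits)  # ['text', 'text']

# \b restricts matches to whole words; IGNORECASE also picks up "Text".
whole_word_hits = re.findall(r"\btext\b", sentence, flags=re.IGNORECASE)
print(whole_word_hits)  # ['text', 'Text']
```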
Also, if you want to test regular expressions in the future, I suggest using the great https://regex101.com/ website (make sure to choose the Python flavor so the pattern is interpreted with Python's regex syntax).