
Bug resulting from reading a file

I am facing a rather elusive bug that seems to be caused by reading from a file. I have simplified my program to demonstrate the issue.

Consider this program, which works fine:

import re

sourceString="Yesterday I had a pizza for lunch it was tasty\n";
sourceString+="today I am thinking about grabbing a burger and tomorrow it\n"; 
sourceString+="will probably be some fish if I am lucky\n\n\n";
sourceString+="see you later!"

jj=["pizza","steak","fish"]

for keyword in jj:
    regexPattern= keyword+".*";
    patternObject=re.compile(regexPattern,re.MULTILINE);
    match=patternObject.search(sourceString);
    if match:
        print("Match found for "+keyword)
        print(match.group()+"\n")
    else:
        print("warning: no match found for :"+ keyword+"\n")

I am using a very straightforward regex pattern, but the keyword part of each pattern comes from my list jj.

The script works as expected: it finds matches for "pizza" and "fish" but not for "steak".

Now, in my actual program, I am trying to read these keywords from a file (I don't want to hardcode them in the source).

So far I have this:

import re

sourceString="Yesterday I had a pizza for lunch it was tasty\n";
sourceString+="today I am thinking about grabbing a burger and tomorrow it\n"; 
sourceString+="will probably be some fish if I am lucky\n\n\n";
sourceString+="see you later!"

with open('keyWords.txt','r') as f: 
    for keyword in f:
        regexPattern= keyword+".*";
        patternObject=re.compile(regexPattern,re.MULTILINE);
        match=patternObject.search(sourceString);
        if match:
            print("Match found for "+keyword)
            print(match.group())
        else:
            print("warning: no match found for :"+ keyword)

where keyWords.txt contains the following:

pizza
steak
fish

But this breaks the code: somehow only the LAST keyword in the file matches successfully (if a match exists).

What gives?

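with open('keyWords.txt','r') as f:
    for keyword in f:
        # strip() drops the trailing newline that each line read from the file carries
        regexPattern = keyword.strip() + ".*"
        # ... the rest of the loop is unchanged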

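Use strip() to remove any newline characters from keyword. Iterating over a file yields each line with its trailing "\n" still attached, so the pattern becomes "pizza\n.*", which only matches where a newline immediately follows the keyword. The last line of the file usually has no trailing newline, which is why only the last keyword matched. If you know for certain that there won't be any leading whitespace, rstrip() is sufficient.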
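To make the cause visible, here is a minimal sketch that reproduces the problem and the fix without touching the disk. It rests on a few assumptions that are not in the original post: io.StringIO stands in for keyWords.txt, the helper name find_keywords is made up for illustration, and re.escape and the blank-line check are extra precautions rather than part of the answer above.

```python
import io
import re

source_string = (
    "Yesterday I had a pizza for lunch it was tasty\n"
    "today I am thinking about grabbing a burger and tomorrow it\n"
    "will probably be some fish if I am lucky\n\n\n"
    "see you later!"
)

# Stand-in for open('keyWords.txt') (assumption); the last line has no trailing newline.
keyword_file = io.StringIO("pizza\nsteak\nfish")


def find_keywords(lines, text):
    """Hypothetical helper: map each stripped keyword to its match text, or None."""
    results = {}
    for line in lines:
        keyword = line.strip()  # remove the trailing "\n" and any surrounding whitespace
        if not keyword:
            continue  # skip blank lines (extra precaution, not in the original answer)
        # re.escape is an addition here: it treats the keyword as literal text
        match = re.search(re.escape(keyword) + ".*", text)
        results[keyword] = match.group() if match else None
    return results


raw_lines = list(keyword_file)
print(raw_lines)  # the newline is still attached to every line except the last

# Unstripped, the pattern demands a newline right after "pizza", so nothing matches
print(re.search(raw_lines[0] + ".*", source_string))

results = find_keywords(raw_lines, source_string)
for keyword, matched in results.items():
    print(keyword, "->", matched)
```

Printing repr(keyword) inside the original loop is a quick way to spot this kind of stray whitespace.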
