
Python: Script won't find word in text file

I am trying to find specific words in a text file, but my script doesn't seem able to match a word against a line of the file, even though I know the line contains it. I've noticed there is some whitespace, but since I am testing entry in line, shouldn't that still work?

I have also tried:

  if str(entry) in line:, 
  if str(entry) in str(line): and 
  if entry in str(line): 

but none of them seem to work either.

I can't see where I'm going wrong. Any help would be appreciated.

Here is my code:

with open(address+'file_containing_data_I_want.txt') as f:
    for entry in System_data:
        print "Entry:"
        print entry 
        for line in f:
            print "Start of line"
            print line
            print "End of line"
            if entry in line:
                print "Found entry in line" #This never gets printed

Using the print statements (for just the first entry) I see:

Entry:
Manufacturer


Start of line
??

End of line
Start of line


End of line
Start of line
Manufacturer=manufacturer_data

End of line
Start of line
Model=model_data

End of line
Start of line


End of line
Start of line


End of line

The text file looks like this (Note: I can't change the text file, as this is the form in which I will receive it; ' indicates a blank line):

'
'
Manufacturer=manufacturer_data
Model=model_data
'
'
'

UPDATE: Changing my script to:

with open(address+'file_containing_data_I_want.txt') as f:
    for line in f:
        print "Start of line %s" % line
        print "End of line"
        for entry in System_data:
            print "Entry: %s" % entry
            if entry in line.strip():
                print "Found entry in line"

Results in this being printed (still no "Found entry in line"):

Entry: Manufacturer
Entry: Model
Start of line: 
End of line
Entry: Manufacturer
Entry: Model
Start of line: Manufacturer=manufacturer_data
End of line
Entry: Manufacturer
Entry: Model
Start of line: Model=model_data
Entry: Manufacturer
Entry: Model
Start of line: 
End of line
Entry: Manufacturer
Entry: Model
Start of line: 
End of line

Changing my code to this:

for line in f:
    print "Start of line: %s" % line.strip("\r\n")
    print "End of line" 
    for entry in System_data:
        print "Entry: %s" % entry.strip()
        if entry.strip() in line.strip("\r\n"):
            print "FOUND!!!!!!!!!!!!!"

Gives me this:

Start of line: ??
End of line
Entry: Manufacturer
Entry: Model
Start of line: 
End of line
Entry: Manufacturer
Entry: Model
Start of line: Manufacturer=manufacturer_data
End of line
Entry: Manufacturer
Entry: Model
Start of line: Model=model_data
End of line

You read to the end of the file during the first pass of the outer loop, so for every later entry the inner loop over f has nothing left to iterate. Swap the loops instead, so that each entry in System_data gets checked against each line of the file:

for line in f:
    print "Start of line %s" % line
    print "End of line" 
    for entry in System_data:
        print "Entry: %s" % entry
        if entry.strip() in line.strip("\r\n"):
            print "Found entry in line" #This now gets printed

Alternatively, you can correct this behavior in your current code by calling f.seek(0) before for line in f.
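The exhaustion problem and the f.seek(0) fix can be reproduced without the real file. The following is a minimal sketch written for Python 3; io.StringIO stands in for the opened file, and system_data is a hypothetical stand-in for the question's System_data.

```python
import io

# Stand-in for the real file; the contents mirror the question's sample.
f = io.StringIO(u"\n\nManufacturer=manufacturer_data\nModel=model_data\n\n")
system_data = ["Manufacturer", "Model"]  # hypothetical stand-in for System_data

# Entries in the outer loop: the inner loop consumes f while handling the
# first entry, so "Model" is never compared against any line.
found_nested = []
for entry in system_data:
    for line in f:
        if entry in line:
            found_nested.append(entry)

# Rewinding before each pass lets every entry see every line.
found_with_seek = []
for entry in system_data:
    f.seek(0)
    for line in f:
        if entry in line:
            found_with_seek.append(entry)

print(found_nested)     # ['Manufacturer']
print(found_with_seek)  # ['Manufacturer', 'Model']
```

Swapping the loops, as shown in the answer, reads the file only once, which is usually preferable to rewinding for every entry.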

You should strip all whitespace and newlines from both the entry and the lines of the file. So, before the comparison, add

entry = entry.strip()

and change

if entry in line:

to

if entry in line.strip():

EDIT: also, what Moses Koledoye says.
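As a small illustration of why stripping both sides matters, here is a sketch in Python 3. The entries and lines are made-up samples (not the asker's data), deliberately padded with spaces and line endings.

```python
entries = [" Manufacturer ", "Model\n"]   # hypothetical, deliberately untidy
lines = ["\r\n", "  Manufacturer=manufacturer_data\r\n", "Model=model_data\n"]

# Without stripping, the stray whitespace in each entry has to appear in the
# line as well, so nothing matches.
raw_hits = sum(1 for line in lines for entry in entries if entry in line)

# Stripping both sides compares only the meaningful text.
found = []
for line in lines:
    clean_line = line.strip()
    for entry in entries:
        clean_entry = entry.strip()
        if clean_entry in clean_line:
            found.append((clean_entry, clean_line))

print(raw_hits)  # 0
print(found)
```

Note that strip() with no arguments removes only leading and trailing whitespace; it does not touch characters in the middle of the string.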

OK, so it seems the issue was that the text was not stored as plain single-byte characters: every character in the line was paired with a \x00 byte. This only became visible when I used print repr(line), which showed something like: '\x00m\x00a\x00n\x00u\x00f\x00a\x00c\x00t\x00u\x00r\x00e\x00r\x00_\x00d\x00a\x00t\x00a\x00'

So I changed my code to the following:

import re

with open(address+'file_containing_data_I_want.txt') as f:
    for line in f:
        # Keep only word characters and '='; this drops the \x00 bytes
        # and any surrounding whitespace.
        line = re.sub(r'[^\w=]', '', line.strip())
        for entry in System_data:
            if entry in line:
                print "Found entry in line"

With this change, the script reaches the body of if entry in line: and prints "Found entry in line".
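A \x00 byte next to every character, together with the unprintable "??" at the start of the file, is the typical signature of a UTF-16 encoded file with a byte order mark. The thread does not confirm the file's encoding, so the following is only a sketch under that assumption: instead of deleting the extra bytes with a regular expression, the file is decoded while reading. It is written for Python 3 (io.open accepts the same arguments on Python 2.6+), builds its own sample file so it can run on its own, and uses a hypothetical system_data list.

```python
import io
import os
import tempfile

# Build a small UTF-16 sample file so the sketch is self-contained.
sample = u"\r\nManufacturer=manufacturer_data\r\nModel=model_data\r\n"
path = os.path.join(tempfile.mkdtemp(), "sample_utf16.txt")
with io.open(path, "w", encoding="utf-16", newline="") as out:
    out.write(sample)

system_data = [u"Manufacturer", u"Model"]  # hypothetical stand-in

# Read as raw bytes, the word is interleaved with \x00 and does not match.
with io.open(path, "rb") as raw:
    raw_bytes = raw.read()
print(b"Manufacturer" in raw_bytes)  # False

# Decoded as UTF-16 (assumed encoding), the lines are ordinary text.
found = []
with io.open(path, encoding="utf-16") as f:
    for line in f:
        line = line.strip()
        for entry in system_data:
            if entry in line:
                found.append(line)
print(found)  # ['Manufacturer=manufacturer_data', 'Model=model_data']
```

If the real file turns out to use a different encoding, the encoding argument would need to change accordingly; the regular-expression approach above sidesteps that question at the cost of also discarding any other non-word characters.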
