I want to check whether any word from a dictionary file appears in a second text file. I have a problem with the following code:
print 'Searching for known strings...\n'
with open('something.txt') as f:
    haystack = f.read()
with open('d:\\Users\\something\\Desktop\\something\\dictionary\\entirelist.txt') as f:
    for needle in (line.strip() for line in f):
        if needle in haystack:
            print line
The with open statements are not mine; I took them from "Search for strings listed in one file from another text file?". I want to print the line, so I wrote line instead of needle. The problem is that Python says line is not defined.
My final objective is to check whether any word from the dictionary appears in "something.txt" and, if so, print the line where the word was found.
It looks like you've used a generator expression: (line.strip() for line in f). Its inner variable line can't be accessed from outside the generator's scope, i.e., outside the parentheses.
Try something like:
for line in f:
    if line.strip() in haystack:
        print line
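To see the scoping rule in isolation, here is a minimal sketch. It is written for Python 3 (so print is a function), and the sample words and the helper name loop_variable_visible are made up for illustration. The loop variable of a generator expression lives only inside that expression, so reading it afterwards raises NameError.

```python
# Hypothetical sample data standing in for the lines of the dictionary file.
lines = ["falcon\n", "eagle\n"]

def loop_variable_visible():
    """Return True if `line` can still be read after the generator expression."""
    for needle in (line.strip() for line in lines):
        pass
    try:
        line  # only existed inside the generator expression above
    except NameError:
        return False
    return True

print(loop_variable_visible())  # prints False
```

A plain for loop, as in the snippet above, binds line in the enclosing scope, which is why it stays usable in the print.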
The specific exception you asked about is because line doesn't exist outside the generator expression. If you want to access it, you need to keep it in the same scope as the print statement, like this:
for line in f:
    needle = line.strip()
    if needle in haystack:
        print line
But this isn't going to be particularly useful: line is just the word in needle plus the newline at the end. If you want to print the line (or lines?) from haystack that include needle, you have to search for that line, not just ask whether needle appears anywhere in the whole haystack.
To literally do what you're asking for, you're going to need to loop over the lines of haystack and check each one for needle. Like this:
with open('something.txt') as f:
    haystacks = list(f)
with open('d:\\Users\\something\\Desktop\\something\\dictionary\\entirelist.txt') as f:
    for line in f:
        needle = line.strip()
        for haystack in haystacks:
            if needle in haystack:
                print haystack
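A variation on the nested loop above, as a sketch only: it swaps the loop order so each haystack line is reported once even if several needles occur in it, and it skips blank dictionary lines, because an empty string is "in" every string. It is written for Python 3, and the function name lines_containing_any and the sample data are assumptions made for illustration, not part of the original code.

```python
def lines_containing_any(needles, haystack_lines):
    """Return (line_number, line) pairs for lines containing at least one needle."""
    hits = []
    for number, haystack in enumerate(haystack_lines, start=1):
        text = haystack.rstrip('\n')  # drop the newline so printing is not double-spaced
        if any(needle in text for needle in needles):
            hits.append((number, text))
    return hits

# Made-up stand-ins for the dictionary file and something.txt.
raw_words = ["falco\n", "aquila\n", "\n"]
needles = [w.strip() for w in raw_words if w.strip()]  # ignore blank lines
haystack_lines = ["falco and aquila\n", "nothing here\n", "an aquila\n"]

for number, text in lines_containing_any(needles, haystack_lines):
    print(number, text)
```

With the sample data this reports lines 1 and 3. Line 1 appears once although it contains both words.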
However, there's a neat trick you may want to consider: If you can write a regular expression that matches any complete line that includes needle, then you just need to print out all the matches. Like this:
import re

with open('something.txt') as f:
    haystack = f.read()
with open('d:\\Users\\something\\Desktop\\something\\dictionary\\entirelist.txt') as f:
    for line in f:
        needle = line.strip()
        # re.escape keeps any special characters in needle literal
        pattern = '^.*{}.*$'.format(re.escape(needle))
        for match in re.finditer(pattern, haystack, re.MULTILINE):
            print match.group(0)
Here's an example of how the regular expression works:
^.*Falco.*$
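To make that pattern concrete, here is a small self-contained sketch in Python 3. The haystack text and the helper name matching_lines are made up for illustration. With re.MULTILINE, ^ and $ anchor to each line, and since . never matches a newline, every match is exactly one complete line.

```python
import re

# Made-up text; in the answer above it is read from something.txt.
haystack = "Falco peregrinus\nAquila chrysaetos\nthe genus Falco\n"

def matching_lines(needle, text):
    """Return every complete line of `text` that contains `needle`."""
    pattern = '^.*{}.*$'.format(re.escape(needle))
    # MULTILINE lets ^ and $ match at the start and end of every line.
    return [m.group(0) for m in re.finditer(pattern, text, re.MULTILINE)]

print(matching_lines("Falco", haystack))
# prints ['Falco peregrinus', 'the genus Falco']
```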
Of course, if you want to search case-insensitively, or only search for complete words, etc., you'll need to make some minor changes; see the Regular Expression HOWTO, or a third-party tutorial, for more.
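As one possible shape for those changes, the sketch below (Python 3; the function name find_word_lines and the sample text are hypothetical) adds re.IGNORECASE for case-insensitive matching and wraps the needle in \b word boundaries so that only complete words match. Note the assumption that each needle starts and ends with a word character; \b does not behave usefully next to punctuation.

```python
import re

def find_word_lines(needle, text):
    """Return lines of `text` containing `needle` as a whole word, ignoring case."""
    # \b matches at a word boundary, so 'falco' will not match inside 'Falconry'.
    pattern = r'^.*\b{}\b.*$'.format(re.escape(needle))
    flags = re.MULTILINE | re.IGNORECASE
    return [m.group(0) for m in re.finditer(pattern, text, flags)]

# Made-up sample text.
sample = "Falco peregrinus\nFalconry is old\nthe genus FALCO\n"
print(find_word_lines("falco", sample))
# prints ['Falco peregrinus', 'the genus FALCO']
```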