Search a text file for each word in a list and print the matching lines

Question

I would like to search a .txt file for a list of words and print every line in the file that contains any word from the list.

First I used .split() to split the raw_input (called userInput) into a word list. I then filtered that list against a blacklist of words to get a final, filtered word list. I want to search the text file for any of the words in that final list.

exWords = ['Who', 'How', 'What', 'How many', 'How much', 'am', 'is', 'are', '?', '!']
while True:
    userInput = raw_input("> ")
    uqWords = userInput.split()
    fqWords = [word for word in uqWords if not any(bad in word for bad in exWords)]

After splitting userInput with .split() into uqWords, I removed any words that match the exWords list and called the result fqWords. Now I want to search Database.txt for any word in the fqWords list and print the matching lines.

To be specific, my full code is:

import time
import random

Error = ["Sorry, I don't understand.", "I don't get it"]
exWords = ['Who', 'How', 'What', 'How many', 'How much', 'am', 'is', 'are', '?', '!']
R = "Rel > "

while True:
    userInput = raw_input("> ")
    uqWords = userInput.split()
    fqWords = [word for word in uqWords if not any(bad in word for bad in exWords)]
    DB = open("Database.txt")
    for line in DB:
        if fqWords in line:
            print (R + line[:-1])
    CDB = open("CodeDB.txt")
    for code in CDB:
        if fqWords in code:
            print (R + code[:-1])
            break
        if fqWords not in (code and line):
            randomError = random.choice(Error)
            print (R + (randomError))

Answer 1

Try using this function:

def search_for_lines(filename, words_list):
    words_found = 0
    with open(filename) as db_file:
        for line_no, line in enumerate(db_file):
            if any(word in line for word in words_list):
                print('{}: {}'.format(line_no, line.rstrip('\n')))
                words_found += 1
    return words_found

Pass it the filename and the list of words you want to search for. It prints the line number and content of every line that contains any of the words, and returns how many such lines it found. enumerate yields (index, line) tuples as the loop iterates over the file, with the index starting at 0.
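As a small illustration of how enumerate and any combine here, the sketch below runs the same test over an in-memory list of lines instead of a file. The sample lines and the name matching_lines are made up for the example, and it is written so that it also runs on Python 3.

```python
def matching_lines(lines, words_list):
    """Return (index, line) pairs for lines containing any of the words."""
    matches = []
    # enumerate yields (0, first_line), (1, second_line), ...
    for line_no, line in enumerate(lines):
        # any() stops at the first word found as a substring of the line
        if any(word in line for word in words_list):
            matches.append((line_no, line))
    return matches


# Invented sample data standing in for the lines of a file.
sample = ["Python is a language", "The sky is blue", "Cats sleep a lot"]
print(matching_lines(sample, ["Python", "Cats"]))
# [(0, 'Python is a language'), (2, 'Cats sleep a lot')]
```

An empty word list matches nothing, because any() over an empty sequence is False.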

To add the function to your existing code and search through both files, define it first and then call it right after fqWords is assigned, like so:

import random

def search_for_lines(filename, words_list):
    words_found = 0
    with open(filename) as db_file:
        for line_no, line in enumerate(db_file):
            if any(word in line for word in words_list):
                print('{}: {}'.format(line_no, line.rstrip('\n')))
                words_found += 1
    return words_found

Error = ["Sorry, I don't understand.", "I don't get it"]
exWords = ['Who', 'How', 'What', 'How many', 'How much', 'am', 'is', 'are', '?', '!']
R = "Rel > "

while True:
    userInput = raw_input("> ")
    uqWords = userInput.split()
    fqWords = [word for word in uqWords if not any(bad in word for bad in exWords)]
    search_for_lines("Database.txt", fqWords)

    words_found = search_for_lines("CodeDB.txt", fqWords)

    if words_found > 0:
        break
    else:
        randomError = random.choice(Error)
        print (R + (randomError))
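
Note that word in line is a substring test, so a search word such as cat also matches a line containing category. If whole-word matches are wanted, one possible variant (an assumption about the desired behaviour, not part of the answer above) is to split each line into words first. The helper name line_has_word is hypothetical, and the match stays case-sensitive.

```python
import re


def line_has_word(line, words_list):
    """Return True if the line contains any of the words as a whole word."""
    # \w+ pulls out runs of letters, digits and underscores, dropping punctuation
    tokens = set(re.findall(r"\w+", line))
    return any(word in tokens for word in words_list)


print(line_has_word("A category of things", ["cat"]))   # False
print(line_has_word("The cat sat down.", ["cat"]))      # True
```

Swapping this test in for word in line inside the search loop would be enough to switch from substring to whole-word matching.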

Answer 2

If you don't need to modify a list, use a tuple. For naming identifiers, see PEP 8.
To get the difference of two collections, use set, e.g. {1,2,3} - {2,3} is {1}.
If you open the same files inside a loop, they are reopened on every iteration, so it is better to open them once outside the loop.
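
To make the set-difference tip concrete, here is a short sketch with a made-up input sentence. One caveat worth checking against your needs: set difference removes only exact matches, whereas the substring test in the question also drops words that merely contain an excluded string. A set also loses the original word order, and a multi-word entry such as 'How many' can never equal a single split word.

```python
ex_words = ('Who', 'How', 'What', 'am', 'is', 'are', '?', '!')

# Invented input, standing in for what the user would type.
user_input = "What is the speed of light"
uq_words = user_input.split()

# Set difference keeps the words that are not in ex_words (exact match only).
fq_words = frozenset(uq_words) - frozenset(ex_words)
print(sorted(fq_words))
# ['light', 'of', 'speed', 'the']
```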

import random

def get_line_with_words(lines, words):
    """Return (line number, line) pairs for the lines
       that contain any of the words.
    """
    return [(i, line.strip()) for i, line in enumerate(lines,1) if any(word in line for word in words)]

errors = ("Sorry, I don't understand.", "I don't get it")
ex_words = ('Who', 'How', 'What', 'How many', 'How much', 'am', 'is', 'are', '?', '!')
prefix = "Rel > "

with open("Database.txt") as db, open("CodeDB.txt") as cdb:
    while True:
        user_input = raw_input("> ")
        uq_words = user_input.split()
        fq_words = frozenset(uq_words) - frozenset(ex_words)

        res1 = get_line_with_words(db, fq_words)
        res2 = get_line_with_words(cdb, fq_words)

        # Print the matches if either file contains any of the words.
        if res1 or res2:
            for n, line in res1 + res2:
                print('{} {} {}'.format(prefix, n, line))
            break

        print('{} {}'.format(prefix, random.choice(errors)))
        db.seek(0)
        cdb.seek(0)
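
The seek(0) calls are needed because a file object is exhausted after one pass; without rewinding, every later search would see no lines. An alternative sketch, assuming the files are small enough to hold in memory, is to read the lines into lists once and search the lists on every query. io.StringIO stands in for the real files here so the example is self-contained, and load_lines is a hypothetical helper.

```python
import io


def load_lines(file_obj):
    """Read all remaining lines, stripped of trailing newlines."""
    return [line.rstrip("\n") for line in file_obj]


# Stand-in for open("Database.txt"); the content is invented for the demo.
fake_db = io.StringIO(u"light travels fast\nsound is slower\n")

first_pass = load_lines(fake_db)
second_pass = load_lines(fake_db)   # file is exhausted, nothing left to read
fake_db.seek(0)                     # rewind to the start
third_pass = load_lines(fake_db)

print(first_pass)    # ['light travels fast', 'sound is slower']
print(second_pass)   # []
print(third_pass)    # ['light travels fast', 'sound is slower']
```

With the lines cached in a list, the search function can be called on every loop iteration without touching the file again.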

2 answers

Answer 1
3 2015-09-02 19:34:58

Answer 2
0 2015-09-03 15:35:29
