How to search a text file for a specific word in Python

I want to find words in a text file that match words stored in an existing list called items. The list is created in a previous function, and I want to use it in the next function as well, but I'm unsure how to do that. I tried using a class for this, but I couldn't get it right. I also can't figure out what is wrong with the rest of the code. I tried running it without the class and the list, replacing the list 'items[]' in line 8 with a word from the text file being opened, and it still didn't do anything, even though no errors came up. When the code below is run, it prints "Please entre a valid textfile name: " and stops there.

class searchtext():
    textfile = input("Please entre a valid textfile name: ")
    items = []

    def __init__search(self):
        with open("textfile") as openfile:
            for line in openfile:
                for part in line.split():
                    if ("items[]=") in part:
                        print (part)
                    else:
                        print("not found") 

The list is created from another text file of words by an earlier function, which looks like this and works as it should, in case it helps:

def createlist():
    items = []
    with open('words.txt') as input:
        for line in input:
            items.extend(line.strip().split(','))
    return items

print(createlist())

You can use a regular expression in the following way:

    >>> import re
    >>> words=['car','red','woman','day','boston']
    >>> word_exp='|'.join(words)
    >>> re.findall(word_exp,'the red car driven by the woman',re.M)
    ['red', 'car', 'woman']

The word_exp assignment joins the acceptable words into a single pattern, separated by "|". To run this on a file, just replace the string 'the red car driven by the woman' with open(your_file,'r').read().
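One caveat with a plain "|" join is that it matches substrings, so 'car' would also be found inside 'scarf', and a word containing regex metacharacters would be misread as a pattern. A minimal sketch of one way to tighten this, assuming whole-word matches are what is wanted and that the word list is non-empty (the helper names find_words and search_file are made up for illustration):

```python
import re

def find_words(text, words):
    """Return every whole-word occurrence of any entry of `words` in `text`."""
    # re.escape protects entries that contain regex metacharacters;
    # the \b anchors keep 'car' from matching inside 'scarf'.
    pattern = r'\b(?:' + '|'.join(re.escape(w) for w in words) + r')\b'
    return re.findall(pattern, text)

def search_file(path, words):
    """Apply find_words to the full contents of the file at `path`."""
    with open(path, 'r') as f:
        return find_words(f.read(), words)

words = ['car', 'red', 'woman', 'day', 'boston']
print(find_words('the red car driven by the woman', words))
```

Matches come back in the order they appear in the text, just as with the plain join.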

This may be a bit cleaner. I feel a class is overkill here.

def createlist():
    items = []
    with open('words.txt') as input:
        for line in input:
            items.extend(line.strip().split(','))
    return items

print(createlist())
# store the list
word_list = createlist()

with open('file.txt') as f:
    # split the file content into words (split() with no argument splits on any whitespace, including newlines)
    for word in f.read().split():
        # check if each word is in the list
        if word in word_list:
            # do something with word
            print(word + " is in the list")
        else:
            # word not in list
            print(word + " is NOT in the list")
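On the other half of the question, sharing the list between functions: the usual approach is to return it from one function and pass it as an argument to the next. The original class also never runs its search, because __init__search is not the special __init__ method and is never called, open("textfile") opens a file literally named textfile rather than the name that was typed in, and "items[]=" is just a literal string. A rough sketch of the function-based version, where create_list, search_words, and search_file are illustrative names and the set lookup is my own choice rather than anything from the answer:

```python
def create_list(path):
    """Read comma-separated words from the file at `path` into a list."""
    items = []
    with open(path) as source:
        for line in source:
            # skip empty entries such as the one produced by a blank line
            items.extend(w.strip() for w in line.split(',') if w.strip())
    return items

def search_words(text, items):
    """Return (word, found) pairs for each whitespace-separated word in `text`."""
    wanted = set(items)  # set membership tests are fast for long lists
    return [(word, word in wanted) for word in text.split()]

def search_file(path, items):
    """Run search_words over the contents of the file at `path`."""
    with open(path) as text_file:
        return search_words(text_file.read(), items)
```

A caller would then do something like items = create_list('words.txt') followed by search_file(input("Please enter a valid text file name: "), items), so the list travels as an ordinary argument and no class is needed.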

Nothing beats regular expressions for matching: https://docs.python.org/3/howto/regex.html

import re

items = ['one', 'two', 'three', 'four', 'five']  # your items list created previously
with open('text.txt', 'r') as file:  # load your file; it is closed automatically afterwards
    content = file.read()  # read once so every search scans the text from the beginning
for i in items:
    lis=re.findall(i,content)
    if len(lis)==0:
        print('Not found')
    elif len(lis)==1:
        print('Found Once')
    elif len(lis)==2:
        print('Found Twice')
    else:
        print('Found',len(lis),'times')
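Note that re.findall(i, content) counts substring hits, so 'one' is also counted inside 'money', and the text is scanned once per item. If whole-word counts are what you are after, one possible alternative is to tokenize the text once and look each item up in a collections.Counter. This is only a sketch: count_items is a made-up helper name, and it assumes the items consist solely of word characters (letters, digits, underscore).

```python
import re
from collections import Counter

def count_items(text, items):
    """Map each item to the number of times it occurs as a whole word in `text`."""
    # \w+ pulls out runs of word characters, so punctuation is dropped
    counts = Counter(re.findall(r'\w+', text))
    # a Counter returns 0 for words it has never seen
    return {item: counts[item] for item in items}

print(count_items('one two one money', ['one', 'two', 'six']))
```

The resulting dictionary can then drive the same Not found / Found Once / Found Twice reporting as in the loop above.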
