Python: How to use list of keywords to search for a string in a text

I'm writing a program that loops through multiple .txt files and searches for any number of pre-specified keywords. I'm having trouble finding a way to pass the list of keywords to the search.

The code below currently returns the following error:

TypeError: 'in <string>' requires string as left operand, not list

I'm aware that the error is caused by the keyword list, but I have no idea how to pass in a large array of keywords without triggering this error.

Current code:

from os import listdir

keywords=['Example', 'Use', 'Of', 'Keywords']
 
with open("/home/user/folder/project/result.txt", "w") as f:
    for filename in listdir("/home/user/folder/project/data"):
        with open('/home/user/folder/project/data/' + filename) as currentFile:
            text = currentFile.read()
            #Error Below
            if (keywords in text):
                f.write('Keyword found in ' + filename[:-4] + '\n')
            else:
                f.write('No keyword in ' + filename[:-4] + '\n')

The error occurs on line 10 of the code above, just under the comment. I'm unsure why I can't use a list directly to search for the keywords. Any help is appreciated, thanks!

Try looping through the list to check whether each element is in the text:

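for keyword in keywords:
    if keyword in text:
        f.write('Keyword found in ' + filename[:-4] + '\n')
        break
else:
    # for-else: this branch runs only if the loop ended without a break,
    # i.e. after every keyword was checked and none was found
    f.write('No keyword in ' + filename[:-4] + '\n')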

You cannot use in to check whether a list is in a string; the left operand must itself be a string.
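
To illustrate the point above, here is a minimal sketch that tests each keyword on its own and also reports which ones matched. The helper name `matching_keywords` and the sample text are made up for illustration and are not from the original post; the check is a plain case-sensitive substring test.

```python
def matching_keywords(text, keywords):
    """Return the keywords that occur as substrings of text (case-sensitive)."""
    return [kw for kw in keywords if kw in text]

keywords = ['Example', 'Use', 'Of', 'Keywords']
sample = 'Use this as an Example.'  # made-up sample text

found = matching_keywords(sample, keywords)
print(found)

try:
    keywords in sample  # reproduces the error from the question
except TypeError as exc:
    print('TypeError:', exc)
```

An empty result list means no keyword was found, so `if matching_keywords(text, keywords):` could stand in for the failing condition.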

I would use regular expressions, as they are purpose-built for searching text for substrings.

You only need the re.search block. I added examples of findall and finditer to demystify them.

# let's pretend these 4 sentences in `text` are 4 different files
text = '''Lorem Ipsum is simply dummy text of the printing and typesetting industry. Lorem Ipsum has been the industry's standard dummy text ever since the 1500s, when an unknown printer took a galley of type and scrambled it to make a type specimen book. It has survived not only five centuries, but also the leap into electronic typesetting, remaining essentially unchanged. It was popularised in the 1960s with the release of Letraset sheets containing Lorem Ipsum passages, and more recently with desktop publishing software like Aldus PageMaker including versions of Lorem Ipsum'''.split(sep='. ')

import re

# add more keywords as needed
keywords = [r'publishing', r'industry']
regex = '|'.join(keywords)  # alternation: matches any one of the keywords
for t in text:
    lst = re.findall(regex, t, re.I)  # re.I makes the match case-insensitive
    for el in lst:
        print(el)

    iterator = re.finditer(regex, t, re.I)
    for el in iterator:
        print(el.span())

    if re.search(regex, t, re.I):
        print('Keyword found in `' + t + '`\n')
    else:
        print('No keyword in `' + t + '`\n')

Output:

industry
(65, 73)
Keyword found in `Lorem Ipsum is simply dummy text of the printing and typesetting industry`

industry
(25, 33)
Keyword found in `Lorem Ipsum has been the industry's standard dummy text ever since the 1500s, when an unknown printer took a galley of type and scrambled it to make a type specimen book`

No keyword in `It has survived not only five centuries, but also the leap into electronic typesetting, remaining essentially unchanged`

publishing
(132, 142)
Keyword found in `It was popularised in the 1960s with the release of Letraset sheets containing Lorem Ipsum passages, and more recently with desktop publishing software like Aldus PageMaker including versions of Lorem Ipsum`
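
One caveat with joining keywords using `|` is that any regex metacharacters inside a keyword (such as `.`) are treated as pattern syntax. A possible refinement, sketched here under the assumption that whole-word, case-insensitive matching is wanted, is to escape each keyword with `re.escape` and wrap the alternation in `\b` word boundaries. The function name `build_keyword_pattern` and the keyword `v1.0` are hypothetical and not part of the answer above.

```python
import re

def build_keyword_pattern(keywords):
    """Compile one case-insensitive pattern matching any keyword as a whole word."""
    escaped = (re.escape(kw) for kw in keywords)
    # (?:...) groups the alternatives so both \b anchors apply to every keyword
    return re.compile(r'\b(?:' + '|'.join(escaped) + r')\b', re.IGNORECASE)

pattern = build_keyword_pattern(['publishing', 'industry', 'v1.0'])

for sentence in ['Desktop Publishing software', 'released as v1.0 today', 'released as v1x0']:
    if pattern.search(sentence):
        print('Keyword found in `' + sentence + '`')
    else:
        print('No keyword in `' + sentence + '`')
```

Note that `\b` only behaves as expected when a keyword starts and ends with a word character (letter, digit, or underscore); keywords that begin or end with punctuation would need a different boundary check.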

You could replace

if (keywords in text):
   ...

with

if any(keyword in text for keyword in keywords):
   ...
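
For context, here is a rough sketch of how the `any(...)` test might fit into the file loop from the question. The function name `scan_folder`, the use of `pathlib`, and the temporary files in the self-check are assumptions made for illustration; the question itself uses `os.listdir` and fixed paths.

```python
import tempfile
from pathlib import Path

def scan_folder(data_dir, result_path, keywords):
    """Write one line per .txt file in data_dir saying whether any keyword occurs."""
    with open(result_path, 'w') as out:
        for path in sorted(Path(data_dir).glob('*.txt')):
            text = path.read_text()
            # any() stops at the first keyword that is found in the text
            if any(keyword in text for keyword in keywords):
                out.write('Keyword found in ' + path.stem + '\n')
            else:
                out.write('No keyword in ' + path.stem + '\n')

# Small self-check in a temporary directory (file names and contents are made up).
with tempfile.TemporaryDirectory() as tmp:
    data = Path(tmp) / 'data'
    data.mkdir()
    (data / 'a.txt').write_text('An Example of text')
    (data / 'b.txt').write_text('nothing relevant')
    result = Path(tmp) / 'result.txt'
    scan_folder(data, result, ['Example', 'Use'])
    report = result.read_text()

print(report)
```

With the paths from the question, the call would look like `scan_folder('/home/user/folder/project/data', '/home/user/folder/project/result.txt', keywords)`. `path.stem` plays the role of `filename[:-4]` by dropping the `.txt` extension.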
