Search for keywords in a file directory

I'm having difficulty finding the filenames that contain a keyword from a predetermined list. I can list all files in the directory and its subdirectories, but the keyword matching does not work. I would like to list all the files in my directory that contain a keyword in their name:

path = ':O drive'
keywords = ['photo', 'passport', 'license']
files = [] # list of all files in directory
result = []  # list store our results

#list all files
for root, directories, file_path in os.walk(path, topdown=False):
        for name in file_path:
            files.append(os.path.join(root, name))

#find keywords in files            
for filename in files:
    for keyword in keywords:
        if keyword in filename:
            result[keyword].os.path.join(path, filename)

print(result)

You could use glob for this, substituting each keyword into the pattern on every loop iteration. This creates a dict of lists. In this example, the script's working directory contains one file named photo.txt and two files named license 2019.txt and license 2020.txt.

from glob import glob
import os
keywords = ['photo','license', 'passport']

path = './'

output = {}
for word in keywords:
    output[word] = glob(os.path.join(path,f'*{word}*'))

Output:

{'photo': ['.\\photo.txt'],
 'license': ['.\\license 2019.txt', '.\\license 2020.txt'],
 'passport': []}
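
A pattern of the form `*{word}*` joined directly onto `path` only looks at entries in `path` itself. Since the question also mentions subdirectories, a recursive variant can be sketched with the `**` pattern and `recursive=True`. The function name `glob_by_keyword` is made up for illustration, and filtering out directories is an added assumption about what the asker wants:

```python
import os
from glob import glob


def glob_by_keyword(path, keywords):
    """Map each keyword to the files under `path` whose name contains it."""
    output = {}
    for word in keywords:
        # With recursive=True, '**' matches zero or more nested directories,
        # so files directly in `path` are included as well.
        pattern = os.path.join(path, '**', f'*{word}*')
        matches = glob(pattern, recursive=True)
        # glob also returns directories whose name matches; keep files only.
        output[word] = sorted(m for m in matches if os.path.isfile(m))
    return output
```

Calling `glob_by_keyword('./', ['photo', 'license', 'passport'])` would return a dict in the same shape as the output above. Whether matching is case-sensitive depends on the platform, and keywords containing glob metacharacters such as `[` or `*` would need `glob.escape` first.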

If you just want a list of filenames that contain any of the keywords in your list, try a list comprehension:

result = [os.path.join(path, f) for f in files if any(k in f.lower() for k in keywords)]

If you want a dictionary, do this instead:

result = dict()
for k in keywords:
    result[k] = [os.path.join(path, f) for f in files if k in f.lower()]
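
Putting the directory walk and the dictionary together, a self-contained version might look like the sketch below. The function name `find_by_keyword` is hypothetical, and matching against the file name only (rather than the full path) is an assumption, chosen so that a keyword appearing in a folder name does not cause every file inside it to match:

```python
import os


def find_by_keyword(root, keywords):
    """Return {keyword: [full paths]} for files under `root` whose name
    contains the keyword, ignoring case."""
    result = {k: [] for k in keywords}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            lowered = name.lower()
            for k in keywords:
                # Compare lowercase to lowercase so 'Photo.JPG' matches 'photo'.
                if k.lower() in lowered:
                    result[k].append(os.path.join(dirpath, name))
    return result
```

For example, `find_by_keyword(path, keywords)` with the `path` and `keywords` from the question would return one list of full paths per keyword, with an empty list for keywords that match nothing.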

You can use a regular expression.

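>>> import re
>>> keywords = ['photo', 'passport', 'license']
>>> # Join the keywords with '|' (alternation); square brackets would build a character class instead
>>> pattern = re.compile('|'.join(map(re.escape, keywords)))
>>> m = pattern.findall('example-photo.jpg')
>>> m
['photo']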

https://docs.python.org/3/library/re.html
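
To apply a keyword regex to many file names, one option is to compile a single alternation of escaped keywords and reuse it. This is only a sketch: the helper names `build_keyword_pattern` and `filter_names` are invented here, and case-insensitive matching is an assumption. Note that whole keywords are matched by alternation (`photo|passport|license`), whereas a bracketed character class would match any single one of the listed characters.

```python
import re


def build_keyword_pattern(keywords):
    """Compile one case-insensitive pattern that matches any keyword."""
    # re.escape keeps characters such as '.' or '+' in a keyword literal.
    return re.compile('|'.join(re.escape(k) for k in keywords), re.IGNORECASE)


def filter_names(names, keywords):
    """Keep the names that contain at least one keyword."""
    pattern = build_keyword_pattern(keywords)
    return [n for n in names if pattern.search(n)]
```

With the keywords from the question, `filter_names(['example-photo.jpg', 'poster.png'], keywords)` keeps only the first name, because `poster.png` contains none of the keywords as a whole.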

import os

keywords = ['photo', 'passport', 'license', 'file']

for dirpath, dirs, files in os.walk(":O drive"):
    for filename in files:
        f = os.path.join(dirpath, filename)
        # Print each path once, even if it matches several keywords
        if any(keyword in f for keyword in keywords):
            try:
                print(f)
            except OSError as e:
                print("Error: %s : %s" % (f, e.strerror))

This is my proposed solution to your problem. It also shows the full path of each matching file, with a try/except in case something unusual happens.
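
One caveat on error handling: by default `os.walk` ignores errors raised while listing a directory (for example a missing root or a folder without read permission), so a try/except inside the loop never sees them. A possible way to surface them is the `onerror` parameter of `os.walk`, which is called with the `OSError` instance. The function name `walk_with_errors` below is hypothetical, and the sketch matches against the file name only, which is an assumption:

```python
import os


def walk_with_errors(root, keywords):
    """Return (matching paths, errors raised while listing directories)."""
    matches, errors = [], []
    # os.walk calls onerror(err) with the OSError instead of silently skipping.
    for dirpath, _dirnames, filenames in os.walk(root, onerror=errors.append):
        for name in filenames:
            if any(k in name for k in keywords):
                matches.append(os.path.join(dirpath, name))
    return matches, errors
```

After a call such as `matches, errors = walk_with_errors(path, keywords)`, an empty `matches` list together with a non-empty `errors` list suggests the path itself is the problem rather than the keyword matching.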
