
Search for a word in Word documents and print the names of the files that contain it?

Hey, I am new to Python and I want to write a script that goes through the docx documents in a large directory and prints the name of each file whose contents include a certain word.

Here is my code so far:

import os
import docx2txt
os.chdir('C:/Users/epicr/Desktop/Python Stuff/LAB FILES')
text= ''
files = []
for file in os.listdir('C:/Users/epicr/Desktop/Python Stuff/LAB FILES'):
    if file.endswith('.docx'):
        files.append(file)
for i in range(len(files)):
        text += docx2txt.process(files[i])
if text == str('VENTILATION RATIO'):
    print (i)

My thought process is to convert all of these docx documents to text, then search that text for the phrase 'VENTILATION RATIO'. If the phrase exists in a file, the name of that file should be printed.

However, the script doesn't print anything. I know for a fact that at least one of the Word documents contains the phrase 'VENTILATION RATIO' (and yes, the match is case sensitive).

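There is a logic issue in your code. The loop appends the text of every file to one string, and text == 'VENTILATION RATIO' is true only when that whole string is exactly the phrase, so the check never passes. The check also runs just once, after the loop, so it could not tell you which file matched.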
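A small illustration of why an equality comparison does not match while a membership test does. The string here is a made-up stand-in for the text extracted from a document, not real output from docx2txt:

```python
# Made-up stand-in for the text extracted from one document.
text = 'Report header\nVENTILATION RATIO: 0.8\nFooter'

# == is True only if the whole text is exactly the phrase.
is_equal = (text == 'VENTILATION RATIO')

# in is True if the phrase occurs anywhere in the text (case sensitive).
is_inside = ('VENTILATION RATIO' in text)

print(is_equal, is_inside)
```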

Try this update, which checks each file separately:

import os
import docx2txt
os.chdir('C:/Users/epicr/Desktop/Python Stuff/LAB FILES')
text= ''
files = []
for file in os.listdir('C:/Users/epicr/Desktop/Python Stuff/LAB FILES'):
    if file.endswith('.docx'):
        files.append(file)
for i, name in enumerate(files):
    text = docx2txt.process(name)  # text of this one file only
    if 'VENTILATION RATIO' in text:  # case-sensitive substring test
        print(i, name)  # file index and name
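The same idea could also be wrapped in a reusable function. This is only a sketch under some assumptions: the name find_docx_containing and its extract parameter are made up for illustration, the only docx2txt call used is docx2txt.process from the code above, and skipping names that start with "~$" assumes you want to ignore the lock files Word leaves next to open documents.

```python
import os


def find_docx_containing(directory, phrase, extract=None):
    """Return the names of .docx files in `directory` whose text contains `phrase`.

    `extract` is a hypothetical hook: any function that takes a file path and
    returns its text. It defaults to docx2txt.process.
    """
    if extract is None:
        # Imported here so the function can also be used with another extractor.
        import docx2txt
        extract = docx2txt.process

    matches = []
    for name in sorted(os.listdir(directory)):
        # Skip non-docx files and Word's "~$..." lock files.
        if not name.lower().endswith('.docx') or name.startswith('~$'):
            continue
        path = os.path.join(directory, name)
        try:
            text = extract(path)
        except Exception as exc:  # e.g. a corrupt or unreadable file
            print(f'Skipping {name}: {exc}')
            continue
        if phrase in text:  # case-sensitive substring test
            matches.append(name)
    return matches
```

Calling find_docx_containing('C:/Users/epicr/Desktop/Python Stuff/LAB FILES', 'VENTILATION RATIO') would then return a list of matching file names to print. Building full paths with os.path.join means os.chdir is no longer needed.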

