
How do I get docx2txt to process all docx files in a directory?

I'm using the docx2txt module in Python 2.7 and I'm trying to get it to process all of the docx files in one directory. Currently I have docx2txt.process("THE NAME OF THE DOCUMENT.docx").

I want to process all docx files in the current working directory, but I'm not sure how to do that.

I have inserted my code below. It prints out the name of the file and the text in the docx file.

import os
import docx2txt

os.chdir('c:/users/Says/desktop')

files = []

path = 'c:/users/Says/desktop'



my_text = docx2txt.process("test.docx")

for files in os.listdir(path):
    if files.endswith('docx'):
        print(files)
        print(my_text)

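You're halfway there. Your loop already finds the docx files, but docx2txt.process is called only once, on test.docx, before the loop starts, so the same text is printed for every file name.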

Create a list to store all the files that you find:

files = []
for name in os.listdir(path):
    if name.endswith('.docx'):
        # Store the full path so the file can be opened from any working directory.
        files.append(os.path.join(path, name))

Then you can use a for statement to loop through all the files and open them one at a time:

for filename in files:
    text = docx2txt.process(filename)
    # Do something with the text.
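
The two steps above can be folded into one helper. This is only a sketch: the name extract_all and its process parameter are made up for illustration and are not part of docx2txt. It assumes that files whose names start with "~$" are the lock files Word leaves next to an open document and should be skipped, and it matches the extension case-insensitively.

```python
import os


def extract_all(directory, process=None):
    """Return {file name: extracted text} for every .docx file in directory."""
    if process is None:
        # Imported lazily so the helper can be tried with a stand-in callable.
        import docx2txt
        process = docx2txt.process

    texts = {}
    for name in sorted(os.listdir(directory)):
        # Skip anything that is not a .docx file, and Word's "~$" lock files.
        if not name.lower().endswith('.docx') or name.startswith('~$'):
            continue
        # Join with the directory so this also works when it is not the cwd.
        texts[name] = process(os.path.join(directory, name))
    return texts
```

Calling extract_all(path) with no second argument uses docx2txt.process; you can then loop over the returned dictionary and print each file name followed by its text.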

If you want to use the current working directory instead of a hard-coded path, set path to:

path = os.getcwd()
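
As an alternative to os.listdir plus an extension check, the standard glob module can list the docx files in the current working directory directly. This swaps in a different technique from the one described above, and the function name docx_in_cwd is hypothetical. Note that whether the pattern also matches an upper-case ".DOCX" extension depends on the platform.

```python
import glob


def docx_in_cwd():
    """Return the names of the .docx files in the current working directory."""
    # A relative pattern is expanded against the current working directory.
    return sorted(glob.glob('*.docx'))
```

Each returned name is relative to the working directory, so it can be passed straight to docx2txt.process as long as the working directory has not changed in between.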
