
How to replace multiple words in .docx file and save the docx file using python-docx

I'm trying to change the content of a .docx file using the python-docx library. The changes consist of replacing words: I have an original word list, ['ABC', 'XYZ'], whose entries must be replaced by the corresponding entries of a revised word list, ['PQR', 'DEF']. I also need to preserve the formatting of these words. Right now, only one of the replacements is saved. Here is my code for reference.

from docx import Document

def replace_string(filename='test.docx'):
    doc = Document(filename)
    list = ['ABC', 'XYZ']
    list2 = ['PQR', 'DEF']
    for p in doc.paragraphs:
        print(p.text)
        for i in range(0, len(list)):
            if list[i] in p.text:
                print('----!!SEARCH FOUND!!------')
                print(list[i])
                print(list2[i])
                print('\n')
                inline = p.runs
                # Loop added to work with runs (strings with same style)
                for i in range(len(inline)):
                    #print(inline[i].text)
                    if list[i] in inline[i].text:
                        print('----SEARCH FOUND!!------')
                        text = inline[i].text.replace(list[i], list2[i])
                        inline[i].text = text
                        print(inline[i].text)
        doc.save('dest1.docx')
    return 1

replace_string()

Original content of test.docx file:

ABC XYZ

Revised content or saved content of dest1.docx file:

PQR XYZ

How can I save all the replacements? The word list may grow; its size is not fixed.

The following code works for me, and it preserves the formatting as well. I hope it helps others.

from docx import Document

def replace_string1(filename='test.docx'):
    doc = Document(filename)
    list = ['ABC', 'XYZ']
    list2 = ['PQR', 'DEF']
    for p in doc.paragraphs:
        inline = p.runs
        # j indexes the runs, i indexes the word lists
        for j in range(0, len(inline)):
            for i in range(0, len(list)):
                inline[j].text = inline[j].text.replace(list[i], list2[i])
                print(p.text)
                print(inline[j].text)
    doc.save('dest1.docx')
    return 1
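
For reference, the code in the question saved only one change because its inner loop reused `i` as the run index, overwriting the word index set by the outer loop. Below is a minimal sketch of the same per-run replacement with separate names. `replace_in_runs` is a hypothetical helper, and the `SimpleNamespace` objects only stand in for python-docx runs, on the assumption that nothing beyond a writable `.text` attribute is needed. A word that is split across two runs will not be matched by this approach.

```python
from types import SimpleNamespace


def replace_in_runs(runs, old_words, new_words):
    """Replace each old word with its paired new word, run by run.

    Hypothetical helper: `runs` is any sequence of objects with a writable
    `.text` attribute, such as `paragraph.runs` in python-docx.
    """
    for run in runs:
        for old, new in zip(old_words, new_words):
            if old in run.text:
                run.text = run.text.replace(old, new)


# Stand-ins for two separately formatted runs of one paragraph.
runs = [SimpleNamespace(text='ABC '), SimpleNamespace(text='XYZ')]
replace_in_runs(runs, ['ABC', 'XYZ'], ['PQR', 'DEF'])
print(''.join(run.text for run in runs))  # PQR DEF
```

With a real document, the call would presumably be `replace_in_runs(p.runs, ...)` for each `p` in `doc.paragraphs`, followed by `doc.save(...)`.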

I implemented a version of JT28's solution that uses a dictionary (instead of two lists) to hold the replacements, which makes it simpler to generate the find/replace pairs. Each key k is the text to look for, and its value v is the replacement substring. The function can replace text in a single paragraph or in all paragraphs, depending on whether the caller iterates over doc.paragraphs.

from docx import Document

# NEW FUNCTION:
def replacer(p, replace_dict):
    """Replace every key of replace_dict with its value in each run of paragraph p."""
    for run in p.runs:
        # Iterate over the dictionary
        for k, v in replace_dict.items():
            if k in run.text:
                run.text = run.text.replace(k, v)
    return p

# Replace Paragraphs
filename = 'test.docx'  # Same input file as in the question
doc = Document(filename)  # Get the file
replacements = {'ABC': 'PQR', 'XYZ': 'DEF'}  # Build the dict
for p in doc.paragraphs:  # If needed, iter over paragraphs
    replacer(p, replacements)  # Call the new replacer function
doc.save('dest1.docx')  # Save the result, as in the question
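
One caveat with replacing pair by pair: if a replacement value is itself a key (for example `{'ABC': 'XYZ', 'XYZ': 'DEF'}`), successive `str.replace` calls chain, and `ABC` ends up as `DEF`. A possible way around this, sketched here with only the standard `re` module, is to perform all substitutions in a single pass. `replace_once` is a hypothetical name and not part of python-docx; its result could be assigned to a run's `.text` inside `replacer`.

```python
import re


def replace_once(text, replace_dict):
    """Apply every replacement in a single pass over `text`.

    Hypothetical helper: each match is substituted exactly once, so a
    replacement value is never re-scanned as a new search key.
    """
    if not replace_dict:
        return text
    # Longest keys first, so that 'ABCD' wins over 'ABC' where both match.
    keys = sorted(replace_dict, key=len, reverse=True)
    pattern = re.compile('|'.join(re.escape(k) for k in keys))
    return pattern.sub(lambda m: replace_dict[m.group(0)], text)


swap = {'ABC': 'XYZ', 'XYZ': 'DEF'}
print(replace_once('ABC XYZ', swap))  # XYZ DEF
```

This sketch still matches substrings rather than whole words, exactly like `str.replace`; wrapping the pattern in `\b` word boundaries would be one way to restrict it if that matters.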

