
How to convert docx to pdf on Mac OS with Python?

I've looked up several SO and other web pages but I haven't found anything that works.

The script I wrote opens a docx, changes some words, and then saves it in a certain folder as a docx. However, I want it to save the result as a PDF, and I don't know how.

This is an example of the code I'm working with:

from docx import Document  # provided by the python-docx package

# Opening the original document
doc = Document('./myDocument.docx')

# Some code which changes the doc

# Saving the changed doc as a docx
doc.save('/my/folder/myChangedDocument.docx')

The things I tried to do for it to save as a pdf:

from docx2pdf import convert

# This runs after the file was saved as a docx
convert('/my/folder/myChangedDocument.docx', '/my/folder/myChangedDocument.pdf')

But it says that Word needs permission to open the saved file and I have to select the file to give it the permission. After that, it just says:

0%|          | 0/1 [00:03<?, ?it/s]
{'input': '/my/folder/contractsomeVariable.docx', 'output': '/my/folder/contractsomeVariable.pdf', 'result': 'error', 'error': 'Error: An error has occurred.'}

I also tried simply putting .pdf instead of .docx after the document name when saving, but that didn't work either, since the docx module can't write PDFs.

So does someone know how I can save a docx as a pdf using Python?

A simple way is to use LibreOffice, which can convert documents from the command line without opening a window.

Ref: https://www.libreoffice.org/get-help/install-howto/macos/

A sample script:

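import re
import subprocess

# Assumption: default install location on macOS; adjust to your setup.
LIBREOFFICE_BINARY_PATH = '/Applications/LibreOffice.app/Contents/MacOS/soffice'


def convert_word_to_pdf_local(folder, source, timeout=None):
    # Fail early if the LibreOffice binary cannot be run
    if not check_libreoffice_exists():
        raise Exception('Libreoffice not found')

    args = [
        LIBREOFFICE_BINARY_PATH,
        '--headless',
        '--convert-to',
        'pdf',
        '--outdir',
        folder,
        source,
    ]
    process = subprocess.run(
        args, stdout=subprocess.PIPE, stderr=subprocess.PIPE, timeout=timeout
    )
    # On success LibreOffice prints "convert <source> -> <output> using filter ..."
    filename = re.search('-> (.*?) using filter', process.stdout.decode())

    if filename is None:
        raise Exception('Libreoffice is not working')

    # Return the generated PDF as an open binary file; the caller must close it
    return open(filename.group(1), 'rb')


def check_libreoffice_exists():
    # Exit status 0 means the binary ran and reported its version
    try:
        result = subprocess.run(
            [LIBREOFFICE_BINARY_PATH, '--version'],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
    except OSError:
        # The binary is missing or cannot be executed
        return False
    return result.returncode == 0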
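To relate this to the paths in the question, here is a minimal sketch of how the LibreOffice command line is assembled and where the PDF is expected to end up. The helper names `build_convert_command` and `expected_pdf_path` are hypothetical, and the `soffice` path is an assumed default for a macOS install, so treat it as a starting point and not a definitive setup.

```python
from pathlib import Path

# Assumption: default LibreOffice location on macOS; adjust if yours differs.
SOFFICE = "/Applications/LibreOffice.app/Contents/MacOS/soffice"


def build_convert_command(source, outdir, soffice=SOFFICE):
    """Return the argument list for a headless docx -> pdf conversion."""
    return [
        soffice,
        "--headless",
        "--convert-to",
        "pdf",
        "--outdir",
        str(outdir),
        str(source),
    ]


def expected_pdf_path(source, outdir):
    """The converted file keeps the base name and gets a .pdf extension."""
    return Path(outdir) / (Path(source).stem + ".pdf")


cmd = build_convert_command("/my/folder/myChangedDocument.docx", "/my/folder")
pdf = expected_pdf_path("/my/folder/myChangedDocument.docx", "/my/folder")
print(cmd)
print(pdf)
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` right after `doc.save(...)`. Since this route does not go through Word, the permission prompt from the question should not come up.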

You can use docx2pdf by making the changes first and then converting.

Use pip to install it on a Mac (I'm guessing you already have it, but it's still good to include).

pip install docx2pdf

Once docx2pdf is installed, put the path of your docx file in inputFile and the path of an empty .pdf file in outputFile.

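from docx2pdf import convert

inputFile = "document.docx"
outputFile = "document2.pdf"

# Create the empty PDF first so the output file already exists
file = open(outputFile, "w")
file.close()

# Convert the docx into the prepared PDF file
convert(inputFile, outputFile)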
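The same steps can be wrapped into a small helper that derives the PDF path from the docx path and creates the empty placeholder before converting. This is only a sketch: `pdf_target_for` and `save_as_pdf` are hypothetical names, and whether the placeholder avoids the Word permission prompt is what this answer suggests, not something verified here.

```python
import tempfile
from pathlib import Path


def pdf_target_for(docx_path):
    """Return the .pdf path next to the given .docx and make sure it exists."""
    target = Path(docx_path).with_suffix(".pdf")
    target.touch(exist_ok=True)  # empty placeholder, as in the answer above
    return target


def save_as_pdf(docx_path):
    """Convert an already saved docx to a PDF with the same base name."""
    # Imported lazily so pdf_target_for works without docx2pdf installed.
    from docx2pdf import convert

    target = pdf_target_for(docx_path)
    convert(str(docx_path), str(target))
    return target


# Small demonstration of the placeholder step in a throwaway directory
with tempfile.TemporaryDirectory() as tmp:
    demo = pdf_target_for(Path(tmp) / "document.docx")
    demo_name, demo_size = demo.name, demo.stat().st_size
print(demo_name, demo_size)
```

In the script from the question, `save_as_pdf('/my/folder/myChangedDocument.docx')` would be called after `doc.save(...)`.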
