How to convert from docx to pdf with a python function (WINDOWS)?

I'm working on an Azure Function written in Python that converts docx files to PDF. I send the file as base64 from Postman and write it to disk as a docx, which works so far. But when the function tries to convert the docx to PDF, an error is raised. I suspect this is because Office is not installed in my environment. How can I fix this without installing Office? Thanks.

import sys
import os
import comtypes.client
import pythoncom
import uuid
import requests
from docx import Document
import base64
from os import listdir
from os.path import isfile, join
import azure.functions as func

def main(req: func.HttpRequest) -> func.HttpResponse:
  bytesDoc = req.get_json()['base']

  path = '/users/echornet/pruebas/'
  newFile = open(path + 'prueba.docx','wb')
  newFile.write(base64.b64decode(bytesDoc))

  newFile.close()
  wdFormatPDF = 17

  out_file = path + 'prueba.pdf'
  word = comtypes.client.CreateObject('Word.Application')

  doc = word.Documents.Open(newFile)
  doc.SaveAs(out_file, FileFormat=wdFormatPDF)
  doc.Close()

This is the error I'm getting. I get the docx created from base64, but no conversion.

System.Private.CoreLib: Exception while executing function: Functions.FunConverter.
System.Private.CoreLib: Result: Failure
Exception: AttributeError: module 'comtypes.gen.Word' has no attribute '_Application'
Stack:
  File "C:\\PruebaFunction\\ConvEnv\\lib\\site-packages\\azure\\functions_worker\\dispatcher.py", line 288, in _handle__invocation_request
    self.__run_sync_func, invocation_id, fi.func, args)
  File "C:\\Users\\echornet\\AppData\\Local\\Programs\\Python\\Python36\\lib\\concurrent\\futures\\thread.py", line 55, in run
    result = self.fn(*self.args, **self.kwargs)
  File "C:\\PruebaFunction\\ConvEnv\\lib\\site-packages\\azure\\functions_worker\\dispatcher.py", line 347, in __run_sync_func
    return func(**params)
  File "C:\\PruebaFunction\\FunConverter\\__init__.py", line 32, in main
    word = comtypes.client.CreateObject('Word.Application')
  File "C:\\PruebaFunction\\ConvEnv\\lib\\site-packages\\comtypes\\client\\__init__.py", line 250, in CreateObject
    return _manage(obj, clsid, interface=interface)
  File "C:\\PruebaFunction\\ConvEnv\\lib\\site-packages\\comtypes\\client\\__init__.py", line 188, in _manage
    obj = GetBestInterface(obj)
  File "C:\\PruebaFunction\\ConvEnv\\lib\\site-packages\\comtypes\\client\\__init__.py", line 112, in GetBestInterface
    interface = getattr(mod, itf_name)
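
One way to test the suspicion that Word is missing is to look for the Word.Application ProgID in the Windows registry. The following is only a sketch: the function name com_progid_registered is made up for illustration, and it assumes that a registered ProgID under HKEY_CLASSES_ROOT is a good enough signal that the COM server is installed.

```python
import sys


def com_progid_registered(progid="Word.Application"):
    """Return True if the COM ProgID is registered on this Windows machine."""
    if sys.platform != "win32":
        return False  # COM automation only exists on Windows
    import winreg  # imported lazily so the module loads on other platforms
    try:
        # ProgIDs of installed COM servers appear as keys under HKEY_CLASSES_ROOT
        with winreg.OpenKey(winreg.HKEY_CLASSES_ROOT, progid):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    print("Word registered:", com_progid_registered())
```

If this reports False, no COM-based approach can work on that machine. If it reports True, the failure probably lies somewhere other than a missing Office install, since the traceback above fails inside GetBestInterface, after CreateObject has already obtained an object.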

You can try the win32com library (part of pywin32) for this:

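# -*- encoding: utf-8 -*-
import os
from win32com import client  # pip install pywin32

WD_FORMAT_PDF = 17  # value of Word's wdFormatPDF constant


def doc2pdf(doc_name, pdf_name):
    """
    Convert a Word document to PDF. Requires Microsoft Word on the machine.
    :param doc_name: path of the Word file
    :param pdf_name: path of the PDF file to write
    :return: pdf_name on success, 1 on failure
    """
    word = None
    try:
        word = client.DispatchEx("Word.Application")
        if os.path.exists(pdf_name):
            os.remove(pdf_name)
        # Pass absolute paths: Word resolves relative paths against its own working directory
        worddoc = word.Documents.Open(os.path.abspath(doc_name), ReadOnly=1)
        worddoc.SaveAs(os.path.abspath(pdf_name), FileFormat=WD_FORMAT_PDF)
        worddoc.Close()
        return pdf_name
    except Exception:
        return 1
    finally:
        if word is not None:
            word.Quit()  # do not leave a WINWORD.EXE process running


if __name__ == '__main__':
    doc_name = "f:/test.doc"
    pdf_name = "f:/test.pdf"
    doc2pdf(doc_name, pdf_name)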
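
Note that the code in the question passes the file object newFile to Documents.Open, whereas Word expects a path string. A minimal sketch of the decoding step that returns a path instead is shown below; the helper names save_base64_docx and pdf_path_for are hypothetical, and writing to a temporary directory is an assumption rather than something the question requires.

```python
import base64
import os
import tempfile


def save_base64_docx(b64_payload, directory=None, name="input.docx"):
    """Decode a base64 string, write it to disk and return the absolute path."""
    directory = directory or tempfile.mkdtemp(prefix="docx2pdf_")
    docx_path = os.path.abspath(os.path.join(directory, name))
    with open(docx_path, "wb") as fh:  # the with block closes the file for us
        fh.write(base64.b64decode(b64_payload))
    return docx_path  # a str path, not a file object


def pdf_path_for(docx_path):
    """Return the path of a PDF placed next to docx_path."""
    return os.path.splitext(docx_path)[0] + ".pdf"
```

With these, the conversion call would look roughly like doc2pdf(docx_path, pdf_path_for(docx_path)), where docx_path is the value returned by save_base64_docx.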

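I have similar code in a scheduled Word-to-PDF conversion Python script running on Windows Server 2016. It works fine when I run the script from the command line, but when it runs as a scheduled task the PDF conversion does not happen and no error is reported.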

You can use the Python library docx2pdf, which uses win32com internally: https://github.com/AlJohri/docx2pdf

Install:

pip install docx2pdf

Usage:

from docx2pdf import convert
convert("input.docx", "output.pdf")

As you mentioned, this approach does require having Microsoft Office installed.

Disclaimer: I wrote this library and command line tool.
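
All of the approaches above automate Microsoft Word, so none of them covers the "without Office" part of the question. One option that is not mentioned in this thread is LibreOffice's headless mode, which can convert a docx with the soffice command line. The sketch below assumes LibreOffice is installed and soffice is on PATH (on Windows you may need to pass the full path to soffice.exe); whether it can be installed in a given Azure Functions environment is also an assumption. The function names are made up for illustration.

```python
import os
import shutil
import subprocess


def libreoffice_command(docx_path, out_dir, soffice="soffice"):
    """Build the argument list for a headless LibreOffice docx -> pdf conversion."""
    return [soffice, "--headless", "--convert-to", "pdf",
            "--outdir", out_dir, docx_path]


def convert_with_libreoffice(docx_path, out_dir, soffice="soffice", timeout=120):
    """Run the conversion and return the path where the PDF is expected."""
    if shutil.which(soffice) is None:
        raise FileNotFoundError(f"{soffice!r} was not found; is LibreOffice installed?")
    # check=True raises CalledProcessError if soffice exits with a non-zero status
    subprocess.run(libreoffice_command(docx_path, out_dir, soffice),
                   check=True, timeout=timeout)
    base = os.path.splitext(os.path.basename(docx_path))[0]
    return os.path.join(out_dir, base + ".pdf")
```

LibreOffice's rendering of a docx can differ from Word's, so the output is worth checking on a few representative documents before relying on it.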
