
How to use unoconv with a newer version of LibreOffice

I am trying to convert encrypted documents (doc/docx) into PDF using Python.

What I do is:

  • first decrypt them temporarily in a separate folder
  • use the unoconv command line to convert the decrypted file into pdf:

unoconv -f pdf -eSelectPdfVersion=1 [path-to-file]

The conversion runs, but the appearance of doc and docx files changes in both the decrypted file and the PDF. The original encrypted file is not at fault: when I decrypt it from a Windows client, the decrypted file looks exactly as it originally did.

The change is in the document styling, which affects the page count. For example, a 13-page Word document is decrypted into a 14-page Word document and converted into a 14-page PDF. Similarly, a 348-page doc file becomes a 330-page doc file and then a 330-page PDF.

I discovered a slight style incompatibility between Microsoft Word and the version of LibreOffice installed with unoconv (4.3). In my tests, fonts were replaced with LibreOffice-compatible ones that differ slightly in size from the originals.

I installed later versions of LibreOffice (5.1, 5.3), and in my tests the decrypted doc/docx file had the proper formatting and page count. However, unoconv does not use the newer version and sticks to 4.3, so the PDF still comes out with incorrect styling and page count.

I also tried:

soffice --headless --convert-to pdf [path-to-file] --outdir [path-to-export-directory]

But it does nothing.

  1. Is there a way to use unoconv with a LibreOffice version other than 4.3?

  2. Is there a way to make the --convert-to command work with LibreOffice 5.1 or even 5.3?

Here are a few steps you could try. First, uninstall the older version of LibreOffice using

sudo apt remove 'libreoffice*'

Install the latest version of LibreOffice using

sudo add-apt-repository ppa:libreoffice/ppa
sudo apt-get update
sudo apt-get install libreoffice

To check that LibreOffice was installed successfully, run

libreoffice --version

This should print the version number.
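Since the conversion is driven from Python, the same check can be scripted so that an old install is caught before any document is converted. This is only a minimal sketch: the function names are made up for illustration, and it assumes the version output contains a dotted number such as "LibreOffice 6.0.7.3 ...", which may differ between builds.

```python
import re
import subprocess


def parse_version(output: str) -> tuple:
    """Extract the first dotted version number from `libreoffice --version` output.

    Assumption: the output contains something like "LibreOffice 6.0.7.3 ...".
    """
    match = re.search(r"(\d+(?:\.\d+)+)", output)
    if match is None:
        raise ValueError(f"no version number found in: {output!r}")
    return tuple(int(part) for part in match.group(1).split("."))


def installed_version(binary: str = "libreoffice") -> tuple:
    """Run `<binary> --version` and return the parsed version tuple."""
    result = subprocess.run(
        [binary, "--version"], capture_output=True, text=True, check=True
    )
    return parse_version(result.stdout)
```

Calling installed_version() and comparing the result against a tuple such as (5, 1) is enough to refuse to run against the old 4.3 install.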

Next, install the Microsoft fonts using

sudo apt install ttf-mscorefonts-installer

Also install any other fonts that your documents are likely to use, since a substituted font is what shifts the page count.
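To see which fonts are still missing, the list of installed font families can be compared against the ones the documents need. The sketch below assumes fontconfig's fc-list tool is available and that `fc-list : family` prints one line per font with comma-separated family names. The helper names and the list of required fonts are hypothetical.

```python
import subprocess


def installed_families(fc_list_output: str) -> set:
    """Collect lower-cased family names from `fc-list : family` output."""
    families = set()
    for line in fc_list_output.splitlines():
        # A single line may list several aliases separated by commas.
        for name in line.split(","):
            name = name.strip()
            if name:
                families.add(name.lower())
    return families


def missing_fonts(required, fc_list_output: str) -> list:
    """Return the required family names that do not appear in the output."""
    have = installed_families(fc_list_output)
    return [name for name in required if name.lower() not in have]


def query_fontconfig() -> str:
    """Ask fontconfig for installed families (assumes fc-list is on PATH)."""
    result = subprocess.run(
        ["fc-list", ":", "family"], capture_output=True, text=True, check=True
    )
    return result.stdout
```

For example, missing_fonts(["Arial", "Times New Roman"], query_fontconfig()) should come back empty once the Microsoft core fonts are installed.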

Finally, use the command below to convert to PDF. Make sure no LibreOffice application is running in the background.

libreoffice --headless --invisible --convert-to pdf "test.docx" --outdir files

You should find the PDF in the folder called files.
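The same command can be called from Python with subprocess. Because the question reports that soffice can appear to do nothing, this sketch also checks that the PDF was actually written instead of trusting the exit code alone. The function names and the 120-second timeout are illustrative choices, not part of LibreOffice.

```python
import subprocess
from pathlib import Path


def build_convert_command(source, outdir, binary="libreoffice") -> list:
    """Build the headless conversion command shown above as an argument list."""
    return [
        binary, "--headless", "--invisible",
        "--convert-to", "pdf", str(source),
        "--outdir", str(outdir),
    ]


def convert_to_pdf(source, outdir, binary="libreoffice", timeout=120) -> Path:
    """Convert one document to PDF and return the path of the result."""
    source = Path(source)
    outdir = Path(outdir)
    outdir.mkdir(parents=True, exist_ok=True)
    subprocess.run(
        build_convert_command(source, outdir, binary),
        check=True, timeout=timeout,
    )
    # LibreOffice names the output after the input file's stem.
    pdf = outdir / (source.stem + ".pdf")
    if not pdf.exists():
        raise RuntimeError(f"conversion finished but {pdf} was not created")
    return pdf
```

Passing the arguments as a list avoids shell quoting problems with file names that contain spaces.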

This works on Ubuntu 18.04.5 LTS.
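If you would rather keep unoconv, its documentation describes a UNO_PATH environment variable that selects which LibreOffice installation it talks to. A minimal sketch follows, assuming the newer LibreOffice lives under a path such as /opt/libreoffice5.3/program (a hypothetical location, so adjust it to your system) and that unoconv is run with a Python interpreter compatible with that installation's UNO bindings.

```python
import os
import subprocess


def unoconv_env(uno_path: str, base_env=None) -> dict:
    """Return a copy of the environment with UNO_PATH pointing at uno_path."""
    env = dict(os.environ if base_env is None else base_env)
    env["UNO_PATH"] = uno_path
    return env


def convert_with_unoconv(source: str, uno_path: str) -> None:
    """Run unoconv against the LibreOffice installation found at uno_path."""
    subprocess.run(
        ["unoconv", "-f", "pdf", source],
        env=unoconv_env(uno_path),
        check=True,
    )
```

A call would look like convert_with_unoconv("test.docx", "/opt/libreoffice5.3/program"), with the path being the assumption noted above.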
