Nodejs: Convert Doc to PDF

I found some repos that no longer appear to be maintained:

I tried the LibreOffice approach, but the PDF output is so bad that it is unusable (text ends up on different pages, etc.).

If possible, I would like to avoid starting any background processes and/or saving the file on the server. The best solution would be one where I can use buffers. For privacy reasons, I cannot use any external service.

doc buffer -> pdf buffer

Question:

How can I convert docs to PDF in Node.js?

For those who might stumble on this question nowadays:

There is a cool tool called Gotenberg, a Docker-powered stateless API for converting HTML, Markdown and Office documents to PDF. It supports converting DOCs via unoconv.

I happen to be the author of a JS/TS client for Gotenberg: gotenberg-js-client.

You are welcome to use it :)
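For a buffer-in, buffer-out flow, here is a minimal sketch that talks to the Gotenberg HTTP API directly rather than through gotenberg-js-client. It assumes Node.js 18+ (global fetch, FormData and Blob) and a self-hosted Gotenberg 7 or later exposing the /forms/libreoffice/convert route; older releases use a different path, so check the docs for your version. The function names and the localhost URL are made up for illustration.

```javascript
// Assumption: Gotenberg 7+ exposes this LibreOffice route; older releases differ.
const DEFAULT_ROUTE = '/forms/libreoffice/convert';

// Build the URL and multipart body for one conversion request.
function buildConvertRequest(baseUrl, docBuffer, filename, route = DEFAULT_ROUTE) {
  const form = new FormData();
  // The document is uploaded in the "files" form field; the file name
  // (including its extension) travels with it.
  form.append('files', new Blob([docBuffer]), filename);
  return { url: new URL(route, baseUrl).toString(), form };
}

// Send an in-memory Office document to Gotenberg and return the PDF as a Buffer.
async function docBufferToPdfBuffer(baseUrl, docBuffer, filename = 'input.docx') {
  const { url, form } = buildConvertRequest(baseUrl, docBuffer, filename);
  const response = await fetch(url, { method: 'POST', body: form });
  if (!response.ok) {
    throw new Error(`Gotenberg returned HTTP ${response.status}`);
  }
  return Buffer.from(await response.arrayBuffer());
}

// Usage (hypothetical local instance):
// const pdf = await docBufferToPdfBuffer('http://localhost:3000', docBuffer, 'report.docx');
```

Because Gotenberg runs in your own Docker container, nothing leaves your infrastructure, although it is still a separate service that has to be running.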

While building an application, I needed to convert a doc or docx file uploaded by a user into a PDF file for further analysis. I used the npm package libreoffice-convert for this purpose. libreoffice-convert requires LibreOffice to be installed on your Linux machine. Here is the sample code that I used. It is written in JavaScript for a Node.js-based application.

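const libre = require('libreoffice-convert');
const path = require('path');
const fs = require('fs').promises;
const { promisify } = require('util');

// libre.convert is callback-based; wrap it so it can be awaited
const lib_convert = promisify(libre.convert);

async function convert(name = "myresume.docx") {
  try {
    // Base name without the extension (also correct when the name contains several dots)
    const baseName = path.parse(name).name;
    const enterPath = path.join(__dirname, `/public/Resume/${name}`);
    const outputPath = path.join(__dirname, `/public/Resume/${baseName}.pdf`);
    // Read the uploaded file into a buffer
    const data = await fs.readFile(enterPath);
    // Convert the buffer to PDF (no filter is passed)
    const done = await lib_convert(data, '.pdf', undefined);
    // Write the PDF next to the source file
    await fs.writeFile(outputPath, done);
    return { success: true, fileName: baseName };
  } catch (err) {
    console.log(err);
    return { success: false };
  }
}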

You will get a very good quality PDF.
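Since the question asks for doc buffer -> pdf buffer: the convert call above already takes a Buffer and yields a Buffer, so the file read and write are optional. Below is a minimal sketch of a buffer-only wrapper. The function name and the injectable convertFn parameter are my own additions (the latter only so the wrapper can be exercised without LibreOffice installed). As far as I know the package still runs LibreOffice under the hood, so this does not avoid a background process.

```javascript
const { promisify } = require('util');

// Convert a document Buffer to a PDF Buffer without reading or writing files ourselves.
// convertFn is a hypothetical hook for tests; by default the promisified
// libreoffice-convert function is loaded on first use.
async function toPdfBuffer(docBuffer, convertFn) {
  if (!Buffer.isBuffer(docBuffer)) {
    throw new TypeError('docBuffer must be a Buffer');
  }
  const convert = convertFn || promisify(require('libreoffice-convert').convert);
  // Same call as in the sample above: target extension '.pdf', no filter
  return convert(docBuffer, '.pdf', undefined);
}

// Usage, e.g. with a buffer taken from an upload:
// const pdfBuffer = await toPdfBuffer(uploadedFileBuffer);
```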

Belated answer, but you could now try https://www.npmjs.com/package/@nativedocuments/docx-wasm, which we have just released (January 2019).

It performs the conversion locally and doesn't require LibreOffice, unoconv or anything else.

const fs = require('fs');
const docx = require("@nativedocuments/docx-wasm");

// init docx engine
docx.init({
    // ND_DEV_ID: "XXXXXXXXXXXXXXXXXXXXXXXXXX",    // go to https://developers.nativedocuments.com/ to get a dev-id/dev-secret
    // ND_DEV_SECRET: "YYYYYYYYYYYYYYYYYYYYYYYYYY", // you can also set the credentials in the environment variables
    ENVIRONMENT: "NODE", // required
    LAZY_INIT: true      // if set to false the WASM engine will be initialized right now, useful for pre-caching (e.g. for AWS Lambda)
}).catch( function(e) {
    console.error(e);
});

async function convertHelper(document, exportFct) {
    const api = await docx.engine();
    await api.load(document);
    const arrayBuffer = await api[exportFct]();
    await api.close();
    return arrayBuffer;
}

convertHelper("sample.docx", "exportPDF").then((arrayBuffer) => {
    fs.writeFileSync("sample.pdf", new Uint8Array(arrayBuffer));
}).catch((e) => {
    console.error(e);
});

As you can see from the above code, you'll need an API key (freemium model).

To convert a document to PDF, we can use the Universal Office Converter (unoconv) command-line utility.

It can be installed with your OS package manager, e.g. on Ubuntu using apt-get:

sudo apt-get install unoconv

As per the unoconv documentation:

If you installed unoconv by hand, make sure you have the required LibreOffice or OpenOffice packages installed

The following example demonstrates how to invoke the unoconv utility:

unoconv -f pdf sample_document.py

It generates a PDF document that contains the content of sample_document.py.

If you want to do this from a Node.js program, you can invoke the command through a child process.

The code below demonstrates how to use a child process to run unoconv and create a PDF:

const util = require('util');
const exec = util.promisify(require('child_process').exec);

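async function createPDFExample() {
  // Runs unoconv in a shell; the promise rejects if the command exits with an error
  const { stdout, stderr } = await exec('unoconv -f pdf sample.js');
  console.log('stdout:', stdout);
  console.log('stderr:', stderr);
}

// Log failures instead of leaving the rejection unhandled
createPDFExample().catch(console.error);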
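If the file name comes from a user upload, building a shell command string is risky (spaces, quotes, injection). Here is a sketch of the same call using execFile with an argument array, which does not go through a shell. The function names are made up, and it assumes unoconv is on the PATH and that its -o option is used to set the output file.

```javascript
const { execFile } = require('child_process');
const { promisify } = require('util');

const execFileAsync = promisify(execFile);

// Build the argument vector for: unoconv -f pdf -o <outputPath> <inputPath>
function unoconvArgs(inputPath, outputPath) {
  return ['-f', 'pdf', '-o', outputPath, inputPath];
}

// Run unoconv without a shell, so the paths are passed to it verbatim.
async function convertWithUnoconv(inputPath, outputPath) {
  await execFileAsync('unoconv', unoconvArgs(inputPath, outputPath));
  return outputPath;
}

// Usage:
// convertWithUnoconv('my file.docx', 'my file.pdf').catch(console.error);
```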

Posting a slightly modified version for Excel, based on the answer provided by @shubham singh. I tried it and it worked perfectly.

    const fs = require('fs').promises;
    const path = require('path');
    const { promisify } = require('bluebird');
    const libre = require('libreoffice-convert');
    const libreConvert = promisify(libre.convert);

    // await is only valid inside an async function in a CommonJS script
    async function excelToPdf() {
        // directory of the entry script (process.mainModule is deprecated in favour of require.main)
        const workDir = path.dirname(require.main.filename);
        // read excel file
        const data = await fs.readFile(path.join(workDir, 'my_excel.xlsx'));
        // create pdf file from excel
        const pdfFile = await libreConvert(data, '.pdf', undefined);
        // write new pdf file to directory
        await fs.writeFile(path.join(workDir, 'my_pdf.pdf'), pdfFile);
    }

    excelToPdf().catch(console.error);

Docx to pdf: a library that converts a docx file to PDF.

Installation:

npm install docx-pdf --save

Usage:

var docxConverter = require('docx-pdf');

docxConverter('./input.docx', './output.pdf', function (err, result) {
  if (err) {
    console.log(err);
    return; // do not log a result when the conversion failed
  }
  console.log('result' + result);
});

It's basically:

docxConverter(inputPath, outPath, function (err, result) {
  if (err) {
    console.log(err);
    return;
  }
  console.log('result' + result);
});

The output should be output.pdf, which will be produced at the output path you provided.
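If you prefer async/await over callbacks, the converter can be wrapped with util.promisify, since it takes a Node-style (err, result) callback as its last argument. A small sketch follows. makeDocxToPdf is a made-up helper name, and the optional converter argument is only there so the wrapper can be tried without docx-pdf installed.

```javascript
const { promisify } = require('util');

// Wrap a Node-style (err, result) callback converter so it can be awaited.
// converter defaults to the docx-pdf export, loaded on first use; passing
// another function in is a hypothetical hook for tests.
function makeDocxToPdf(converter) {
  const convert = promisify(converter || require('docx-pdf'));
  return (inputPath, outputPath) => convert(inputPath, outputPath);
}

// Usage:
// const docxToPdf = makeDocxToPdf();
// docxToPdf('./input.docx', './output.pdf')
//   .then((result) => console.log('result' + result))
//   .catch(console.error);
```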
