Is it at all possible to convert a document to PDF or edit a PDF in C# using only free software?

I had this stupid idea of creating a template as a .docx, .rtf, or .pdf and then replacing the text in that document to generate reports. This seemed like a better way of doing it than using paid reporting software.

Well, I believe I've tried just about everything now, and I'm amazed at how impossible it is to do anything with PDFs.

Try 1

HTML -> PDF

The template is a lot harder to design, and it doesn't look the same when you print it. I never got it working outside of a command-line example (I'm not sure how well, say, iTextSharp-LGPL would even work, or whether it could handle base64 strings; I don't know how else you would tell it about images). In any case, doing it this way makes it too hard to design the template.

Try 2

OpenXml -> PDF

I stupidly assumed that because Word could save as PDF, OpenXml could too. I was wrong. It cannot save as a PDF.

Try 3

OpenOffice/LibreOffice (.docx -> PDF)

It can't read OpenXml, which is a problem because I was editing the template with OpenXml and then saving the result (as a .docx), and it can't read that saved document.

Try 4

iTextSharp LGPL

This one just doesn't work, lol. When you google "convert rtf to pdf", the ONLY thing that comes up is iText and its derivatives, yet apparently it doesn't convert RTF documents to PDF documents. I verified this myself (it only saves the text, not the formatting) and later found this post, which convinced me I wasn't doing something wrong.

Try 5

PDF -> PDF

Since converting ANYTHING to a PDF seems to be impossible, maybe I can save the template as a PDF and just do a text replace on that. Nope, lol, that is apparently a very difficult thing to do.

Try 6

Pandoc (.odt/.docx -> .pdf), (.rtf -> .pdf not supported)

pandoc mockup2.odt -s -o mockup2.pdf

Link to the files in the picture. *Note: it messes up in the same way if you try converting .odt/.docx to .tex.

What do I do here? Buy software so that I can save a file as PDF? Is that the only option?

I have a solution. I'm not saying it's the best solution. LibreOffice (or possibly OpenOffice if you are so inclined) accepts command-line arguments that will do the conversion.

soffice.exe --headless --convert-to pdf mockup.odt

*Note: this is after I added LibreOffice to my path ( C:\Program Files\LibreOffice\program ). I don't know why it's called soffice.exe instead of libreoffice.exe.
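Since the question is about C#, the command above can be launched from code with System.Diagnostics.Process. What follows is only a minimal sketch, not a tested production wrapper. It assumes .NET 6 or later and that soffice.exe is on the PATH as described in the note. The LibreOfficeConverter class, its method names, and the timeout value are made up for illustration, and I added the --outdir option so the location of the resulting PDF is predictable.

```csharp
using System;
using System.Diagnostics;
using System.IO;

// Hypothetical helper class; the names are illustrative, not from any library.
public static class LibreOfficeConverter
{
    // Same arguments as the command line above, plus --outdir.
    public static string[] BuildArguments(string inputPath, string outputDir) =>
        new[] { "--headless", "--convert-to", "pdf", "--outdir", outputDir, inputPath };

    // Runs soffice and returns the path where the PDF is expected to be.
    // Assumption: "soffice.exe" resolves via PATH; pass "soffice" or a full path otherwise.
    public static string ConvertToPdf(string inputPath, string outputDir,
                                      string sofficePath = "soffice.exe",
                                      int timeoutMs = 60_000)
    {
        var startInfo = new ProcessStartInfo
        {
            FileName = sofficePath,
            UseShellExecute = false,
            CreateNoWindow = true,
        };
        // ArgumentList takes care of quoting paths that contain spaces.
        foreach (var arg in BuildArguments(inputPath, outputDir))
        {
            startInfo.ArgumentList.Add(arg);
        }

        using var process = Process.Start(startInfo)
            ?? throw new InvalidOperationException("Could not start soffice.");

        if (!process.WaitForExit(timeoutMs))
        {
            process.Kill();
            throw new TimeoutException("soffice did not finish in time.");
        }
        if (process.ExitCode != 0)
        {
            throw new InvalidOperationException(
                $"soffice exited with code {process.ExitCode}.");
        }

        // soffice names the output after the input file, with a .pdf extension.
        var pdfPath = Path.Combine(
            outputDir, Path.GetFileNameWithoutExtension(inputPath) + ".pdf");
        if (!File.Exists(pdfPath))
        {
            throw new FileNotFoundException("No PDF was produced.", pdfPath);
        }
        return pdfPath;
    }
}
```

Calling LibreOfficeConverter.ConvertToPdf("mockup.odt", "out") should be equivalent to running the command by hand with --outdir out. The file-exists check is there because I would not rely on the exit code alone to tell whether the conversion worked.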

I might have a working solution for you, if you are stuck with a .docx file for the template. I found one free solution for .docx to PDF conversion that does not use Microsoft.Office.Interop, etc.: see the first answer in this Stack Overflow post.

It uses two tools: the Open XML PowerTools and DinkToPdf (which is essentially a wkhtmltopdf wrapper). The HTML to PDF part works just fine, but the .docx to HTML part looks like a catastrophe at first. You can fix this with custom CSS (there are some resources online).

Powertools-.NetStandard

DinkToPdf-GitHub

There are more possibilities with proprietary software, like Aspose.Words and Syncfusion File Formats. Most of the proprietary solutions are pretty expensive...

If you are only working in a Windows environment where MS Office is installed, you can use Microsoft.Office.Interop. It is by far the easiest solution (Interop is mentioned several times in this post: Stackoverflow Word to PDF).

If you found another (better) working solution, please let me know. I still have not decided if I will use a proprietary or a free solution. :-)
