Converting lots of Word documents (.doc, .docx, .odt) to PDF documents

I already have a solution for this task:
I use the Word Interop classes, create n Word instances, and let them convert (save as) all my files. Here n is the number of threads, which can be changed to reduce the load on the machine or to increase throughput.
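
For readers comparing this against a Java alternative, the "n workers" pattern described above could look roughly like the sketch below. It is not the asker's code: `BatchConverter`, `convertAll`, and `toPdfName` are hypothetical names, and the converter passed in is a placeholder that only derives the target file name. A real converter (Word automation or another library) would have to be plugged in at that point.

```java
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;

/** Sketch of a fixed-size worker pool for batch conversion (all names are hypothetical). */
final class BatchConverter {

    /**
     * Applies `convert` to every input using at most `workers` threads
     * and returns the results in input order.
     */
    static List<Path> convertAll(List<Path> inputs, int workers, Function<Path, Path> convert) {
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        try {
            List<Future<Path>> pending = new ArrayList<>();
            for (Path input : inputs) {
                Callable<Path> task = () -> convert.apply(input);
                pending.add(pool.submit(task));
            }
            List<Path> outputs = new ArrayList<>();
            for (Future<Path> future : pending) {
                outputs.add(future.get()); // blocks until that file is finished
            }
            return outputs;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("batch interrupted", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("conversion failed", e.getCause());
        } finally {
            pool.shutdown();
        }
    }

    /** Placeholder converter: derives the .pdf name only, performs no real conversion. */
    static Path toPdfName(Path input) {
        String name = input.getFileName().toString();
        int dot = name.lastIndexOf('.');
        String base = dot > 0 ? name.substring(0, dot) : name;
        return input.resolveSibling(base + ".pdf");
    }
}
```

Calling `BatchConverter.convertAll(files, 4, BatchConverter::toPdfName)` would process the list with four worker threads. The pool size plays the role of n in the description above.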

This runs at the following speeds:
100 files -> 15.6 seconds
1,000 files -> 156 s = ~2.5 minutes
10,000 files -> 1,562 s = ~26 minutes
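
As a rough check on these figures: they correspond to about 0.156 s per file (roughly 6.4 files per second) and scale almost linearly with the number of files. The helper below is only an illustrative extrapolation from one measurement. `ThroughputEstimate` and `estimateSeconds` are hypothetical names, and the linear model is an assumption that ignores start-up cost.

```java
/** Linear extrapolation of batch duration from a single measurement (illustrative only). */
final class ThroughputEstimate {

    /** Estimated seconds for `files` documents, given a measured run. */
    static double estimateSeconds(double measuredSeconds, int measuredFiles, int files) {
        double secondsPerFile = measuredSeconds / measuredFiles;
        return secondsPerFile * files;
    }
}
```

With the 100-file measurement, `estimateSeconds(15.6, 100, 10_000)` gives about 1,560 s, close to the 1,562 s reported for 10,000 files.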

As you can see, it's rather slow...

What alternatives could I look into to speed up this process?
It can be in Java or C#.
It must match the conversion accuracy of MS Word.

A faster approach I found for creating Word documents is to use an XSLT stylesheet to transform data from an XML source. I don't have time measurements, but it's much faster than COM Interop.

http://msdn.microsoft.com/en-us/library/ee872374(v=office.12).aspx

http://www.developer.com/xml/article.php/3798066/Take-the-Pain-out-of-Creating-Word-Documents-by-Using-C-and-XML.htm
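
A minimal sketch of the XSLT idea in Java, using only the JDK's built-in `javax.xml.transform` API. The input format (`<letters>`/`<letter>`), the stylesheet, and the class name `XsltWordSketch` are made up for illustration. The output is just the WordprocessingML body markup: it would still have to be packaged into a .docx container, and this approach generates new documents rather than converting existing ones to PDF.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

/** Turns a small XML data file into WordprocessingML paragraphs via XSLT (illustrative). */
final class XsltWordSketch {

    // Hypothetical stylesheet: one Word paragraph per <letter> element.
    static final String STYLESHEET =
          "<xsl:stylesheet version='1.0'"
        + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'"
        + " xmlns:w='http://schemas.openxmlformats.org/wordprocessingml/2006/main'>"
        + "<xsl:output method='xml' omit-xml-declaration='yes'/>"
        + "<xsl:template match='/letters'>"
        + "<w:document><w:body>"
        + "<xsl:for-each select='letter'>"
        + "<w:p><w:r><w:t><xsl:value-of select='.'/></w:t></w:r></w:p>"
        + "</xsl:for-each>"
        + "</w:body></w:document>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    /** Applies the stylesheet to the given XML text and returns the resulting markup. */
    static String transform(String xml) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(STYLESHEET)));
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("XSLT transformation failed", e);
        }
    }
}
```

For example, `XsltWordSketch.transform("<letters><letter>Hello</letter></letters>")` yields a `w:document` element containing one `w:p` paragraph with the text "Hello".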

Also, beware that Office Automation is not a supported scenario for web sites or unattended applications.

http://support.microsoft.com/kb/257757
