Access to Office 2003 files
I want to read Office 2003 files (.doc, .xls and .ppt) in order to extract their text and some metadata (number of words, number of sheets, pictures, template, etc.). I can already do this for Office 2007 documents with the Open XML SDK. However, the extraction will run on a server that can't have applications like Microsoft Office installed, which is why I can't use Office Interop.

I have tried NPOI, but at the moment it only supports .xls files. The other libraries I found are not open source, so I can't use them at work. I also downloaded NPOI Scratchpad, but the code is very "raw", so I can't use it at work either.

Do you have any other ideas for getting the text and metadata out of Office 2003 documents? I'm not a very experienced programmer, and I'm using C# (though if there is a solution to this problem in C++, I could consider using it).

Thanks.
There are many libraries like:
I don't know of any free libraries that support the Office 2003 format.
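Whichever library ends up handling the binary files, a server that receives both kinds of document first has to tell them apart. Here is a minimal sketch of that step only, in C++ since the question allows it. It assumes the well-known magic numbers: Office 2003 files are OLE2 compound files starting with the bytes `D0 CF 11 E0 A1 B1 1A E1`, while Office 2007 files are ZIP archives starting with `PK\x03\x04`. The names `OfficeContainer`, `classify_header` and `classify_file` are made up for this illustration and do not come from any library. The check does not extract text or metadata, and a match only identifies the container (other file types also use OLE2 or ZIP).

```cpp
#include <algorithm>
#include <array>
#include <cstddef>
#include <fstream>
#include <string>

// Container formats that can be told apart from the first bytes of a file.
// (Hypothetical names, chosen for this sketch.)
enum class OfficeContainer { Ole2Binary, ZipOpenXml, Unknown };

// Classify a buffer holding the leading bytes of a file by its magic number.
OfficeContainer classify_header(const unsigned char* data, std::size_t size) {
    // OLE2 compound file signature (.doc/.xls/.ppt from Office 97-2003).
    static const std::array<unsigned char, 8> ole2 = {
        0xD0, 0xCF, 0x11, 0xE0, 0xA1, 0xB1, 0x1A, 0xE1};
    // ZIP local file header signature (.docx/.xlsx/.pptx).
    static const std::array<unsigned char, 4> zip = {0x50, 0x4B, 0x03, 0x04};

    if (size >= ole2.size() && std::equal(ole2.begin(), ole2.end(), data)) {
        return OfficeContainer::Ole2Binary;
    }
    if (size >= zip.size() && std::equal(zip.begin(), zip.end(), data)) {
        return OfficeContainer::ZipOpenXml;
    }
    return OfficeContainer::Unknown;
}

// Read the first 8 bytes of a file and classify them.
// Files that cannot be opened are reported as Unknown.
OfficeContainer classify_file(const std::string& path) {
    std::ifstream in(path, std::ios::binary);
    if (!in) {
        return OfficeContainer::Unknown;
    }
    unsigned char header[8] = {};
    in.read(reinterpret_cast<char*>(header), sizeof header);
    // gcount() is the number of bytes actually read (may be fewer than 8).
    return classify_header(header, static_cast<std::size_t>(in.gcount()));
}
```

Files classified as `ZipOpenXml` can go to the existing Open XML SDK code path, and `Ole2Binary` files to whatever Office 2003 reader is chosen. The same byte comparison ports directly to C#.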
Good luck.