Write to a large docx file without loading it into memory, using C# and the Open XML SDK

I have a large Word docx file (more than 100 MB) that contains a table, and there is a requirement to add additional data to this table. I am using the following approach:

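using System.Linq;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Wordprocessing;

// data holds the new rows: one sequence of cell strings per row.
using (WordprocessingDocument wordDocument = WordprocessingDocument.Open(filepath, true))
{
    // Accessing Document.Body builds the DOM for the main document part.
    var table = wordDocument.MainDocumentPart.Document.Body.Elements<Table>().First();
    foreach (var row in data)
    {
        var tRow = new TableRow();

        foreach (var cell in row)
        {
            // Each cell wraps its text in a paragraph and a run.
            var hCell = new TableCell();
            var cPar = new Paragraph(new Run(new Text(cell)));
            hCell.Append(cPar);
            tRow.Append(hCell);
        }

        table.Append(tRow);
    }
}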

It seems that the whole document is loaded into memory. Is there any way to write to the file without loading the whole DOM structure, for example with a SAX-style approach?
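For context, a "SAX-style" pass here means reading the main document part forward-only and writing it back out node by node, inserting the new rows when the end of the first table is reached. The following is only a rough sketch of that idea and rests on several assumptions. It does not use the Open XML SDK at all; it swaps in `ZipArchive`, `XmlReader` and `XmlWriter` from the .NET base class library. It writes a new file instead of editing in place. The names `DocxRowAppender`, `AppendRows`, `CopyWithRows`, `WriteRows` and `CopyNode` are hypothetical. It has not been measured against a 100 MB document.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.IO.Compression;
using System.Xml;

// Hypothetical helper, not part of the Open XML SDK.
public static class DocxRowAppender
{
    const string W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    // Copies sourcePath to targetPath entry by entry. Only word/document.xml
    // is rewritten; every other part is copied through unchanged.
    public static void AppendRows(string sourcePath, string targetPath,
                                  IEnumerable<IEnumerable<string>> rows)
    {
        using (var source = ZipFile.OpenRead(sourcePath))
        using (var target = ZipFile.Open(targetPath, ZipArchiveMode.Create))
        {
            foreach (var entry in source.Entries)
            {
                using (var input = entry.Open())
                using (var output = target.CreateEntry(entry.FullName).Open())
                {
                    if (entry.FullName == "word/document.xml")
                        CopyWithRows(input, output, rows);
                    else
                        input.CopyTo(output);
                }
            }
        }
    }

    // Streams the XML node by node and writes the new rows just before the
    // end tag of the first <w:tbl> element.
    static void CopyWithRows(Stream input, Stream output,
                             IEnumerable<IEnumerable<string>> rows)
    {
        using (var reader = XmlReader.Create(input))
        using (var writer = XmlWriter.Create(output))
        {
            int tableDepth = -1; // depth of the first <w:tbl>, -1 until seen
            bool done = false;
            while (reader.Read())
            {
                bool isTbl = reader.LocalName == "tbl" && reader.NamespaceURI == W;
                if (!done && isTbl && tableDepth < 0 && reader.NodeType == XmlNodeType.Element)
                    tableDepth = reader.Depth;
                else if (!done && isTbl && reader.NodeType == XmlNodeType.EndElement
                         && reader.Depth == tableDepth)
                {
                    WriteRows(writer, rows);
                    done = true;
                }
                CopyNode(reader, writer);
            }
        }
    }

    // Writes one <w:tr> per row, each cell as <w:tc><w:p><w:r><w:t>.
    static void WriteRows(XmlWriter writer, IEnumerable<IEnumerable<string>> rows)
    {
        foreach (var row in rows)
        {
            writer.WriteStartElement("w", "tr", W);
            foreach (var cell in row)
            {
                writer.WriteStartElement("w", "tc", W);
                writer.WriteStartElement("w", "p", W);
                writer.WriteStartElement("w", "r", W);
                writer.WriteElementString("w", "t", W, cell);
                writer.WriteEndElement(); // w:r
                writer.WriteEndElement(); // w:p
                writer.WriteEndElement(); // w:tc
            }
            writer.WriteEndElement(); // w:tr
        }
    }

    // Copies the node the reader is positioned on, without its children.
    static void CopyNode(XmlReader reader, XmlWriter writer)
    {
        switch (reader.NodeType)
        {
            case XmlNodeType.Element:
                writer.WriteStartElement(reader.Prefix, reader.LocalName, reader.NamespaceURI);
                writer.WriteAttributes(reader, true);
                if (reader.IsEmptyElement) writer.WriteEndElement();
                break;
            case XmlNodeType.Text: writer.WriteString(reader.Value); break;
            case XmlNodeType.Whitespace:
            case XmlNodeType.SignificantWhitespace: writer.WriteWhitespace(reader.Value); break;
            case XmlNodeType.CDATA: writer.WriteCData(reader.Value); break;
            case XmlNodeType.XmlDeclaration:
            case XmlNodeType.ProcessingInstruction:
                writer.WriteProcessingInstruction(reader.Name, reader.Value); break;
            case XmlNodeType.Comment: writer.WriteComment(reader.Value); break;
            case XmlNodeType.EndElement: writer.WriteFullEndElement(); break;
        }
    }
}
```

A call would look like `DocxRowAppender.AppendRows("in.docx", "out.docx", data);`. The source archive is opened in read mode and the target in create mode so that entries can be streamed. The sketch still reads every byte of the document part to find the table, and it assumes the first table is not written as an empty `<w:tbl/>` element.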

I've tried to do the same many times, but so far my conclusion is that, no, you cannot.

I've tried many approaches, even third-party tools like Aspose, but so far they have all required loading the full document into memory.

That makes sense: you have to find the entry point, and how can you do that without evaluating the content, which in turn requires loading the content?

So far, the method with the least extra overhead (though it uses the most memory) is to use VSTO and do it through a Word add-in while the document is open anyway. The extra overhead is low because the user has already opened the document in Word. This approach is of no use, however, if you need to do this without user action, or if it has to be done server-side without launching a full Word application.
