
Write to a large DOCX file without loading it into memory, using C# and the Open XML SDK

I have a large Word .docx file (more than 100 MB) that contains a table, and I need to append additional rows to that table. I am using the following approach:

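using System.Linq;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Wordprocessing;

// Open the document for editing (second argument: isEditable = true).
using (WordprocessingDocument wordDocument = WordprocessingDocument.Open(filepath, true))
{
    // Accessing .Document parses the main part into a DOM tree.
    var table = wordDocument.MainDocumentPart.Document.Body.Elements<Table>().First();

    // data is a sequence of rows, each row a sequence of cell strings.
    foreach (var row in data)
    {
        var tRow = new TableRow();

        foreach (var cell in row)
        {
            // Each cell holds one paragraph with a single run of text.
            var hCell = new TableCell();
            var cPar = new Paragraph(new Run(new Text(cell)));
            hCell.Append(cPar);
            tRow.Append(hCell);
        }

        table.Append(tRow);
    }
}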

With this approach, it seems that the whole document is loaded into memory. Is there any way to write to the file without loading the whole DOM structure, for example by using a SAX-style approach?

I've tried to do the same many times, but so far my conclusion is that no, you cannot.

I've tried many approaches, including third-party tools such as Aspose, but so far they have all required loading the full document into memory.

That makes sense: you have to find the insertion point, and you cannot find it without evaluating the content, which in turn requires loading the content.
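For illustration, here is a minimal sketch of what a SAX-style attempt could look like with the SDK's forward-only OpenXmlReader and OpenXmlWriter classes. It shows the point above: every element before the end of the table still has to be read and rewritten to find the insertion point. StreamingTableAppender, AppendRows and BuildRow are hypothetical names, not from the original post. The sketch appends to the first table met in document order (not necessarily a direct child of the body), it drops text inside elements the SDK does not recognise, and it has not been measured: it avoids building the DOM, but the packaging layer may still buffer the part in memory.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using DocumentFormat.OpenXml;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Wordprocessing;

// Hypothetical helper class; a sketch under the assumptions stated above.
public static class StreamingTableAppender
{
    public static void AppendRows(string filepath, IEnumerable<IEnumerable<string>> data)
    {
        string tempPath = Path.GetTempFileName();
        try
        {
            using (WordprocessingDocument doc = WordprocessingDocument.Open(filepath, true))
            {
                MainDocumentPart mainPart = doc.MainDocumentPart;

                // Pass 1: copy the part element by element into a temp file.
                // mainPart.Document is never accessed, so no DOM tree is built.
                using (OpenXmlReader reader = OpenXmlReader.Create(mainPart))
                using (FileStream temp = File.Create(tempPath))
                using (OpenXmlWriter writer = OpenXmlWriter.Create(temp))
                {
                    int tableDepth = 0;       // tracks nested tables
                    bool rowsWritten = false; // append to the first table only

                    while (reader.Read())
                    {
                        if (reader.IsStartElement)
                        {
                            if (reader.ElementType == typeof(Table))
                                tableDepth++;

                            writer.WriteStartElement(reader);

                            // Text content is not a separate node; copy it explicitly.
                            if (reader.ElementType.IsSubclassOf(typeof(OpenXmlLeafTextElement)))
                                writer.WriteString(reader.GetText());
                        }
                        else if (reader.IsEndElement)
                        {
                            if (reader.ElementType == typeof(Table))
                            {
                                tableDepth--;
                                if (tableDepth == 0 && !rowsWritten)
                                {
                                    // Just before the outermost </w:tbl>: emit the new rows.
                                    foreach (IEnumerable<string> row in data)
                                        writer.WriteElement(BuildRow(row));
                                    rowsWritten = true;
                                }
                            }
                            writer.WriteEndElement();
                        }
                    }
                }

                // Pass 2: replace the part's content with the rewritten XML.
                using (FileStream temp = File.OpenRead(tempPath))
                {
                    mainPart.FeedData(temp);
                }
            }
        }
        finally
        {
            File.Delete(tempPath);
        }
    }

    // Builds one row in memory; a single row is small, unlike the whole document.
    private static TableRow BuildRow(IEnumerable<string> cells)
    {
        var row = new TableRow();
        foreach (string cell in cells)
            row.Append(new TableCell(new Paragraph(new Run(new Text(cell)))));
        return row;
    }
}
```

A call would look like StreamingTableAppender.AppendRows(filepath, data), with data shaped as in the question. Even if this keeps the DOM out of memory, the entire main document part is still read once and written once, so it is a trade of memory for I/O and not a true in-place write.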

So far, the method with the least extra overhead (though it uses the most memory) is to use VSTO and do it through a Word add-in. The overhead is low because the user has already opened the document in Word, so it is loaded anyway. This approach is of no use, however, if the work has to happen without user action, or server-side without launching a full Word application.
