
How To Read/Write/Modify Large Excel XLSB/XLSM Files? (C#)

I have a 500 MB Excel (.xlsb/.xlsm) file. I need a way to read, write and modify large .xlsb/.xlsm files in C# without loading the entire file into memory: loading it in chunks, or at least a single sheet at a time.

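Excel .xlsx and .xlsm files are essentially zip files containing XML files. (An .xlsb file is a zip as well, but its parts are stored in a binary format rather than XML, so the XML editing described here applies to .xlsx/.xlsm.) If you open such a file with any zip tool you will see the contents of the Excel document. What you need to modify there is: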

  • xl/sharedStrings.xml - Excel optimizes string usage by indexing strings in this file. The index is not written out explicitly; it is the position of the item, so you can iterate to the end and count: the first one has index 0, the second one has index 1, and so on. Use these indices to change/add strings in the sheet files so that you do not corrupt the document.

  • xl/workbook.xml - contains the sheets' names. For example, you can find that sheet1 is named "This Month's Income" in Excel. Use that to find your sheet by name if you need to.

  • xl/worksheets/*.xml - here are your actual sheets. To change/add a string, go through the shared strings XML file. To change/add numbers, edit them directly. Cells that contain a shared string value are marked as such (t="s" on the cell element, with the string's index as the cell value).
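To make the index counting concrete, here is a minimal sketch of looking up a string's index by streaming the shared strings part. The class and method names (SharedStringLookup, FindIndex) are made up for illustration, and it simply concatenates every <t> element inside an <si> item, which is a simplification for rich-text items.

```csharp
using System.IO;
using System.Text;
using System.Xml;

public static class SharedStringLookup
{
    // Streams xl/sharedStrings.xml and returns the zero-based position of the
    // first <si> item whose text equals `text`, or -1 if there is no such item.
    public static int FindIndex(Stream sharedStringsXml, string text)
    {
        using var reader = XmlReader.Create(sharedStringsXml);
        int index = -1;
        while (reader.Read())
        {
            if (reader.NodeType != XmlNodeType.Element || reader.LocalName != "si")
                continue;
            index++; // items are indexed by position: 0, 1, 2, ...
            if (ReadItemText(reader) == text)
                return index;
        }
        return -1;
    }

    // Concatenates the content of every <t> element inside the current <si>.
    // Assumption: good enough for plain and simple rich-text items.
    private static string ReadItemText(XmlReader reader)
    {
        var sb = new StringBuilder();
        using var item = reader.ReadSubtree();
        item.Read();
        while (!item.EOF)
        {
            if (item.NodeType == XmlNodeType.Element && item.LocalName == "t")
                sb.Append(item.ReadElementContentAsString()); // also advances the reader
            else
                item.Read();
        }
        return sb.ToString();
    }
}
```

The stream would typically come from the zip entry itself, e.g. ZipFile.OpenRead(path).GetEntry("xl/sharedStrings.xml").Open(), so only that one part is decompressed as it is read.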

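Now you just have to parse/edit these XML files with a streaming reader, reading them piece by piece (the XML is often written without line breaks, so literal line-by-line reading may not help) rather than loading the entire files into memory, and you will be able to process huge amounts of data with a very small memory footprint.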
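As a rough sketch of that streaming idea, the following reads the cells of one worksheet part with XmlReader and yields them one at a time. SheetReader and ReadCells are hypothetical names, and it only looks at <c> and <v> elements, so formulas and inline strings are ignored here.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Xml;

public static class SheetReader
{
    // Lazily yields (cell reference, cell type, raw value) for each cell that
    // has a <v> element. Type "s" means the value is a shared string index;
    // a missing t attribute is treated as "n" (number).
    public static IEnumerable<(string Ref, string Type, string Value)> ReadCells(Stream sheetXml)
    {
        using var reader = XmlReader.Create(sheetXml);
        string cellRef = "";
        string cellType = "n";
        reader.Read();
        while (!reader.EOF)
        {
            if (reader.NodeType == XmlNodeType.Element && reader.LocalName == "c")
            {
                cellRef = reader.GetAttribute("r") ?? "";
                cellType = reader.GetAttribute("t") ?? "n";
                reader.Read();
            }
            else if (reader.NodeType == XmlNodeType.Element && reader.LocalName == "v")
            {
                // ReadElementContentAsString also moves past the closing </v>.
                yield return (cellRef, cellType, reader.ReadElementContentAsString());
            }
            else
            {
                reader.Read();
            }
        }
    }
}
```

Because the method is an iterator, only the current node is held in memory at any time, whatever the size of the sheet.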

In C# I use ZipArchive to temporarily extract only the files I need, edit them, and then update the zip. Do not extract everything and then zip it again, because you will corrupt the file. At least I don't know how to zip it in such a way that it is usable again.
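A minimal sketch of that extract/edit/update cycle could look like this. ZipParts, ExtractEntry and ReplaceEntry are names invented for the example; the calls underneath are the standard System.IO.Compression ones. One caveat I have not measured: as far as I know the .NET documentation says that ZipArchiveMode.Update keeps the archive content in memory, so for a 500 MB file it is worth checking the memory use of the update step.

```csharp
using System.IO;
using System.IO.Compression;

public static class ZipParts
{
    // Extracts a single entry (e.g. "xl/worksheets/sheet1.xml") to a file on disk.
    public static void ExtractEntry(string archivePath, string entryName, string destinationPath)
    {
        using var archive = ZipFile.OpenRead(archivePath);
        var entry = archive.GetEntry(entryName)
            ?? throw new FileNotFoundException("Entry not found in archive", entryName);
        entry.ExtractToFile(destinationPath, overwrite: true);
    }

    // Replaces one entry in place with the edited file; other entries are left alone.
    public static void ReplaceEntry(string archivePath, string entryName, string editedFilePath)
    {
        using var archive = ZipFile.Open(archivePath, ZipArchiveMode.Update);
        archive.GetEntry(entryName)?.Delete();
        archive.CreateEntryFromFile(editedFilePath, entryName);
    }
}
```

Usage would be: ExtractEntry to a temp file, stream-edit that file into a second temp file, then ReplaceEntry with the edited copy.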

Really fast reading and writing of xlsb/xlsx can be done with https://github.com/KrzysztofDusko/SpreadSheetTasks.

