Most reliable method to read large Excel documents using the SAX approach with the Open XML SDK

I have two methods of parsing a large Excel document using the SAX approach with the Open XML SDK.

Of the two examples below, which is more reliable for larger files?

1. Load the current cell using the reader object and keep iterating while ReadNextSibling() returns true

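using System.Linq;
using DocumentFormat.OpenXml;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Spreadsheet;

// filename: path to the .xlsx file, defined elsewhere
using (SpreadsheetDocument spreadsheetDocument = SpreadsheetDocument.Open(filename, false))
{
    WorkbookPart workbookPart = spreadsheetDocument.WorkbookPart;
    Sheet sheet = workbookPart.Workbook.Descendants<Sheet>().FirstOrDefault(s => s.Name == "Sheet1");
    WorksheetPart worksheetPart = (WorksheetPart)workbookPart.GetPartById(sheet.Id);

    // Stream the worksheet XML and dispose the reader when done
    using (OpenXmlReader reader = OpenXmlReader.Create(worksheetPart))
    {
        while (reader.Read())
        {
            // Descend only if the row actually has child elements
            if (reader.ElementType == typeof(Row) && reader.ReadFirstChild())
            {
                do
                {
                    if (reader.ElementType == typeof(Cell))
                    {
                        // Materialize only the current cell
                        Cell cell = (Cell)reader.LoadCurrentElement();
                    }
                } while (reader.ReadNextSibling());
            }
        }
    }
}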

2. Load the current Row using the reader object and iterate over its child elements

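using System.Linq;
using DocumentFormat.OpenXml;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Spreadsheet;

// filename: path to the .xlsx file, defined elsewhere
using (SpreadsheetDocument spreadsheetDocument = SpreadsheetDocument.Open(filename, false))
{
    WorkbookPart workbookPart = spreadsheetDocument.WorkbookPart;
    Sheet sheet = workbookPart.Workbook.Descendants<Sheet>().FirstOrDefault(s => s.Name == "Sheet1");
    WorksheetPart worksheetPart = (WorksheetPart)workbookPart.GetPartById(sheet.Id);

    // Stream the worksheet XML and dispose the reader when done
    using (OpenXmlReader reader = OpenXmlReader.Create(worksheetPart))
    {
        while (reader.Read())
        {
            if (reader.ElementType == typeof(Row))
            {
                // Materialize the whole row, including all of its cells
                Row row = (Row)reader.LoadCurrentElement();

                // Elements<Cell>() returns only the Cell children, so no cast is needed
                foreach (Cell cell in row.Elements<Cell>())
                {
                    // process the cell here
                }
            }
        }
    }
}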

The first example uses the reader each time to load the current Cell element, so only one cell is held in memory at a time.

The second example is a hybrid approach: the reader streams through the sheet, but each Row is loaded into memory as a whole, and its cells are then read from the row's collection of child elements.
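For comparison, both examples still materialize elements with LoadCurrentElement(). A minimal sketch of a fully streaming variant, which reads the cell reference from the reader's attributes and the value text with GetText(), might look like the following. The class and method names (SheetStreamer, StreamRawValues) are hypothetical and not part of the Open XML SDK, and the sketch assumes that the raw text of each value element is enough: for cells stored as shared strings that text is an index into the shared string table, and inline strings are not handled.

```csharp
using System.Collections.Generic;
using System.Linq;
using DocumentFormat.OpenXml;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Spreadsheet;

// Hypothetical helper for illustration; not part of the Open XML SDK.
public static class SheetStreamer
{
    // Yields (cell reference, raw value text) pairs without building Row or Cell objects.
    public static IEnumerable<KeyValuePair<string, string>> StreamRawValues(string filename, string sheetName)
    {
        using (SpreadsheetDocument document = SpreadsheetDocument.Open(filename, false))
        {
            WorkbookPart workbookPart = document.WorkbookPart;
            Sheet sheet = workbookPart.Workbook.Descendants<Sheet>()
                .FirstOrDefault(s => s.Name == sheetName);
            if (sheet == null)
            {
                yield break; // no sheet with that name
            }

            WorksheetPart worksheetPart = (WorksheetPart)workbookPart.GetPartById(sheet.Id);

            using (OpenXmlReader reader = OpenXmlReader.Create(worksheetPart))
            {
                string cellReference = null;
                while (reader.Read())
                {
                    if (!reader.IsStartElement)
                    {
                        continue; // ignore end tags
                    }

                    if (reader.ElementType == typeof(Cell))
                    {
                        // The "r" attribute holds the cell reference, e.g. "B7".
                        // OpenXmlAttribute is a struct, so a missing attribute yields a null Value.
                        cellReference = reader.Attributes
                            .FirstOrDefault(a => a.LocalName == "r").Value;
                    }
                    else if (reader.ElementType == typeof(CellValue))
                    {
                        // GetText() returns the text of a leaf text element such as <v>.
                        yield return new KeyValuePair<string, string>(cellReference, reader.GetText());
                    }
                }
            }
        }
    }
}
```

A caller could iterate it with foreach (var pair in SheetStreamer.StreamRawValues(filename, "Sheet1")) and process each pair as it arrives. Because the method is an iterator, the document stays open until enumeration finishes or the enumerator is disposed.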
