
Reading large XLSX files

I have an application that has to read an Excel file and convert it to an array. So far so good. Everything works fine until I try to convert a larger file. I tried OpenXML with the SAX approach:

using (SpreadsheetDocument xlsx = SpreadsheetDocument.Open(filePath, false))
{
    WorkbookPart workbookPart = xlsx.WorkbookPart;
    List<List<string>> parsedContent = new List<List<string>>();
    foreach (WorksheetPart worksheet in workbookPart.WorksheetParts)
    {
        OpenXmlReader xlsxReader = OpenXmlReader.Create(worksheet);

        while (xlsxReader.Read())
        {
        }
    }
}

This works well for files in the 1-10 MB range. My problem is when I try to load a file larger than 10 MB: the result is an OutOfMemoryException. How do I properly read that much data? How do I do it in a memory-efficient way?

P.S. I have tried libraries like ClosedXML, EPPlus, and a few others.

Any solution will be appreciated. Thank you in advance.

If you only plan to read the Excel file content, I suggest you use the ExcelDataReader library instead, which extracts the worksheet data into a DataSet object.

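        // Namespaces needed: System, System.Data, System.IO, ExcelDataReader
        // (in ExcelDataReader 3.x, AsDataSet() ships in the ExcelDataReader.DataSet package)
        string filePath = "PathToExcelFile";

        // Open the file as a read-only stream and dispose it when done
        using (FileStream stream = File.Open(filePath, FileMode.Open, FileAccess.Read))
        {
            IExcelDataReader reader = null;
            string extension = Path.GetExtension(filePath);

            // Pick the reader that matches the Excel file type
            if (extension.Equals(".xls", StringComparison.OrdinalIgnoreCase))
                reader = ExcelReaderFactory.CreateBinaryReader(stream);
            else if (extension.Equals(".xlsx", StringComparison.OrdinalIgnoreCase))
                reader = ExcelReaderFactory.CreateOpenXmlReader(stream);

            if (reader != null)
            {
                using (reader)
                {
                    // Fill the DataSet
                    DataSet content = reader.AsDataSet();
                    // Read....
                }
            }
        }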

Use ExcelDataReader. It is easy to install through NuGet and should only require a few lines of code:

NuGet:

Install-Package ExcelDataReader

Usage:

    using (FileStream stream = File.Open(filePath, FileMode.Open, FileAccess.Read))
    {
        using (IExcelDataReader excelReader = ExcelReaderFactory.CreateOpenXmlReader(stream))
        {
            DataSet result = excelReader.AsDataSet();
            // A DataSet has no indexer; the rows live in its Tables collection
            foreach (DataRow dr in result.Tables[0].Rows)
            {
                //Do stuff
            }
        }
    }
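
Note that AsDataSet() builds the whole workbook in memory, so on a very large file it may hit the same OutOfMemoryException the question describes. A possible alternative is to read row by row. The following is only a minimal sketch, not a tested solution for any particular file: the class name RowStreaming and the method ReadRows are hypothetical names made up for illustration, and the code relies only on the standard System.Data.IDataReader interface, which IExcelDataReader implements.

```csharp
using System;
using System.Collections.Generic;
using System.Data;

// Hypothetical helper (not part of ExcelDataReader): streams rows from any
// forward-only IDataReader so that only the current row is materialized here.
public static class RowStreaming
{
    public static IEnumerable<string[]> ReadRows(IDataReader reader)
    {
        // Read() advances to the next row and returns false at the end
        while (reader.Read())
        {
            var row = new string[reader.FieldCount];
            for (int i = 0; i < reader.FieldCount; i++)
            {
                object value = reader.GetValue(i);
                // Treat empty cells (null or DBNull) as empty strings
                row[i] = (value == null || value is DBNull)
                    ? string.Empty
                    : Convert.ToString(value);
            }
            // Hand the row to the caller before reading the next one
            yield return row;
        }
    }
}
```

With ExcelDataReader, the reader returned by ExcelReaderFactory.CreateOpenXmlReader(stream) could be passed to ReadRows directly instead of calling AsDataSet(), and NextResult() on the reader moves to the next sheet. Memory then stays bounded only if each row is processed and discarded rather than collected into a list.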
