EPPlus Large Dataset Issue with Out of Memory Exception

We are getting a System.OutOfMemoryException. As far as I can see, the memory stream is only flushed when the package is saved. Our datasets are 1.5 to 2 GB.

I am using EPPlus version 3.1.3.0.

We do the following in code, looping through the tables:

     --> Create a Package
        --> each table in the datareader
            -->   Add WorkSheet to the Package 
        --> Dispose Each table.
     --> Save the  Package.

Each DataTable is about 300 MB, and the system outputs up to 15 tables.

This is causing an issue; I have logged it in detail at https://epplus.codeplex.com/workitem/15085

I still want to be able to use EPPlus, since its API is very nice. But is there a better way to free up a worksheet once we have added it to the package?

Thank you for helping.

Unfortunately, this seems to be a major limitation of EPPlus; you can find others posting about it on its CodePlex page. I ran into a similar issue when exporting large datasets: single tables 115+ columns wide and 60K+ rows tall. It typically ran out of memory at around 30 to 35K rows. What happens is that every cell that is created is its own object, which is fine for a small dataset, but in my case that would be 115 x 60K = ~7 million cells. Since each cell is an object with content (mostly strings), the memory footprint adds up quickly.
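To make the arithmetic above concrete, here is a minimal sketch of the cell-count estimate. The class name `CellEstimate` and the `bytesPerCell` parameter are hypothetical; the per-cell cost is something you would have to measure yourself, it is not a documented EPPlus figure.

```csharp
using System;

public static class CellEstimate
{
    // Total number of cell objects for a rectangular table.
    public static long CellCount(long rows, long columns)
    {
        return rows * columns;
    }

    // Rough memory estimate in megabytes (1 MB = 1024 * 1024 bytes).
    // bytesPerCell is a hypothetical figure supplied by the caller,
    // not a measured or documented EPPlus value.
    public static double EstimateMegabytes(long rows, long columns, long bytesPerCell)
    {
        return CellCount(rows, columns) * (double)bytesPerCell / (1024.0 * 1024.0);
    }
}
```

`CellEstimate.CellCount(60000, 115)` gives 6,900,000 cells. If one assumed 100 bytes per cell purely for illustration, `EstimateMegabytes(60000, 115, 100)` would come out at roughly 658 MB for a single table.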

At some point in the future my plan was to create the XML files manually using LINQ to XML. An xlsx is just a renamed zip file containing the XML files that make up the content of the workbook and worksheets. So you could create an empty xlsx using EPPlus, save it, open it as a zip, pull out sheet1.xml, and add the data content via string manipulation. You would also have to work on the sharedStrings.xml file, which Excel uses to help keep the file size down. There are probably other XML files that will need updating as well, with keys or names.

If you rename any xlsx to a .zip extension you can see this.

Example sheet1.xml:
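Instead of renaming the file, the same inspection can be done in code. The following is a sketch that assumes .NET 4.5 or later (for `System.IO.Compression.ZipArchive`); the class name `XlsxZip` and its methods are hypothetical helpers, and the part name `xl/worksheets/sheet1.xml` is the usual location of the first sheet rather than a guarantee.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.IO.Compression;
using System.Text;

public static class XlsxZip
{
    // Lists the part names inside an xlsx (or any zip) stream.
    public static List<string> ListParts(Stream xlsx)
    {
        var names = new List<string>();
        using (var archive = new ZipArchive(xlsx, ZipArchiveMode.Read, leaveOpen: true))
        {
            foreach (ZipArchiveEntry entry in archive.Entries)
                names.Add(entry.FullName);
        }
        return names;
    }

    // Reads one part as text, e.g. "xl/worksheets/sheet1.xml".
    // Returns null when the part does not exist.
    public static string ReadPart(Stream xlsx, string partName)
    {
        using (var archive = new ZipArchive(xlsx, ZipArchiveMode.Read, leaveOpen: true))
        {
            ZipArchiveEntry entry = archive.GetEntry(partName);
            if (entry == null) return null;
            using (var reader = new StreamReader(entry.Open(), Encoding.UTF8))
                return reader.ReadToEnd();
        }
    }
}
```

With a file on disk you might call `XlsxZip.ListParts(File.OpenRead(path))` to see which parts a saved workbook contains before deciding which ones need editing.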

[Image: simple Excel file example]

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<worksheet xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main" xmlns:r="http://schemas.openxmlformats.org/officeDocument/2006/relationships" xmlns:mc="http://schemas.openxmlformats.org/markup-compatibility/2006" mc:Ignorable="x14ac" xmlns:x14ac="http://schemas.microsoft.com/office/spreadsheetml/2009/9/ac">
    <dimension ref="A1:C2"/>
    <sheetViews>
        <sheetView tabSelected="1" workbookViewId="0">
            <selection activeCell="C5" sqref="C5"/>
        </sheetView>
    </sheetViews>
    <sheetFormatPr defaultRowHeight="15" x14ac:dyDescent="0.25"/>
    <sheetData>
        <row r="1" spans="1:3" x14ac:dyDescent="0.25">
            <c r="A1" t="s">
                <v>0</v>
            </c><c r="B1" t="s">
                <v>1</v>
            </c><c r="C1" t="s">
                <v>0</v>
            </c>
        </row>
        <row r="2" spans="1:3" x14ac:dyDescent="0.25">
            <c r="A2" t="s">
                <v>1</v>
            </c><c r="B2" t="s">
                <v>0</v>
            </c><c r="C2" t="s">
                <v>1</v>
            </c>
        </row>
    </sheetData>
    <pageMargins left="0.7" right="0.7" top="0.75" bottom="0.75" header="0.3" footer="0.3"/>
</worksheet>
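One way to produce XML shaped like the sheet above without keeping an object per cell is to stream it row by row with `XmlWriter`. This is only a sketch of that idea, not something from EPPlus: `SheetXmlWriter` is a hypothetical name, the cells are assumed to hold shared-string indexes only, and the optional elements (dimension, sheetViews, pageMargins and so on) are left out. Splicing the result back into the package is not shown.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.IO;
using System.Text;
using System.Xml;

public static class SheetXmlWriter
{
    const string MainNs = "http://schemas.openxmlformats.org/spreadsheetml/2006/main";

    // Converts a 1-based column number to letters: 1 -> A, 26 -> Z, 27 -> AA.
    public static string ColumnName(int column)
    {
        var sb = new StringBuilder();
        while (column > 0)
        {
            int rem = (column - 1) % 26;
            sb.Insert(0, (char)('A' + rem));
            column = (column - 1) / 26;
        }
        return sb.ToString();
    }

    // Writes a minimal worksheet part one row at a time.
    // Each int is an index into the shared string table (cell type t="s").
    public static void Write(Stream output, IEnumerable<int[]> rows)
    {
        var settings = new XmlWriterSettings { Encoding = new UTF8Encoding(false) };
        using (XmlWriter w = XmlWriter.Create(output, settings))
        {
            w.WriteStartDocument(true);
            w.WriteStartElement("worksheet", MainNs);
            w.WriteStartElement("sheetData", MainNs);
            int r = 0;
            foreach (int[] row in rows)
            {
                r++;
                string rowNumber = r.ToString(CultureInfo.InvariantCulture);
                w.WriteStartElement("row", MainNs);
                w.WriteAttributeString("r", rowNumber);
                for (int c = 0; c < row.Length; c++)
                {
                    w.WriteStartElement("c", MainNs);
                    w.WriteAttributeString("r", ColumnName(c + 1) + rowNumber);
                    w.WriteAttributeString("t", "s");
                    w.WriteElementString("v", MainNs, row[c].ToString(CultureInfo.InvariantCulture));
                    w.WriteEndElement(); // c
                }
                w.WriteEndElement(); // row
            }
            w.WriteEndElement(); // sheetData
            w.WriteEndElement(); // worksheet
            w.WriteEndDocument();
        }
    }
}
```

Because `rows` is an `IEnumerable`, the caller can yield rows straight from a data reader, so only the current row needs to be in memory while the sheet is written.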

Example sharedStrings.xml:

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<sst xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main" count="6" uniqueCount="2">
    <si>
        <t>AA</t>
    </si>
    <si>
        <t>BB</t>
    </si>
</sst>
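The `count` and `uniqueCount` attributes and the numbers inside the `<v>` elements of the sheet all come from the same bookkeeping: each distinct string is stored once and cells refer to it by position. A minimal sketch of that bookkeeping using LINQ to XML follows; `SharedStringTable` is a hypothetical helper, not an EPPlus type.

```csharp
using System;
using System.Collections.Generic;
using System.Xml.Linq;

public class SharedStringTable
{
    static readonly XNamespace Ns = "http://schemas.openxmlformats.org/spreadsheetml/2006/main";

    readonly Dictionary<string, int> _index = new Dictionary<string, int>();
    readonly List<string> _strings = new List<string>();
    int _count; // total references, including repeats

    // Returns the index to write into <v> for a cell with t="s".
    public int Add(string text)
    {
        _count++;
        int i;
        if (!_index.TryGetValue(text, out i))
        {
            i = _strings.Count;
            _index[text] = i;
            _strings.Add(text);
        }
        return i;
    }

    public int Count { get { return _count; } }
    public int UniqueCount { get { return _strings.Count; } }

    // Builds the <sst> element with one <si><t>...</t></si> per unique string.
    public XElement ToXml()
    {
        var sst = new XElement(Ns + "sst",
            new XAttribute("count", _count),
            new XAttribute("uniqueCount", _strings.Count));
        foreach (string s in _strings)
            sst.Add(new XElement(Ns + "si", new XElement(Ns + "t", s)));
        return sst;
    }
}
```

Adding the six cell values of the example sheet in row order (AA, BB, AA, BB, AA, BB) returns the indexes 0, 1, 0, 1, 0, 1 and leaves `Count` at 6 and `UniqueCount` at 2, matching the attributes shown above.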

You can see how I did the XML manipulation in my other post:

Create Pivot Table Filters With EPPLUS

Sorry I couldn't give you a better answer, but hopefully that helps.

I had this problem, but I fixed it by switching the "Platform target" option from x86 to x64 or "Any CPU". (Right-click the project, select "Properties", open the "Build" tab, and under "Platform target" select "x64".)

The problem is that on the x86 platform a process can use only about 1.8 GB of RAM. On x64 you do not have this limitation.
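To confirm that the setting actually took effect, the process can report its own bitness at startup. The sketch below assumes .NET 4.0 or later, where `Environment.Is64BitProcess` is available; `ProcessBitness` is a hypothetical helper name. Note that an "Any CPU" build can still run as 32-bit if the project's "Prefer 32-bit" option is enabled, so checking at runtime is worthwhile.

```csharp
using System;

public static class ProcessBitness
{
    // Describes whether the current process runs as 32-bit or 64-bit.
    public static string Describe()
    {
        return Environment.Is64BitProcess ? "64-bit process" : "32-bit process";
    }
}
```

Logging `ProcessBitness.Describe()` once when the export starts makes it easy to spot a deployment that is still running as 32-bit.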

@Ernie is correct about some of the limitations of the current version of EPPlus. The developers have acknowledged this and have been working on fixing it. That leaves you with two possible options for getting this to work:

1) Switch to the EPPlus 4.0 Beta, where they've fixed this issue along with some other things (although you'll be using a beta version).

2) The ExcelPackage and ExcelWorksheet classes both implement IDisposable, so you might get better performance if you wrap your usage of them in a using() statement.

Pay attention if you are passing streams to the ExcelPackage. In my case I had a Windows service loading packages from a MemoryStream. The service crashed after some time with an OutOfMemoryException.

Reason: disposing the ExcelPackage does not dispose the stream!

Solution:

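// Requires: using System; using System.IO; using OfficeOpenXml;
// Dispose the stream explicitly, because disposing the ExcelPackage does not do it.
using (MemoryStream ms = new MemoryStream(Convert.FromBase64String(excelSheetBase64)))
using (ExcelPackage excelPackage = new ExcelPackage(ms))
{
    // Your code
}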

The problem sometimes appears while debugging with large amounts of data.

If you run the application on a server under real IIS, or on your own PC under real IIS (if you have a Windows Pro edition), the OutOfMemoryException does not occur.
