Getting heap space out of memory error: how is Java heap memory used?

I am reading a single XML file that is 2.6 GB in size; the size of the JVM is 6 GB.

However, I am still getting a heap space out of memory error.

What am I doing wrong here?

For reference, I printed the max memory and free memory properties of the JVM.

The max memory was shown as approx. 5.6 GB, but free memory was shown as only 90 MB. Why is only 90 MB shown as free, especially when I have not even started any processing and have only just started the program?
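
A possible explanation for the 90 MB figure, assuming the numbers came from `Runtime.maxMemory()` and `Runtime.freeMemory()` (the question does not say which calls were used): `freeMemory()` reports the unused part of the heap the JVM has currently reserved (`totalMemory()`), not the unused part of the maximum. The sketch below prints all three values; the class name `HeapReport` and the helper `availableBytes` are made up for illustration.

```java
// Prints the heap figures exposed by java.lang.Runtime.
final class HeapReport {
    static final long MB = 1024L * 1024L;

    // Heap the JVM could still hand out: free space in the reserved heap
    // plus the part of the maximum that has not been reserved yet.
    static long availableBytes(long max, long total, long free) {
        return free + (max - total);
    }

    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        long max = rt.maxMemory();     // upper limit the JVM will try to use
        long total = rt.totalMemory(); // heap currently reserved by the JVM
        long free = rt.freeMemory();   // unused part of total, not of max

        System.out.println("max       = " + max / MB + " MB");
        System.out.println("total     = " + total / MB + " MB");
        System.out.println("free      = " + free / MB + " MB");
        System.out.println("available = " + availableBytes(max, total, free) / MB + " MB");
    }
}
```

Right after startup the reserved heap is usually much smaller than the maximum, so a small `free` value next to a large `max` value is not by itself a sign of a problem.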

In general, when converting structured text to the corresponding data structures in Java, you need a lot more space than the size of the input file. There is a lot of overhead associated with the various data structures that are used, apart from the space required for the strings.

For example, each String instance has an additional overhead of about 32-40 bytes, not to mention that each character is stored in two bytes, which effectively doubles the space requirements for ASCII-encoded XML. (Java 9 and later can store Latin-1-only strings with one byte per character, but the per-object overhead remains.)

Then you have additional overhead when storing the String in a structure. For example, in order to store a String instance in a Map you will need about 16-32 bytes of additional overhead, depending on the implementation and how you measure the usage.

It is quite possible that 6 GB is just not enough to store a parsed 2.6 GB XML file at once.

Bottom line:

If you are loading such a large XML file into memory (e.g. using a DOM parser), you are probably doing something wrong. A stream-based parser such as SAX should have far more modest requirements.
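
As a rough illustration of the stream-based approach, here is a minimal SAX sketch that counts elements by name without building a document tree. The class name `SaxElementCounter` and the sample XML are assumptions made up for this example; only the per-name counters stay in memory.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

// Counts how often each element name occurs; nothing else is retained.
final class SaxElementCounter extends DefaultHandler {
    private final Map<String, Integer> counts = new HashMap<>();

    @Override
    public void startElement(String uri, String localName, String qName, Attributes attrs) {
        // Called once per opening tag as the parser streams through the input.
        counts.merge(qName, 1, Integer::sum);
    }

    static Map<String, Integer> count(InputStream in) throws Exception {
        SaxElementCounter handler = new SaxElementCounter();
        SAXParserFactory.newInstance().newSAXParser().parse(in, handler);
        return handler.counts;
    }

    public static void main(String[] args) throws Exception {
        // Tiny in-memory sample; a real run would open the large file as a stream.
        String xml = "<orders><order id=\"1\"/><order id=\"2\"/></orders>";
        try (InputStream in = new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8))) {
            System.out.println(count(in));
        }
    }
}
```

For the actual file, the input stream could come from `java.nio.file.Files.newInputStream(path)` instead of the in-memory sample.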

Alternatively, consider transforming the XML file into a more usable format, such as an embedded database, or even an actual server-based database. That would allow you to process far larger documents without issues.

You should avoid loading the entire XML into memory at once; instead, use specialized classes that can handle large XML files.

There are potentially several different issues here.

But for starters:

1) If you're on a 64-bit OS, make sure you're using a 64-bit JVM.

2) Make sure your code closes all resources you open as promptly as possible.

3) Explicitly set references to large objects you're done with to "null".

... AND ...

4) Familiarize yourself with JConsole or VisualVM.
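
For point 2, one common way to close resources promptly is a try-with-resources block. This is a minimal sketch; the class `CloseResources` and the method `countLines` are hypothetical names chosen for illustration.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

final class CloseResources {
    // Counts lines; the reader is closed when the try block exits,
    // whether it finishes normally or throws.
    static long countLines(Reader source) throws IOException {
        try (BufferedReader reader = new BufferedReader(source)) {
            long lines = 0;
            while (reader.readLine() != null) {
                lines++;
            }
            return lines;
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(countLines(new StringReader("first\nsecond\nthird")));
    }
}
```

Closing the outer `BufferedReader` also closes the wrapped reader, so no separate cleanup code is needed.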

You can't load a 2.6 GB XML file as a document with just 6 GB. As jhordo suggests, the ratio is more likely to be 12 to 1. This is because every byte turns into a 16-bit character, and every tag, attribute and value turns into a String with at least 32 bytes of overhead.
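
To put the 12-to-1 figure in perspective, here is a back-of-the-envelope calculation. It is only a sketch: the ratio is the rough estimate quoted above, not a measured value, and the class `DomHeapEstimate` is made up for this example.

```java
final class DomHeapEstimate {
    // Multiplies the file size by an assumed in-memory expansion ratio.
    static long estimateHeapBytes(long fileBytes, int expansionRatio) {
        return Math.multiplyExact(fileBytes, (long) expansionRatio);
    }

    public static void main(String[] args) {
        long fileBytes = 2_600_000_000L; // 2.6 GB, using decimal gigabytes
        long estimate = estimateHeapBytes(fileBytes, 12);
        // Whole gigabytes only; the fractional part is dropped.
        System.out.println("estimated heap: about " + estimate / 1_000_000_000L + " GB");
    }
}
```

Under that assumption the parsed document would need roughly 31 GB, about five times the 6 GB available.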

Instead, what you should do is use a SAX or other event-based parser to process the file progressively. This way it will only keep as much data as you need to retain. If you can process everything in one pass, you won't need to retain anything.
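
The one-pass idea can be sketched with StAX, the pull-style event parser that ships with the JDK. The element name `item` and the attribute `amount` are hypothetical, since the structure of the real file is not given; the only state kept between events is a running total.

```java
import java.io.Reader;
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

final class OnePassSum {
    // Sums the "amount" attribute of every <item> element in a single pass.
    static long sumAmounts(Reader xml) throws XMLStreamException {
        XMLStreamReader reader = XMLInputFactory.newFactory().createXMLStreamReader(xml);
        long sum = 0;
        try {
            while (reader.hasNext()) {
                // next() advances to the following event; earlier events are not retained.
                if (reader.next() == XMLStreamConstants.START_ELEMENT
                        && "item".equals(reader.getLocalName())) {
                    String amount = reader.getAttributeValue(null, "amount");
                    if (amount != null) {
                        sum += Long.parseLong(amount);
                    }
                }
            }
        } finally {
            reader.close(); // releases parser resources, not the underlying Reader
        }
        return sum;
    }

    public static void main(String[] args) throws XMLStreamException {
        String xml = "<items><item amount=\"2\"/><item amount=\"40\"/></items>";
        System.out.println(sumAmounts(new StringReader(xml)));
    }
}
```

Because nothing but the total is stored, memory use stays roughly constant regardless of how large the input is.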
