
Does an increasing resident memory size in the JVM indicate a memory leak?

Original 2011-10-04 21:05:05 9 3 java/ memory-management/ jboss/ garbage-collection

I am starting up a JBoss 4.2 server instance with the following command-line options:

-Xms8192m -Xmx8192m -XX:+DisableExplicitGC -XX:MaxPermSize=512m

I have NOT received an OutOfMemoryException, but I would not expect the memory usage to increase if things are being GC'd as their references die. The resident memory usage (as measured with top) starts at ~4.2G and steadily increases over the next few days to a week until it hits 8.4G. I still do not receive an Exception. My concern is that during periods of significant activity (JBoss Messaging processing > 10k msgs/sec) there are processing lags of ~100-700ms that occur every 6-10 seconds. This also seems to correlate with a rise in resident memory usage. This is occurring on a machine with 2 quad-core processors and 32G of memory.

I will be turning on the additional command-line parameters:

-verbosegc -XX:+PrintGCDetails

But I would like to know whether this looks like a garbage collection issue or a memory leak. I have tried to track down a potential memory leak for a couple of weeks and found things in the XML processing that I thought would definitely fix it, but that was a dead end. Does the resident memory usage climb to meet the value of Xmx regardless of whether there is a leak or not (especially with explicit GC disabled)? Are there some other garbage collection parameters that may help if it is indeed not a leak (such as modifying garbage collector types, survivor ratios, or pause targets)?

I know 100-700ms delays do not seem like much, but they have the potential to make a significant difference in this application. Thanks in advance for any help/suggestions you can offer.

3 answers

The definition of a "leak" in Java differs substantially from the definition of a "leak" in, say, C/C++. In C/C++ you get a "leak" when you malloc or new a piece of storage and never subsequently free or delete it.

In Java, of course, you never delete anything; you leave it up to the GC to find stuff that is no longer referenced and free it up.

What can happen, however, is that some complex data structure gets built up, and then built up some more, and then built up some more, sometimes unintentionally.

The most obvious case would be something like a StringBuffer that is used to accumulate log info and is never written out or emptied. But you can also have an (intentionally) long-lived structure in your application where you happen to "park" some (supposedly) short-lived object, but then fail to null the pointer to the "short-lived" object after you're done with it, causing it to effectively become immortal. Some of these things are obvious, some require considerable investigation to figure out.
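To make the "parked object" case concrete, here is a minimal sketch. All names (ParkedObjectLeak, IN_FLIGHT, begin, finishLeaky, finishFixed) are made up for illustration and are not taken from the application in the question. The only difference between the two variants is whether the reference is dropped when the work is done.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration of a Java-style "leak": nothing is malloc'd,
// but a long-lived map keeps supposedly short-lived buffers reachable.
class ParkedObjectLeak {
    // Intentionally long-lived structure: it lives as long as the class is loaded.
    private static final Map<Integer, byte[]> IN_FLIGHT = new HashMap<Integer, byte[]>();

    // Parks a short-lived buffer while a request is being processed.
    static void begin(int requestId) {
        IN_FLIGHT.put(requestId, new byte[1024]);
    }

    // Leaky variant: the request is finished but the reference is never dropped,
    // so the buffer stays reachable and the GC cannot reclaim it.
    static void finishLeaky(int requestId) {
        // deliberately empty
    }

    // Fixed variant: removing the entry makes the buffer unreachable.
    static void finishFixed(int requestId) {
        IN_FLIGHT.remove(requestId);
    }

    // Number of buffers still held by the long-lived map.
    static int parked() {
        return IN_FLIGHT.size();
    }

    static void reset() {
        IN_FLIGHT.clear();
    }

    public static void main(String[] args) {
        for (int i = 0; i < 1000; i++) { begin(i); finishLeaky(i); }
        System.out.println("leaky: still parked = " + parked());
        reset();
        for (int i = 0; i < 1000; i++) { begin(i); finishFixed(i); }
        System.out.println("fixed: still parked = " + parked());
    }
}
```

In a real server the map is usually a cache, a listener list, or a session table, which is why this kind of retention can take a heap dump or profiler to find.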

But in a large server you also tend to have a fairly steady build-up of "stuff" over a period of time, even without such leaks. E.g., if a given application is called on, it may cause some objects to be created (or just classes loaded, with objects they indirectly create), and those objects may "hang around" until the next time the application is called. Things like web page caches fill up. If something like JSP is used, objects for that will be created and "cached" for later use.

But this build-up of "stuff" should show asymptotic behavior, slowly approaching some steady-state value over time. If it continues upward at a steady rate, then you probably DO have a "leak".
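One rough way to apply that test to numbers you collect yourself (for example, heap used right after each full GC) is to compare how fast the early samples grow against the late ones. The sketch below is only a heuristic under that assumption; the class name, the 0.5 ratio, and the sample values in main are invented for illustration and are not measurements from any real server.

```java
// Heuristic sketch: does a series of post-GC heap readings flatten out
// (asymptotic build-up) or keep climbing at the same rate (likely leak)?
class HeapTrend {
    // Average increase per sample over the index range [from, to).
    static double meanStep(long[] samples, int from, int to) {
        return (double) (samples[to - 1] - samples[from]) / (to - 1 - from);
    }

    // True when growth over the second half is flat, negative, or less than
    // `ratio` times the growth over the first half.
    static boolean levelsOff(long[] postGcUsed, double ratio) {
        if (postGcUsed.length < 3) {
            throw new IllegalArgumentException("need at least 3 samples");
        }
        int mid = postGcUsed.length / 2;
        double early = meanStep(postGcUsed, 0, mid + 1);
        double late = meanStep(postGcUsed, mid, postGcUsed.length);
        return late <= 0 || late < early * ratio;
    }

    public static void main(String[] args) {
        // Invented example series, in MB.
        long[] saturating = {100, 180, 230, 260, 275, 282, 286};
        long[] linear = {100, 150, 200, 250, 300, 350, 400};
        System.out.println("saturating levels off: " + levelsOff(saturating, 0.5));
        System.out.println("linear levels off: " + levelsOff(linear, 0.5));
    }
}
```

The series needs to cover a long enough window (days, in the question's case) for the comparison to mean anything; a short window of a saturating curve can look linear.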

Re your GC behavior, it's not unusual to have GC running every few seconds on a busy server. You can play with the tuning parameters to try to "balance" the different GC "tiers", but it's a bit of a black art to do so. And often poor GC performance on a server is simply a matter of having a GC implementation that isn't well designed for servers: GC on a busy server needs to be capable of running in a largely concurrent fashion, and most GC implementations do not do that very well.
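If you want to see how often the collectors actually run and how much time they take without parsing GC logs, the standard java.lang.management API exposes per-collector counters. This is a small sketch; the class and method names (GcActivity, totalCollections, totalCollectionMillis) are my own, and the collector names printed depend on which GC the JVM is running.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Reads cumulative GC counters from the running JVM.
class GcActivity {
    // Total collections across all collectors; a collector that reports -1
    // (count undefined) is ignored.
    static long totalCollections() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long count = gc.getCollectionCount();
            if (count > 0) total += count;
        }
        return total;
    }

    // Approximate accumulated collection time in milliseconds across all collectors.
    static long totalCollectionMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long millis = gc.getCollectionTime();
            if (millis > 0) total += millis;
        }
        return total;
    }

    public static void main(String[] args) {
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.println(gc.getName()
                    + ": collections=" + gc.getCollectionCount()
                    + ", time(ms)=" + gc.getCollectionTime());
        }
    }
}
```

Sampling these counters at two points in time and taking the difference gives collections per interval and time spent per interval, which can be lined up against the 6-10 second lag pattern described in the question.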

I have NOT received an OutOfMemoryException but I would not expect the memory usage to increase if things are being GC'd as their references die.

This is not necessarily correct. It's possible for objects to be promoted into the tenured (old) generation of the heap. This occurs if the object is referenced for more than a "short time". Once tenured, it normally won't be garbage collected until the heap is close to its maximum. So just because the heap memory usage isn't decreasing doesn't mean you have a leak.

The normal memory profile (heap used vs time) looks like a sawtooth: memory usage slowly increases until some threshold (typically max heap), then decreases because of a GC, then slowly increases again.
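Note that the sawtooth is in heap used, which is not the same number as the resident size that top reports. A minimal way to sample the heap itself from inside the JVM is the MemoryMXBean; the class name HeapSampler and the one-second interval below are arbitrary choices for the sketch.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;

// Samples heap usage as seen by the JVM, as opposed to process RSS seen by the OS.
class HeapSampler {
    // Returns {used, committed, max} in bytes; max is -1 if the JVM reports it as undefined.
    static long[] sample() {
        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        return new long[] { heap.getUsed(), heap.getCommitted(), heap.getMax() };
    }

    public static void main(String[] args) throws InterruptedException {
        // Print a few samples one second apart; plot "used" over time to see the sawtooth.
        for (int i = 0; i < 5; i++) {
            long[] s = sample();
            System.out.println("used=" + s[0] + " committed=" + s[1] + " max=" + s[2]);
            Thread.sleep(1000);
        }
    }
}
```

If "used" keeps dropping back to roughly the same floor after collections while the resident size still creeps up toward Xmx, that points away from a leak in the Java heap.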

The only thing I would add is to try connecting a profiler to your application and see how the memory usage behaves. I've successfully profiled instances of WebLogic AS with both JProfiler and YourKit; I haven't tried it with JBoss, but it should be fairly easy.

Can you replicate this behavior in a testing environment (since the profiler is kind of heavy on performance, only do it in the production environment if you're desperate)? If you can, you can see whether the GC is only getting invoked when usage reaches the Xmx threshold (which doesn't mean anything is wrong), and you can explicitly invoke a GC to see how it behaves. When you get a constant increase in memory even with GC calls, that might indicate a problem.
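If you'd rather trigger the collection from code than from the profiler's UI, a sketch along these lines shows used heap before and after a requested GC. The names (ExplicitGcProbe, usedHeap, probe) are made up. Keep in mind that System.gc() is only a request, and with -XX:+DisableExplicitGC (as in the question's startup options) it is ignored, so that flag would have to be dropped in the test environment for this to show anything.

```java
// Requests a GC and reports used heap before and after.
class ExplicitGcProbe {
    // Used heap in bytes, computed from the Runtime counters.
    static long usedHeap() {
        Runtime rt = Runtime.getRuntime();
        return rt.totalMemory() - rt.freeMemory();
    }

    // Returns {usedBefore, usedAfter} around a requested collection.
    static long[] probe() {
        long before = usedHeap();
        System.gc(); // a request only; a no-op under -XX:+DisableExplicitGC
        long after = usedHeap();
        return new long[] { before, after };
    }

    public static void main(String[] args) {
        long[] r = probe();
        System.out.println("used before GC: " + r[0] + " bytes");
        System.out.println("used after GC:  " + r[1] + " bytes");
    }
}
```

Repeating the probe over time and watching whether the "after" value keeps rising is the code-level version of the check described above.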

A decent profiler can tell you what objects are growing more rapidly in number and such, which can help you out tremendously if you do indeed have a "leak".