
64-bit JVM limited to 300GB of memory?

I am attempting to run a Java application on a cluster computing environment (IBM LSF running CentOS release 6.2 Final) that can provide me with up to 1TB of RAM space.

I could create a JVM with up to 300GB of maximum memory (Xmx), although I need more than that (I can provide details, if requested).

However, it seems to be impossible to create a JVM with more than 300GB of maximum memory using the Xmx option. To be more specific, I get the classic error message:

Error occurred during initialization of VM.

Could not reserve enough space for object heap.

The details of my (64-bit) JVM are below:

OpenJDK Runtime Environment (IcedTea6 1.10.6) (rhel-1.43.1.10.6.el6_2-x86_64)

OpenJDK 64-Bit Server VM (build 20.0-b11, mixed mode)

I've also tried with a Java 7 64-bit JVM but I've had exactly the same problem.

Moreover, I tried to create a JVM just to run a HelloWorld.jar, but JVM creation still fails if you ask for more than -Xmx300G, so I don't think it has anything to do with the specific application.


Does anyone have any idea why I cannot create a JVM with more than 300G of max memory?

Can anyone please suggest a solution/workaround?


I can think of a couple of possible explanations:

  • Other applications on your system are using so much memory that there isn't 300GB available right now.

  • There could be a resource limit on the per-process memory size. You can check this using ulimit. (Note that according to this bug, you will get the error message if the per-process resource limit stops the JVM from allocating the heap regions.)

  • It is also possible that this is an "over commit" issue; e.g. if your application is running in a virtual machine and the system as a whole cannot meet the demand because there is too much competition from other virtual machines.


A couple of the other ideas suggested are (IMO) unlikely:

  • Switching the JRE is unlikely to make any difference. I've never heard of or seen arbitrary memory limits in specific 64-bit JVMs.

  • It is unlikely to be due to not having enough contiguous memory. Certainly contiguous physical memory is not required. The only possibility might be contiguous space on the swap device, but I don't recall that being an issue for typical Linux OSes.


Can anyone please suggest a solution/workaround?

  • Check the ulimit.

  • Write a tiny C program that attempts to malloc lots of memory and see how much it can allocate before it fails (a minimal sketch follows this list).

  • Ask the system (or hypervisor) administrator for help.
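
For the second item, here is a minimal sketch of such a probe (my own illustration, assuming Linux; build with gcc -std=c99 -o probe probe.c). It requests progressively larger blocks and touches one byte per page, so the memory is actually committed rather than merely reserved; on an overcommitting kernel an untouched malloc can succeed deceptively.

#include <stdio.h>
#include <stdlib.h>

int main(void) {
    const size_t GB = 1024UL * 1024UL * 1024UL;
    /* Grow the request in 10GB steps until malloc refuses. */
    for (size_t size = 10 * GB; ; size += 10 * GB) {
        char *p = malloc(size);
        if (p == NULL) {
            printf("malloc failed at %zu GB\n", size / GB);
            return 1;
        }
        /* Touch one byte per 4kB page so the pages are really committed. */
        for (size_t off = 0; off < size; off += 4096)
            p[off] = 1;
        printf("allocated and touched %zu GB\n", size / GB);
        free(p);
    }
}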

(edited, see added section on swap space)

SHMMAX and SHMALL

Since you are using CentOS, you may have run into a similar issue with the SHMMAX and SHMALL kernel settings, as described here for configuring the Oracle DB. That same link includes an example calculation for getting and setting the correct SHMALL value.
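
To see what those two settings currently are without going through the Oracle documentation, you can read them straight from /proc. A small sketch of my own (SHMMAX is in bytes, SHMALL in pages):

#include <stdio.h>
#include <unistd.h>

static unsigned long long read_ull(const char *path) {
    unsigned long long v = 0;
    FILE *f = fopen(path, "r");
    if (f != NULL) {
        if (fscanf(f, "%llu", &v) != 1)
            v = 0;
        fclose(f);
    }
    return v;
}

int main(void) {
    unsigned long long shmmax = read_ull("/proc/sys/kernel/shmmax");
    unsigned long long shmall = read_ull("/proc/sys/kernel/shmall");
    unsigned long long page = (unsigned long long)sysconf(_SC_PAGESIZE);
    /* SHMALL counts pages; convert via pages-per-GB to avoid overflow. */
    unsigned long long pages_per_gb = (1ULL << 30) / page;
    printf("SHMMAX: %llu bytes (~%llu GB)\n", shmmax, shmmax >> 30);
    printf("SHMALL: %llu pages (~%llu GB)\n", shmall, shmall / pages_per_gb);
    return 0;
}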

Contiguous memory

Some users have reported that not enough contiguous memory is available; others have said it is irrelevant.

I am not certain whether the JVM on CentOS requires a contiguous block of memory. According to SAS, fragmented memory can prevent your JVM from starting up with a large maximum (Xmx) or initial (Xms) memory setting, but other claims on the internet say it doesn't matter. I tried to prove or disprove that claim on my 48GB Windows workstation and managed to start the JVM with an initial and maximum setting of 40GB. I am pretty sure that no contiguous block of that size was available, but JVMs on different OSes may behave differently, because memory management can differ per OS (i.e., Windows typically hides the physical addresses from individual processes).

Finding the largest contiguous memory block

Use /proc/meminfo to find the largest contiguous memory block available; see the value under VmallocChunk. Here's a guide and explanation of all the values. If the value you see there is smaller than 300GB, try a value that falls right under the value of VmallocChunk.

However, this number is usually higher than the physically available memory (because it is the available virtual memory value), so it may give you a false positive. It is the value you can reserve, but once you start using it, it may require swapping. You should therefore also check the MemFree and Inactive values. Conversely, you can also look at the whole list and see which values do not surpass 300GB.
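
As an illustration (my own sketch; a grep on /proc/meminfo does the same), here is a small program that prints just the fields discussed above:

#include <stdio.h>
#include <string.h>

int main(void) {
    FILE *f = fopen("/proc/meminfo", "r");
    char line[256];
    if (f == NULL) {
        perror("/proc/meminfo");
        return 1;
    }
    /* Keep only the fields relevant to a large heap reservation. */
    while (fgets(line, sizeof line, f) != NULL) {
        if (strncmp(line, "MemFree:", 8) == 0 ||
            strncmp(line, "Inactive:", 9) == 0 ||
            strncmp(line, "SwapFree:", 9) == 0 ||
            strncmp(line, "VmallocChunk:", 13) == 0)
            fputs(line, stdout);
    }
    fclose(f);
    return 0;
}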

Other tuning options you can check for a 64-bit JVM

I am not sure why you seem to hit a memory limit at 300GB. For a moment I thought you might have hit a maximum number of pages. With the default page size of 4kB, 300GB gives 78,643,200 pages, which doesn't look like a well-known magic number. If, for instance, 2^24 were the maximum, then 16,777,216 pages, or 64GB, would be your theoretical allocatable maximum.

However, suppose for the sake of argument that you need larger pages (which, as it turns out, is better for the performance of large-memory Java applications). In that case you should consult this manpage on JBoss, which explains how to use -XX:+UseLargePages and set kernel.shmmax (there it is again), vm.nr_hugepages, and vm.huge_tlb_shm_group (I'm not sure the latter is required).

Stress your system

Others have suggested this already as well. To find out whether the problem lies with the JVM or with the OS, you should stress-test the system. One tool you could use is Stresslinux. In this tutorial you can find some options to use. Of particular interest to you is the following command:

stress --vm 2 --vm-bytes 300G --timeout 30s --verbose

If that command fails or locks up your system, you know that the OS is limiting the use of that amount of memory. If it succeeds, we should try to tweak the JVM so that it can use the available memory.

EDIT Apr 6: check swap space

It is not uncommon for systems with very large amounts of internal memory to use little or no swap space. For many applications this may not be a problem, but the JVM requires the available swap space to be larger than the requested memory size. According to this bug report, the JVM will try to increase the swap space itself; however, as some answers in this SO thread suggested, the JVM may not always be capable of doing so.

Hence: check the currently available swap space with cat /proc/swaps or free and, if it is smaller than 300GB, follow the instructions on this CentOS manpage to increase the swap space for your system.

Note 1: we can deduce from bug report #4719001 that a contiguous block of available swap space is not a necessity. But if you are unsure, remove all swap space and recreate it, which should remove any fragmentation.

Note 2: I have seen several posts like this one reporting 0MB of swap space and still being able to run the JVM. That is probably due to the JVM increasing the swap space itself. It still doesn't hurt to try increasing the swap space by hand to find out whether it fixes your issue.

Premature conclusion

I realize that none of the above is an out-of-the-box answer to your question, but I hope it gives you some pointers to what you can try to get your JVM working. You might also try other JVMs if the problem turns out to be a limit of the one you are currently using, but from what I have read so far, no such limit should be imposed on 64-bit JVMs.

That you get the error right at initialization of the JVM leads me to believe that the problem is not with the JVM, but with the OS not being able to comply with the reservation of the 300GB of memory.

My own tests showed that the JVM can access all virtual memory and doesn't care about the amount of physical memory available. It would be odd if the virtual memory were lower than the physical memory, but the VmallocChunk value should give you a hint in that direction (it is usually much larger).

If you have a look at the FAQ section of the Java HotSpot VM, it mentions that on 64-bit VMs there are only 64 address bits to work with, and hence the maximum Java heap size depends on the amount of physical memory and swap space present on the system.

If you calculate it theoretically, you can address 2^64 bytes (18,446,744,073,709,551,616 bytes), but the limitations above apply.

You have to use the -Xmx option to define the maximum heap size for the JVM. By default, Java uses 64MB + 30% = 83.2MB on 64-bit JVMs.

I tried the command below on my machine and it seemed to work fine.

java -Xmx500g com.test.TestClass

I also tried to define the maximum heap in terabytes, but that doesn't work.

Run ulimit -a as the JVM process's user and verify that your kernel isn't limiting your maximum memory size. You may need to edit /etc/security/limits.conf.
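
If you would rather check those limits from code than from the shell, the same values are exposed through getrlimit(2). A sketch of my own that prints the limits most likely to block a large heap:

#include <stdio.h>
#include <sys/resource.h>

static void show(const char *name, int resource) {
    struct rlimit rl;
    if (getrlimit(resource, &rl) != 0) {
        perror(name);
        return;
    }
    if (rl.rlim_cur == RLIM_INFINITY)
        printf("%s: unlimited\n", name);
    else
        printf("%s: %llu bytes\n", name, (unsigned long long)rl.rlim_cur);
}

int main(void) {
    /* These correspond to ulimit -v, -d and -m respectively. */
    show("RLIMIT_AS   (virtual memory)", RLIMIT_AS);
    show("RLIMIT_DATA (data segment)  ", RLIMIT_DATA);
    show("RLIMIT_RSS  (resident set)  ", RLIMIT_RSS);
    return 0;
}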

According to this discussion, LSF does not pool node memory into a single shared space; you are using something else for that. Read that something's documentation, because it is possible it cannot do what you are asking it to do. In particular, it may not be able to allocate a single contiguous region of memory that spans all the nodes. Usually that's not necessary, as an application will make many calls to malloc. But the JVM, to simplify things for itself, wants to allocate (or reserve) a single contiguous region for the entire heap by effectively calling malloc just once. Or it could be something else related to whatever you are using to emulate a giant shared-memory machine.
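
To illustrate that last point, the startup reservation is essentially one large contiguous anonymous mapping, along the lines of this sketch of my own (the 300GB figure simply mirrors the question's -Xmx; this is an approximation of the behavior, not the JVM's actual code):

#include <stdio.h>
#include <sys/mman.h>

int main(void) {
    size_t heap = 300UL * 1024 * 1024 * 1024;  /* one 300GB request, like -Xmx300G */
    /* PROT_NONE + MAP_NORESERVE: reserve contiguous address space only;
       pages are committed later as the heap actually grows. */
    void *p = mmap(NULL, heap, PROT_NONE,
                   MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0);
    if (p == MAP_FAILED) {
        perror("mmap");  /* roughly the failure behind "could not reserve" */
        return 1;
    }
    printf("reserved %zu GB of contiguous address space at %p\n", heap >> 30, p);
    munmap(p, heap);
    return 0;
}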
