What can I do if I need more memory than fits in the Java heap?

Question

I have a graph algorithm that generates intermediate results associated with different nodes. Currently, I solve this with a ConcurrentHashMap<Node, List<Result>> (I am running multithreaded). First I add new results with map.get(node).add(result), and later I consume all results for a node at once with map.get(node).

However, I need to run on a pretty large graph where the intermediate results won't fit into memory (good old OutOfMemoryError). So I need some way to write the results out to disk, because that's where there is still space.

Having looked at a lot of different "off-heap" maps and caches, as well as MapDB, I figured that none of them is a fit for me. None of them seems to support multimaps (which I guess is what you could call my map) or mutable values (which the list would be). Additionally, MapDB has been very slow for me when trying to create a new collection for every node (even with a custom serializer based on FST).

I can hardly imagine, though, that I am the first and only one to have this problem. All I need is a mapping from a key to a list that I only ever extend or read as a whole. What would an elegant and simple solution look like? Or are there any existing libraries that I can use for this?

Thanks in advance for saving my week :).

EDIT
I have seen many good answers; however, I have two important constraints: I don't want to depend on an external database (e.g. Redis), and I can't influence the heap size.

Answer 1

My recollection is that the JVM runs with a fairly small default maximum heap size. With the -Xmx10000m option you can tell the JVM to run with a 10,000 MB heap (or whatever size you choose). If your underlying OS resources can support a larger heap, that might work.
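
To check which limit the JVM actually received, you can print Runtime.getRuntime().maxMemory(). The following is only a minimal sketch; the class name HeapInfo and the method maxHeapMiB are made up for illustration:

```java
// Hypothetical helper: reports the maximum heap the running JVM will try to use.
final class HeapInfo {

    /** Maximum heap size in MiB, as reported by the JVM itself. */
    static long maxHeapMiB() {
        // maxMemory() returns bytes (Long.MAX_VALUE if there is no inherent limit).
        return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
    }

    public static void main(String[] args) {
        System.out.println("max heap (MiB): " + maxHeapMiB());
    }
}
```

Started as `java -Xmx10000m HeapInfo`, it should report a value close to 10000; the exact number may differ slightly depending on the garbage collector in use.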

Answer 2

You can increase the size of the heap. The heap can even be configured to be larger than the physical memory of your server, as long as the following condition holds:
```
 the size of heap + the size of other applications < the size of physical memory + the size of swap space 
```
For instance, if the physical memory is 4G and the swap space is 4G, the heap size can be configured to 6G. But the program will suffer from page swapping.
You can use a database like Redis. Redis is a key-value database and has a List structure. I think this is the simplest way to solve your problem.
You can compress the Result instances: first serialize the instance, then compress the serialized bytes. Define a class like this:
```
class CompressResult {
    byte[] result;
    //...
}
```
Then replace Result with CompressResult in the map. Note that you have to decompress and deserialize a result again whenever you want to use it.
This works well if the Result class has many fields and is very complicated.
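
The serialize-then-compress step could look roughly like the sketch below. It assumes that Result implements java.io.Serializable and uses Java's built-in serialization together with GZIP; the answer names neither, so both are assumptions, and the methods of, restore and compressedSize are invented for illustration:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Sketch only: holds one value as gzip-compressed, Java-serialized bytes.
final class CompressResult {
    private final byte[] result;

    private CompressResult(byte[] result) {
        this.result = result;
    }

    /** Serializes the value and compresses the serialized bytes. */
    static CompressResult of(Serializable value) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        // Closing the object stream also finishes the GZIP stream.
        try (ObjectOutputStream out =
                 new ObjectOutputStream(new GZIPOutputStream(bytes))) {
            out.writeObject(value);
        }
        return new CompressResult(bytes.toByteArray());
    }

    /** Decompresses and deserializes the stored value. */
    Object restore() throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(
                 new GZIPInputStream(new ByteArrayInputStream(result)))) {
            return in.readObject();
        }
    }

    /** Number of bytes this instance keeps on the heap for the payload. */
    int compressedSize() {
        return result.length;
    }
}
```

With this, the map would hold List<CompressResult> instead of List<Result>, and each entry would be turned back into a Result with a cast on restore(). How much memory this saves depends entirely on how compressible the Result data is, and every read pays the cost of decompression and deserialization.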

2 answers

Answer 1
1 2019-03-15 13:07:32

Answer 2
1 Accepted 2019-03-15 15:40:26
