How to estimate if the JVM has enough free memory for a particular data structure?

I have the following situation: there are a couple of machines forming a cluster. Clients can load data-sets and we need to select the node on which the dataset will be loaded and refuse to load / avoid an OOM error if there is no one machine which could fit the dataset.

What we do currently: we know the entry count in the dataset and estimate the memory to be used as entry count * empirical factor (determined manually). We then check whether this is lower than the free memory (obtained via Runtime.freeMemory()) and, if so, load it (otherwise redo the process on other nodes / report that there is no free capacity).
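
A minimal sketch of that check might look as follows (the factor value, class name and method name are illustrative placeholders, not taken from the actual code):

    // Illustrative sketch of the current admission check; all names and the
    // factor value are placeholders, not taken from the original code.
    public final class LoadAdmission {
        // empirical bytes-per-entry, determined manually and revisited over time
        private static final long EMPIRICAL_FACTOR = 64;

        public static boolean canLoadLocally(long entryCount) {
            long estimatedBytes = entryCount * EMPIRICAL_FACTOR;
            long free = Runtime.getRuntime().freeMemory(); // may under-report until a GC has run
            return estimatedBytes < free;
        }
    }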

The problems with this approach are:

  • the empirical factor needs to be revisited and updated manually
  • freeMemory sometimes may under-report because of some non-cleaned-up garbage (which could be avoided by running System.gc before each such call, but that would slow down the server and could also lead to premature promotion)
  • an alternative would be to "just try to load the dataset" (and back out if an OOM is thrown); however, once an OOM is thrown, you have potentially corrupted other threads running in the same JVM and there is no graceful way of recovering from it.

Are there better solutions to this problem?

The empirical factor can be calculated as a build step and placed in a properties file.

While freeMemory() is almost always less than the amount which would be free after a GC, you can check it to see if the memory is available and call System.gc() if maxMemory() indicates there might be plenty.
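
As a rough sketch of that idea (the names, thresholds and recheck logic are illustrative, not a definitive recipe):

    // Sketch: fall back to an explicit GC only when the uncommitted heap headroom
    // suggests the allocation could plausibly fit. All names are illustrative.
    final class HeapCheck {
        static boolean likelyFits(long requiredBytes) {
            Runtime rt = Runtime.getRuntime();
            long free = rt.freeMemory();                       // free space in the committed heap
            if (requiredBytes < free) {
                return true;                                   // fits without a GC
            }
            long headroom = rt.maxMemory() - rt.totalMemory(); // heap space not yet committed
            if (requiredBytes < free + headroom) {
                System.gc();                                   // use sparingly; see the note below
                return requiredBytes < rt.freeMemory() + (rt.maxMemory() - rt.totalMemory());
            }
            return false;
        }
    }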

NOTE: Using System.gc() in production only makes sense in very rare situations, and in general it is often used incorrectly, resulting in reduced performance and obscuring the real problem.

I would avoid triggering an OOME unless you are running in a JVM you can restart as required.

My solution:

  1. Set Xmx to 90%-95% of the physical machine's RAM if no process other than your program is running. For a 32 GB RAM machine, set Xmx to 27 GB - 28 GB.

  2. Use one of the good GC algorithms - CMS or G1GC - and fine-tune the relevant parameters. I prefer G1GC if you need more than 4 GB of RAM for your application. Refer to these questions if you choose G1GC:

    Agressive garbage collector strategy

    Reducing JVM pause time > 1 second using UseConcMarkSweepGC

  3. Calculate a cap on memory usage yourself instead of checking free memory. Add the used memory and the memory to be allocated, and compare the sum against your own cap, such as 90% of Xmx. If you still have available memory, grant the memory allocation request (a sketch follows this list).
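
A minimal sketch of that check, assuming a cap of 90% of Xmx (the class and method names and the exact threshold are illustrative):

    // Sketch: compare used heap plus the requested allocation against a self-imposed
    // cap instead of trusting freeMemory(). Names and the 90% threshold are illustrative.
    final class MemoryBudget {
        static boolean canAllocate(long bytesToAllocate) {
            Runtime rt = Runtime.getRuntime();
            long used = rt.totalMemory() - rt.freeMemory(); // currently used heap
            long cap = (long) (rt.maxMemory() * 0.9);       // own cap: 90% of Xmx
            return used + bytesToAllocate <= cap;
        }
    }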

An alternative approach is to isolate each data-load in its own JVM. You just predefine each JVM's max-heap-size and so on, and set the number of JVMs per host in such a way that each JVM can take up its full max-heap-size. This will use a bit more resources (it means you can't make use of every last byte of memory by cramming in more low-memory data-loads), but it massively simplifies the problem and reduces the risk of getting it wrong, it makes it feasible to tell when/whether you need to add new hosts, and most importantly, it reduces the impact that any one client can have on all other clients.

With this approach, a given JVM is either "busy" or "available".

After any given data-load completes, the relevant JVM can either declare itself available for a new data-load, or it can just close. (Either way, you'll want to have a separate process to monitor the JVMs and make sure that the right number are always running.)
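
As a hedged illustration of this approach, the controlling process might launch fixed-size workers along these lines (WorkerMain, the 4g heap and the extra flag are assumptions, not part of the original answer):

    import java.io.IOException;

    // Sketch: spawn a worker JVM with a predefined max heap so one data-load cannot
    // affect the others. WorkerMain, the 4g heap and the flag choice are illustrative.
    public final class WorkerLauncher {
        public static Process launchWorker(String datasetId) throws IOException {
            ProcessBuilder pb = new ProcessBuilder(
                    System.getProperty("java.home") + "/bin/java",
                    "-Xmx4g",                               // fixed per-worker heap
                    "-XX:+ExitOnOutOfMemoryError",          // if this worker OOMs, only it dies
                    "-cp", System.getProperty("java.class.path"),
                    "WorkerMain", datasetId);
            pb.inheritIO();                                 // surface worker output in the parent
            return pb.start();
        }
    }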

an alternative would be to "just try to load the dataset" (and back out if an OOM is thrown) however once an OOM is thrown, you potentially corrupted other threads running in the same JVM and there is no graceful way of recovering from it.

There isn't a good way to handle and recover from an OOME in the JVM, but there is a way to react before the OOM happens. Java has java.lang.ref.SoftReference, which is guaranteed to have been cleared before the virtual machine throws an OutOfMemoryError. This fact can be used for early prediction of an OOM. For example, the data load can be aborted if the prediction triggers.

    ReferenceQueue<Object> q = new ReferenceQueue<>();
    SoftReference<Object> reference = new SoftReference<>(new Object(), q);
    q.remove(); // blocks (and can throw InterruptedException) until the GC clears the reference
    // reference cleared under memory pressure - stop the data load immediately

Sensitivity can be tuned with the -XX:SoftRefLRUPolicyMSPerMB flag (for the Oracle JVM). The solution is not ideal; its effectiveness depends on various factors: whether other soft references are used in the code, how the GC is tuned, the JVM version, the weather on Mars... But it can help if you are lucky.
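
One way to turn the snippet above into an early-warning signal is a dedicated watcher thread holding a softly reachable "canary" object; the following is only a sketch, and the class name, canary size and abort-flag mechanism are assumptions:

    import java.lang.ref.ReferenceQueue;
    import java.lang.ref.SoftReference;
    import java.util.concurrent.atomic.AtomicBoolean;

    // Sketch: a softly-reachable "canary". If the GC clears it, memory pressure is
    // high, so loaders poll the flag and abort. All names are illustrative.
    public final class MemoryPressureCanary {
        private final AtomicBoolean pressure = new AtomicBoolean(false);
        private final ReferenceQueue<byte[]> queue = new ReferenceQueue<>();
        // kept as a field so the SoftReference object itself stays strongly reachable
        private final SoftReference<byte[]> canary =
                new SoftReference<>(new byte[1024 * 1024], queue); // ~1 MB canary referent

        public MemoryPressureCanary() {
            Thread watcher = new Thread(() -> {
                try {
                    queue.remove();              // blocks until the GC clears the canary
                    pressure.set(true);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }, "memory-pressure-canary");
            watcher.setDaemon(true);
            watcher.start();
        }

        /** Loaders poll this between batches and abort the load when it returns true. */
        public boolean underPressure() {
            return pressure.get();
        }
    }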

As you rightly noted, using freeMemory will not tell you the amount of memory that can be freed by Java garbage collection. You could run load tests and understand the JVM heap usage and memory allocation/deallocation patterns using tools like JConsole, VisualVM, jstat and the printGCStats option to the JVM. This will give you an idea of how to calculate the empirical factor more accurately; basically, it lets you understand what load pattern your Java application can handle. The next step would be to choose the right GC and tune basic GC settings for better efficiency. This is not a quick solution, but maybe a better one in the long term.

The other way would be to kill your JVM with the -XX:OnOutOfMemoryError="kill -9 %p" JVM setting once the OOM happens, and then write or reuse a simple process monitoring script to bring your JVM back up if it is down.

Clients can load data-sets and we need to select the node on which the dataset will be loaded and refuse to load / avoid an OOM error if there is no one machine which could fit the dataset.

This is a job scheduling problem, i.e. I have finite resources, how do we best utilize them? I'll get to the OOM issue near the end.

We have one of the main factors, i.e. RAM, but solutions to scheduling problems depend on many other factors, e.g....

  1. Are the jobs small or large, i.e. are there hundreds/thousands of them running on a node, or just two or three? Think of the Linux scheduler.

  2. Do they need to complete within a particular time frame? Think of a realtime scheduler.

Given everything we know at the start of a job, can we predict when the job will end within some time frame? If we can predict that on node X we free up 100 MB every 15-20 seconds, we have a way to schedule a 200 MB job on that node, i.e. I'm confident that in 40 seconds I'll have freed up 200 MB of space on that node, and 40 seconds is an acceptable limit for the person or machine submitting the job.

Let's assume that we have a function as follows.

predicted_time predict(long bytes[, factors]); 

The factors are the other things I mentioned above that we would need to take into consideration, and for every application there will be things you can add to suit your scenario.

The factors would be given weights when calculating predicted_time.

predicted_time is the number of milliseconds (it can be any TimeUnit) from now within which this node believes it can service this task; the node giving you the smallest number is the node the job should be scheduled on. You could then use this function as follows, where we have a queue of tasks; in the following code this.nodes[i] represents a JVM instance.

private void scheduleTask() {
    while (WorkEvent()) {
        while (!this.queue.isEmpty()) {
            Task t = this.queue.poll();
            boolean handled = false;
            for (int i = 0; i < this.maxNodes; i++) {
                long predicted_time = this.nodes[i].predict(t);
                if (predicted_time < 0) {
                    // prediction error - put the task back and try again later
                    boolean b = this.queue.offer(t);
                    assert(b);
                    handled = true;
                    break;
                }
                if (predicted_time <= USER_EXPERIENCE_DELAY) {
                    // this node can service the task soon enough - schedule it here
                    this.nodes[i].addTask(t);
                    handled = true;
                    break;
                }
            }
            if (!handled) {
                // no node can service the task within the acceptable delay
                boolean b = this.queue.offer(t);
                assert(b);
                alert_user(t);
            }
        }
    }
}

If predicted_time < 0 we have an error and we reschedule the job; in reality we'd like to know why, but that's not difficult to add. If predicted_time <= USER_EXPERIENCE_DELAY, the job can be scheduled.
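
The answer leaves predict() abstract; one plausible sketch, based purely on the freeing-rate idea above and with all names and the sampling approach being assumptions, could be:

    // Sketch of a per-node predictor built on an observed memory-freeing rate.
    // The Task interface, field names and sampling strategy are illustrative.
    interface Task { long estimatedBytes(); }

    final class Node {
        private volatile long freeBytes;            // updated by periodic sampling of the node
        private volatile double freedBytesPerMs;    // observed rate at which memory is freed

        /** Milliseconds from now until this node expects to fit the task, or -1 if unknown. */
        long predict(Task t) {
            long shortfall = t.estimatedBytes() - freeBytes;
            if (shortfall <= 0) {
                return 0;                           // the task fits right now
            }
            if (freedBytesPerMs <= 0) {
                return -1;                          // no usable rate estimate yet
            }
            return (long) Math.ceil(shortfall / freedBytesPerMs);
        }
    }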

How does this avoid an OOM?

We can gather any statistics we want from our scheduler, e.g. how many jobs of size X were scheduled correctly; the aim would be to reduce the errors and make the scheduler more reliable over time, i.e. reduce the number of times we tell a customer that their job cannot be serviced. What we've done is reduce the problem to something we can statistically improve towards an optimal solution.
