How do I save the state of a Java program and pick it up later?

I am trying to run a Java program that uses the WEKA libraries on a cluster.

This cluster times out submitted jobs after 12 hours, and I can't change that because I am a student and not in charge of the cluster.

What I want to do is save the state of my JVM and reload it later: basically close the program for a time, then pick up where I left off.

Is this possible?

I don't think I can (easily, at least) write the state of the variables in the WEKA objects themselves to a file with an ObjectOutputStream (OOS) and reload them, because I'm using the WEKA libraries and it would be extremely complicated to rewrite the code for these machine learning programs (though that might be what I have to do).
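For reference, the OOS approach mentioned above can be sketched with nothing but the standard library. This is a minimal illustration, not WEKA code: `CheckpointDemo`, `TrainingState`, the file name `checkpoint.ser`, and the placeholder loop are all hypothetical names invented for the example. It assumes the long-running work can be expressed as a loop whose progress lives in one Serializable object.

```java
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;

public class CheckpointDemo {

    /** Hypothetical stand-in for whatever state the long-running job accumulates. */
    static class TrainingState implements Serializable {
        private static final long serialVersionUID = 1L;
        int nextIteration = 0;    // where to resume after a restart
        double runningSum = 0.0;  // placeholder for real intermediate results
    }

    /** Load a previous checkpoint if one exists, otherwise start from scratch. */
    static TrainingState loadOrCreate(Path file) throws IOException, ClassNotFoundException {
        if (!Files.exists(file)) {
            return new TrainingState();
        }
        try (ObjectInputStream in = new ObjectInputStream(Files.newInputStream(file))) {
            return (TrainingState) in.readObject();
        }
    }

    /**
     * Write to a temporary file first and then rename it, so that a job killed
     * in the middle of a write is less likely to leave a truncated checkpoint.
     */
    static void save(TrainingState state, Path file) throws IOException {
        Path tmp = file.resolveSibling(file.getFileName() + ".tmp");
        try (ObjectOutputStream out = new ObjectOutputStream(Files.newOutputStream(tmp))) {
            out.writeObject(state);
        }
        Files.move(tmp, file, StandardCopyOption.REPLACE_EXISTING);
    }

    /** Run until totalIterations is reached, saving a checkpoint every `every` steps. */
    static void run(TrainingState state, Path file, int totalIterations, int every)
            throws IOException {
        while (state.nextIteration < totalIterations) {
            state.runningSum += state.nextIteration; // placeholder for one unit of real work
            state.nextIteration++;
            if (state.nextIteration % every == 0) {
                save(state, file);
            }
        }
        save(state, file); // final checkpoint once the loop completes
    }

    public static void main(String[] args) throws Exception {
        Path file = Paths.get("checkpoint.ser"); // hypothetical checkpoint location
        TrainingState state = loadOrCreate(file);
        run(state, file, 1000, 100);
        System.out.println("finished at iteration " + state.nextIteration);
    }
}
```

If the job is killed and resubmitted, `loadOrCreate` picks up from the last saved iteration instead of starting over. Many WEKA classes implement `java.io.Serializable`, so a finished model can often be written out the same way; whether a half-finished training run can be resumed like this depends on the algorithm.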

I tried using a library called javaflow, which from reading around I thought might accomplish this, but I cannot get it to work. When I try to run its counting example, I am met with this error:

Apr 20, 2016 9:15:12 PM org.apache.commons.javaflow.bytecode.StackRecorder execute
SEVERE: stack corruption. Is class test_javaflow.MyRunnable instrumented for javaflow?
java.lang.IllegalStateException: stack corruption. Is class test_javaflow.MyRunnable instrumented for javaflow?
    at org.apache.commons.javaflow.bytecode.StackRecorder.execute(StackRecorder.java:102)
    at org.apache.commons.javaflow.Continuation.continueWith(Continuation.java:170)
    at org.apache.commons.javaflow.Continuation.startWith(Continuation.java:129)
    at org.apache.commons.javaflow.Continuation.startWith(Continuation.java:102)
    at test_javaflow.Test_Javaflow.main(Test_Javaflow.java:16)

Googling this error comes up with a few pages relating to something called JasperSoft, which I'm fairly certain isn't what I'm looking for.

Have a look at Docker's checkpoint command. It provides the ability to save the current state of a Docker container and then resume it. I've been using it for a similar purpose with a JVM-based system. In my case I use it because initialization takes a long time; by using a checkpoint I can restart the container from a known state multiple times.


 