
How can I share memory between two JVM instances?

I build a huge graph in the JVM (using Scala) and want to reuse it repeatedly while tweaking algorithms. I'd rather not reload it from disk each time. Is there a way to keep it resident in one JVM and connect to it from another, where the algorithms are being developed?

Save your graph to disk, then map it into memory with MappedByteBuffer. Both processes will then use the same physical memory, which is shared through the operating system's page cache.
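As a rough sketch of that idea, assuming the graph can be flattened into a list of (from, to) integer edge pairs: one process writes the pairs into a mapped file, and any other JVM maps the same file read-only. The class name MappedEdges, its methods, and the 8-bytes-per-edge layout are illustrative assumptions, not something from the answer.

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical helper: stores a graph as a flat run of int pairs (from, to).
final class MappedEdges {
    // Run once by the process that builds the graph.
    static void write(Path file, int[][] edges) throws IOException {
        try (FileChannel ch = FileChannel.open(file,
                StandardOpenOption.CREATE, StandardOpenOption.READ,
                StandardOpenOption.WRITE, StandardOpenOption.TRUNCATE_EXISTING)) {
            // Mapping READ_WRITE past the end of the file grows the file.
            MappedByteBuffer buf =
                    ch.map(FileChannel.MapMode.READ_WRITE, 0, (long) edges.length * 8);
            for (int[] e : edges) {
                buf.putInt(e[0]).putInt(e[1]);
            }
            buf.force(); // write the dirty pages back to the file
        }
    }

    // Run by every JVM that wants to read the graph.
    static MappedByteBuffer open(Path file) throws IOException {
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
            // The mapping stays valid after the channel is closed.
            return ch.map(FileChannel.MapMode.READ_ONLY, 0, ch.size());
        }
    }

    static int from(MappedByteBuffer buf, int edgeIndex) {
        return buf.getInt(edgeIndex * 8);
    }

    static int to(MappedByteBuffer buf, int edgeIndex) {
        return buf.getInt(edgeIndex * 8 + 4);
    }
}
```

A single MappedByteBuffer is limited to about 2 GB, so a larger graph would need several mappings over different regions of the file.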

Two JVMs sounds more complicated than it needs to be. Have you considered a kind of "hot deploy" setup, where your main program loads the graph, displays the UI, and then asks for (or automatically looks for) a jar or class file that contains your actual algorithm code? That way your algorithm code runs in the same JVM as your graph, but you don't have to reload the graph just to load a new algorithm implementation.

UPDATE to address the OP's question in the comments:

Here's how you could structure your code so that your algorithms are swappable. It doesn't matter what the various algorithms do, so long as they operate on the same input data. Just define an interface like the following, and have your graph algorithms implement it.

import java.util.Map;

public interface GraphAlgorithm {
  // Map<?, ?> is a placeholder; substitute your graph's actual type.
  void doStuff(Map<?, ?> myBigGraph);
}

If your algorithms display their results in some kind of widget, you could pass that in as well, or have doStuff() return some kind of results object.
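The loading side of such a setup could look roughly like the sketch below, which uses the standard URLClassLoader. HotLoader and load are made-up names, and the directory and class name in the usage comment are placeholders. A fresh loader is created on every call, so a recompiled class with the same name is picked up without restarting the JVM that holds the graph.

```java
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.file.Path;

// Hypothetical helper: loads a fresh copy of an implementation class on each call.
final class HotLoader {
    static <T> T load(Path classesDirOrJar, String className, Class<T> type)
            throws Exception {
        URL[] urls = { classesDirOrJar.toUri().toURL() };
        // The parent is the loader of the shared interface, so the host program
        // and the plugin agree on what that interface type is.
        // The loader is deliberately left open: the returned object may still
        // need it to load further classes lazily.
        URLClassLoader loader = new URLClassLoader(urls, type.getClassLoader());
        Class<?> cls = Class.forName(className, true, loader);
        return type.cast(cls.getDeclaredConstructor().newInstance());
    }
}

// Usage (placeholder path and class name):
//   GraphAlgorithm algo = HotLoader.load(
//           Path.of("build/classes"), "my.NewAlgorithm", GraphAlgorithm.class);
//   algo.doStuff(myBigGraph);
```

For this to reload anything, the algorithm classes must not also be on the main program's classpath; otherwise the parent loader would find them first and the old version would keep being used.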

Have you considered the OSGi platform? It runs in a single JVM, but lets you upgrade the bundles containing your algorithms without restarting the platform. You could then have one long-running bundle holding your huge data structures and short-lived algorithm bundles that access the data.

Using RMI, perhaps? Have one instance act as the server and the rest as clients?

I think it would be much more complicated than reloading from disk.

You can certainly create an interface onto it and expose it via (say) RMI.

My initial thoughts on reading your post, however, are:

  1. Just how big is this graph?
  2. Is it possible to optimise your loading procedure instead?

I know LinkedIn have a vast graph of people and connections that is held in memory all the time and that takes several hours to reload. But I figure that's a truly exceptional case.

If it is expensive to build your graph, maybe you can serialize the object.

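import java.io.ByteArrayOutputStream;
import java.io.ObjectOutputStream;

// You can use a FileOutputStream instead of the ByteArrayOutputStream.
ByteArrayOutputStream bos = new ByteArrayOutputStream();
try (ObjectOutputStream out = new ObjectOutputStream(bos)) {
    out.writeObject(graph); // graph must implement java.io.Serializable
}
byte[] b = bos.toByteArray(); // read only after the object stream is closed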

Then you can rebuild your object from those bytes (or from the file):

import java.io.ByteArrayInputStream;
import java.io.ObjectInputStream;

// try-with-resources closes the stream even if readObject() throws.
try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(b))) {
    Graph graph = (Graph) in.readObject();
    // use graph here
}

Just replace the ByteArrayInputStream with a FileInputStream.
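A minimal sketch of that file-based variant, assuming the graph class and everything it references implement java.io.Serializable. GraphStore, save and load are illustrative names only, and Files.newOutputStream / Files.newInputStream are used here in place of FileOutputStream / FileInputStream.

```java
import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper: writes a serializable graph to a file and reads it back.
final class GraphStore {
    static void save(Serializable graph, Path file) throws IOException {
        // Buffering matters for large object graphs; closing flushes everything.
        try (ObjectOutputStream out = new ObjectOutputStream(
                new BufferedOutputStream(Files.newOutputStream(file)))) {
            out.writeObject(graph);
        }
    }

    static Object load(Path file) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(
                new BufferedInputStream(Files.newInputStream(file)))) {
            return in.readObject(); // caller casts to its own graph type
        }
    }
}
```

Note that this still deserializes the whole graph on every start, so it only helps when building the graph is much more expensive than reading it back.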

If the problem is just to dynamically load and run your code without name clashes, a custom class loader could be enough. For each new run, just load all the class files through a new class loader.

Have you considered testing your algorithms with a small amount of sample data?

Terracotta shares memory across many JVM instances, so you can easily apply clustering to your system.

Woo! Late to the party.

If it's on a local machine, there is Apache DirectMemory, which is similar to mapped byte buffers: http://directmemory.apache.org/

If you want it distributed, give http://hazelcast.org/ a try. It's used by a lot of large projects. Of course, your objects must be serializable.


 