Fast IPC/Socket communication in Java/Python

Two processes (Java and Python) need to communicate in my application. I noticed that the socket communication takes 93% of the run time. Why is communication so slow? Should I be looking for alternatives to socket communication, or can this be made faster?

Update: I discovered a simple fix. It seems the buffered output stream is not really buffered, for some unknown reason. So I now put all data into string buffers in both the client and server processes, and write them to the socket in the flush method.
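The fix described in the update (collect the data in memory, write to the socket once per flush) could look roughly like this on the Python side. This is a minimal sketch: the `BatchingWriter` class and the `demo` function are hypothetical names, not part of the original code. One plausible explanation for the behaviour on the Java side is that a `PrintWriter` constructed with `autoFlush` set to `true` flushes on every `println`, so each line becomes its own write.

```python
import socket


class BatchingWriter:
    """Collects lines in memory and writes them to the socket in one call.

    Hypothetical helper for illustration; not part of the original code.
    """

    def __init__(self, sock):
        self.sock = sock
        self.pending = []  # lines queued since the last flush

    def send(self, msg):
        # No system call here: the line is only appended to a list.
        self.pending.append(msg + "\n")

    def flush(self):
        # One sendall() per cycle instead of one write per line.
        data = "".join(self.pending).encode("ascii")
        self.pending.clear()
        if data:
            self.sock.sendall(data)
        return len(data)


def demo():
    """Send one cycle (100 '+' lines and a '.') over a local socket pair."""
    a, b = socket.socketpair()
    try:
        writer = BatchingWriter(a)
        for _ in range(100):
            writer.send("+")
        writer.send(".")
        sent = writer.flush()
        received = b""
        while len(received) < sent:
            received += b.recv(4096)
        return sent, received
    finally:
        a.close()
        b.close()
```

Calling `demo()` pushes a whole cycle through a single `sendall()`; the same idea applies to the Java server by building the message in a `StringBuilder` and writing it once.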

I'm still interested in an example of the usage of shared memory to exchange data quickly between processes.

Some additional information:

  1. Message size in the application is under 64 KB most of the time.
  2. The server is in Java, the client is written in Python.
  3. Socket IPC is implemented below: it takes 50 cycles sending 200 bytes! This has got to be too high. If I send 2 bytes in 5000 cycles, it takes a lot less time.
  4. Both processes run on one Linux machine.
  5. In the real application about 10 calls to client's iFid.write() are made each cycle.
  6. This is done on a Linux system.

This is the server side:

import java.io.*;
import java.net.ServerSocket;
import java.net.Socket;
import java.util.Date;

public class FastIPC{
    public PrintWriter out;
    BufferedReader in;
    Socket socket = null;
    ServerSocket serverSocket = null;


    public FastIPC(int port) throws Exception{
        serverSocket = new ServerSocket(port);
        socket = serverSocket.accept();
        out = new PrintWriter(new BufferedWriter(new OutputStreamWriter(socket.getOutputStream())), true);
        in = new BufferedReader(new InputStreamReader(socket.getInputStream()));
    }

    public void send(String msg){
        out.println(msg); // send price update to socket
    }

    public void flush(){
        out.flush();
    }

    public String recv() throws Exception{
        return in.readLine();
    }

    public static void main(String[] args){
        int port = 32000;
        try{
            FastIPC fip = new FastIPC(port);
            long start = new Date().getTime();
            System.out.println("Connected.");
            for (int i=0; i<50; i++){
                for(int j=0; j<100; j++)
                    fip.send("+");
                fip.send(".");
                fip.flush();
                String msg = fip.recv();
            }
            long stop = new Date().getTime();
            System.out.println((double)(stop - start)/1000.);
        }catch(Exception e){
            e.printStackTrace(); // report the failure instead of exiting silently
            System.exit(1);
        }
    }
}

And the client side is:

import sys
import socket

class IPC(object):
    def __init__(self):
        self.s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        self.s.connect(("localhost", 32000))
        self.fid = self.s.makefile("rw") # file wrapper to read and write lines
        self.listenLoop() # wait listening for updates from server

    def listenLoop(self):
        fid = self.fid
        print("connected")
        while True:
            while True:
                line = fid.readline()
                if not line: # server closed the connection
                    return
                if line[0]=='.':
                    break
            fid.write('.\n')
            fid.flush()

if __name__ == '__main__':
    st = IPC()

You have a number of options. Since you are using Linux you could use UNIX domain sockets. Or, you could serialise the data as ASCII or JSON or some other format and feed it through a pipe, SHM (shared memory segment), message queue, D-Bus or similar. It's worth thinking about what sort of data you have, as these IPC mechanisms have different performance characteristics. There's a draft USENIX paper with a good analysis of the various trade-offs that is worth reading.
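As a rough illustration of the first option, here is a minimal sketch of a line-based exchange over a UNIX domain socket, with both ends in Python and the server in a thread so the example is self-contained. The socket path and the function names are assumptions made up for this sketch. On the Java side, UNIX domain sockets are available through `SocketChannel` with `UnixDomainSocketAddress` from Java 16 onwards; older versions would need a third-party library.

```python
import os
import socket
import tempfile
import threading


def serve_once(server_sock):
    """Accept one connection and echo lines back until the peer sends '.'."""
    conn, _ = server_sock.accept()
    with conn, conn.makefile("rwb") as fid:
        while True:
            line = fid.readline()
            if not line:  # peer closed the connection
                break
            fid.write(line)
            fid.flush()
            if line.startswith(b"."):
                break


def demo():
    """Round-trip two lines over an AF_UNIX socket in a temporary directory."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "fastipc.sock")  # hypothetical socket path
        server = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        server.bind(path)
        server.listen(1)
        worker = threading.Thread(target=serve_once, args=(server,))
        worker.start()

        client = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        client.connect(path)
        replies = []
        with client, client.makefile("rwb") as fid:
            for msg in (b"+\n", b".\n"):
                fid.write(msg)
                fid.flush()
                replies.append(fid.readline())
        worker.join()
        server.close()
        return replies
```

The only change from the TCP version is the address family and the address itself (a filesystem path instead of a host and port), so the rest of the client code can stay as it is.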

Since you say (in the comments to this answer) that you prefer to use SHM, here are some code samples to start you off. Using the Python posix_ipc library:

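import os        # For os.close() on the SHM file descriptor
import pickle    # For serialising objects into the mapping
import posix_ipc # POSIX-specific IPC
import mmap      # From Python stdlib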

class SharedMemory(object):
    """Python interface to shared memory. 
    The create argument tells the object to create a new SHM object,
    rather than attaching to an existing one.
    """

    def __init__(self, name, size=posix_ipc.PAGE_SIZE, create=True):
        self.name = name
        self.size = size
        if create:
            memory = posix_ipc.SharedMemory(self.name, posix_ipc.O_CREX,
                                            size=self.size)
        else:
            memory = posix_ipc.SharedMemory(self.name)
        self.mapfile = mmap.mmap(memory.fd, memory.size)
        os.close(memory.fd)
        return

    def put(self, item):
        """Put item in shared memory.
        """
        # TODO: Deal with the case where len(item) > size(self.mapfile)
        # TODO: Guard this method with a named semaphore
        self.mapfile.seek(0)
        pickle.dump(item, self.mapfile, protocol=2)
        return

    def get(self):
        """Get a Python object from shared memory.
        """
        # TODO: Deal with the case where len(item) > size(self.mapfile)
        # TODO: Guard this method with a named semaphore
        self.mapfile.seek(0)
        return pickle.load(self.mapfile)

    def __del__(self):
        try:
            self.mapfile.close()
            memory = posix_ipc.SharedMemory(self.name)
            memory.unlink()
        except Exception:
            pass
        return    
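If installing posix_ipc is not an option, a similar experiment is possible with the standard library's `multiprocessing.shared_memory` module (Python 3.8 or later). The sketch below is an assumption-laden variant, not the class above: it adds a 4-byte length prefix so the reader knows how many bytes to unpickle, and it rejects items that do not fit, which addresses one of the TODOs. It still has no semaphore, and the function names are made up for illustration.

```python
import pickle
import struct
from multiprocessing import shared_memory

HEADER = struct.Struct("<I")  # 4-byte little-endian payload length


def put(shm, item):
    """Pickle item into the segment, prefixed by its length."""
    payload = pickle.dumps(item, protocol=2)
    if HEADER.size + len(payload) > shm.size:
        raise ValueError("item does not fit in the shared memory segment")
    HEADER.pack_into(shm.buf, 0, len(payload))
    shm.buf[HEADER.size:HEADER.size + len(payload)] = payload


def get(shm):
    """Read the length prefix, then unpickle exactly that many bytes."""
    (length,) = HEADER.unpack_from(shm.buf, 0)
    return pickle.loads(bytes(shm.buf[HEADER.size:HEADER.size + length]))


def demo():
    """Write through one handle and read through a second one."""
    writer = shared_memory.SharedMemory(create=True, size=4096)
    try:
        reader = shared_memory.SharedMemory(name=writer.name)
        try:
            put(writer, {"price": 101.5, "symbol": "ABC"})
            return get(reader)
        finally:
            reader.close()
    finally:
        writer.close()
        writer.unlink()  # remove the segment once nobody needs it
```

Pickle is only readable from Python, so for a Java peer the payload would have to be something language-neutral, such as JSON text or a fixed binary layout.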

For the Java side you want to create the same class. Despite what I said in the comments, JTux seems to provide the equivalent functionality, and the API you need is in the UPosixIPC class.

The code below is an outline of the sort of thing you need to implement. However, there are several things missing: exception handling is the obvious one, also some flags (find them in UConstant), and you'll want to add in a semaphore to guard the put / get methods. Still, this should set you on the right track. Remember that an mmap or memory-mapped file is a file-like interface to a segment of RAM. So, you can use its file descriptor as if it were the fd of a normal file.

import jtux.*;

class SHM {

    private String name;
    private int size;
    private long semaphore;
    private long mapfile; // File descriptor for mmap file

    /* Lookup flags and perms in your system docs */
    public SHM(String name, int size, boolean create, int flags, int perms) {
        this.name = name;
        this.size = size;
        int shm;
        if (create) {
            flags = flags | UConstant.O_CREAT;
            shm = UPosixIPC.shm_open(name, flags, UConstant.O_RDWR);
        } else {
            shm = UPosixIPC.shm_open(name, flags, UConstant.O_RDWR);
        }
        this.mapfile = UPosixIPC.mmap(..., this.size, ..., flags, shm, 0);
        return;
    }


    public void put(String item) {
        UFile.lseek(this.mapfile, 0, 0); // rewind to the start of the mapping
        UFile.write(item.getBytes(), this.mapfile);
        return;
    }


    public String get() {    
        UFile.lseek(this.mapfile, 0, 0); // rewind to the start of the mapping
        byte[] buffer = new byte[this.size];
        UFile.read(this.mapfile, buffer, buffer.length);
        return new String(buffer);
    }


    public void finalize() {
        UPosixIPC.shm_unlink(this.name);
        UPosixIPC.munmap(this.mapfile, this.size);
    }

}

Some thoughts

  • The server is in Java, the client is written in Python.

An odd combination, but is there any reason one cannot launch the other and communicate via stdin and stdout?
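The stdin/stdout idea can be sketched as follows. Here a Python parent starts a Python child and exchanges lines with it over pipes; in the application from the question the parent would more likely be the Java server starting the Python client (for example with `ProcessBuilder`). The child program and the function name are hypothetical and exist only for this sketch.

```python
import subprocess
import sys

# Child program: echo each line back until a line starting with "." arrives.
CHILD_SOURCE = r"""
import sys
for line in sys.stdin:
    sys.stdout.write(line)
    sys.stdout.flush()
    if line.startswith("."):
        break
"""


def demo():
    """Start the child, exchange two lines over its pipes, and shut it down."""
    child = subprocess.Popen(
        [sys.executable, "-c", CHILD_SOURCE],
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        text=True,
    )
    replies = []
    for msg in ("+\n", ".\n"):
        child.stdin.write(msg)
        child.stdin.flush()  # without this the line may sit in our buffer
        replies.append(child.stdout.readline())
    child.stdin.close()
    child.stdout.close()
    child.wait()
    return replies
```

Both sides have to flush after each message, otherwise each process can end up waiting on data that is still sitting in the other's buffer.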

  • Socket IPC is implemented below: it takes 50 cycles sending 200 bytes! This has got to be too high. If I send 2 bytes in 5000 cycles, it takes a lot less time.

Any call to the OS is going to be relatively slow (latency-wise). Using shared memory can bypass the kernel. If throughput is your issue, I have found you can reach 1-2 GB/s using sockets, provided latency isn't such an issue for you.

  • Both processes run on one Linux machine.

Making shared memory ideal.

  • In the real application about 10 calls to client's iFid.write() are made each cycle.

Not sure why this is the case. Why not build a single structure/buffer and write it once? I would use a direct buffer in NIO to minimise latency. Using character translation is pretty expensive, especially if you only need ASCII.
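On the Python side, building one buffer and writing it once might look like the sketch below. The wire format is an assumption invented for the example (a 4-byte big-endian length followed by ten 8-byte doubles, standing in for the ten separate writes per cycle); nothing in the question specifies it. Packing binary values with `struct` also sidesteps the character encoding step entirely.

```python
import socket
import struct

# Assumed wire format (hypothetical): a 4-byte big-endian length prefix,
# then ten 8-byte doubles, one per value the application wrote separately.
LENGTH = struct.Struct(">I")
BODY = struct.Struct(">10d")


def encode(values):
    """Pack all ten values into a single length-prefixed message."""
    body = BODY.pack(*values)
    return LENGTH.pack(len(body)) + body


def recv_exact(sock, n):
    """Read exactly n bytes; recv() may return fewer than requested."""
    chunks = []
    while n > 0:
        chunk = sock.recv(n)
        if not chunk:
            raise ConnectionError("peer closed the connection")
        chunks.append(chunk)
        n -= len(chunk)
    return b"".join(chunks)


def decode(sock):
    """Read one length-prefixed message and unpack its ten doubles."""
    (size,) = LENGTH.unpack(recv_exact(sock, LENGTH.size))
    return BODY.unpack(recv_exact(sock, size))


def demo():
    """Send one message with a single sendall() and decode it on the peer."""
    a, b = socket.socketpair()
    try:
        values = [float(i) for i in range(10)]
        a.sendall(encode(values))  # one write per cycle instead of ten
        return decode(b)
    finally:
        a.close()
        b.close()
```

Big-endian was chosen here because it matches what Java's `DataInputStream.readInt()` and `readDouble()` expect, so a Java peer could read the same bytes without any conversion.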

  • This is done on a Linux system.

Should be easy to optimise.

I use shared memory via memory-mapped files. This is because I need to record every message for auditing purposes. I get an average latency of around 180 ns round trip sustained for millions of messages, and about 490 ns in a real application.

One advantage of this approach is that if there are short delays, the reader can catch up very quickly with the writer. It also supports restart and replication easily.

This is only implemented in Java, but the principle is simple enough and I am sure it would work in Python as well.
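To give a feel for the principle in Python, here is a heavily simplified sketch of an append-only journal in a memory-mapped file. It is not the format used by Java-Chronicle; the `Journal` class, the record layout (4-byte length, then payload) and the file name are all assumptions made for illustration, and there is no locking or memory-barrier handling, so it only shows the shape of the idea.

```python
import mmap
import os
import struct
import tempfile

HEADER = struct.Struct("<I")  # record length; 0 means "nothing written yet"


class Journal:
    """Append-only record log in a memory-mapped file (illustrative only)."""

    def __init__(self, path, size=1 << 16):
        fd = os.open(path, os.O_RDWR | os.O_CREAT)
        try:
            if os.fstat(fd).st_size < size:
                os.ftruncate(fd, size)  # new bytes read back as zeros
            self.map = mmap.mmap(fd, size)
        finally:
            os.close(fd)
        self.pos = 0  # this handle's own read/write offset

    def append(self, payload):
        if not payload:
            raise ValueError("empty records are not representable")
        end = self.pos + HEADER.size + len(payload)
        if end + HEADER.size > len(self.map):
            raise ValueError("journal full")
        # Payload first, length last, so a reader does not see a non-zero
        # length before the bytes are in place. Real code needs barriers.
        self.map[self.pos + HEADER.size:end] = payload
        HEADER.pack_into(self.map, self.pos, len(payload))
        self.pos = end

    def read_new(self):
        """Return every record appended since this handle last read."""
        records = []
        while True:
            (length,) = HEADER.unpack_from(self.map, self.pos)
            if length == 0:
                return records
            start = self.pos + HEADER.size
            records.append(self.map[start:start + length])
            self.pos = start + length

    def close(self):
        self.map.close()


def demo():
    """One writer and one reader mapping the same file."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "journal.dat")  # hypothetical file name
        writer, reader = Journal(path), Journal(path)
        try:
            writer.append(b"msg-1")
            first = reader.read_new()
            writer.append(b"msg-2")
            writer.append(b"msg-3")
            return first, reader.read_new()  # reader catches up in one call
        finally:
            writer.close()
            reader.close()
```

Because every message stays in the file, a reader that falls behind (or restarts) simply resumes from its last offset, which is the catch-up and restart property described above.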

https://github.com/peter-lawrey/Java-Chronicle
