简体   繁体   中英

How would you grab the latest message from multiple connections to a single ZMQ socket?

I am new to ZMQ and am not sure if what I want is even possible or if I should use another technology.

I would like to have a socket that multiple servers can stream to.

It appears that a ZMQ socket can do this based on this documentation: http://api.zeromq.org/4-0:zmq-setsockopt

How would I implement a ZMQ socket on the receiving end that only grabs the latest message sent from each server?

You can do this with Zmq's PUB / SUB.

The first key thing is that a SUB socket can be connected to multiple PUBlishers. This is covered in Chapter 1 of the guide:

Some points about the publish-subscribe (pub-sub) pattern:

  • A subscriber can connect to more than one publisher, using one connect call each time. Data will then arrive and be interleaved “fair-queued” so that no single publisher drowns out the others.

  • If a publisher has no connected subscribers, then it will simply drop all messages.

  • If you're using TCP and a subscriber is slow, messages will queue up on the publisher. We'll look at how to protect publishers against this using the “high-water mark” later.

So, that means that you can have a single SUB socket on your client. This can be connected to several PUB sockets, one for each server from which the client needs to stream messages.

Latest Message

The "latest message" can be partially dealt with (as I suspect you'd started to find) using high water marks. The ZMQ_RCVHWM option allows the number to be received to be set to 1, though this is an imprecise control.

You also have to consider what it is that is meant by "latest" message; the PUB servers and SUB client will have different views of what this is. For example, when the zmq_send() function on a PUB server returns, the sent message is the one that the PUBlisher would regard as the "latest".

However, over in the client there is no knowledge of this as nothing has yet got down through the PUBlishing server's operating system network stack, nothing has yet touched the Ethernet, etc. So the SUBscribing client's view of the "latest" message at that point in time is whichever message is in ZMQ's internal buffers / queues waiting for the application to read it. This message could be quite old in comparison to the one the PUBlisher has just started sending.

In reality, the "latest" message seen by the client SUBscriber will be dependent on how fast the SUBscriber application runs.

  • Provided it's fast enough to keep up with all the PUBlishers, then every single message the SUBscriber gets will be as close to the "latest" message as it can get (the message will be only as old as the network propagation delays and the time taken to transit through ZMQ's internal protocols, buffers and queues).

  • If the SUBscriber isn't fast enough to keep up, then the "latest" messages it will see will be at least as old as the processing time per message multiplied by the number of PUBlishers. If you've set the receive HWM to 1, and the subscriber is not keeping up, the publishers will try publishing messages but the subscriber socket will keep rejecting them until the subscribed application has cleared out the old message that's caused the queue congestion, waiting for zmq_recv() to be called.

If the subscriber can't keep up, the best thing to do in the subscriber is:

  • have a receiving thread dedicated to receiving messages and dispose of them until processing becomes available

  • have a separate processing thread that does the processing.

  • Have the two threads communicate via ZMQ, using a REQ / REP pattern via an inproc connection.

  • The receiving thread can zmq_poll both the SUB socket connection to the PUBlishing servers and the REP socket connection to the processing thread.

  • If the receiving thread receives a message on the REP socket, it can reply with the next message read from the SUB socket.

  • If it receives a message from the SUB socket with no REPly due, it disposes of the message.

  • The processing thread sends 1 bytes messages (the content doesn't matter) to its REQ socket to request the latest message, and receives the latest message from the PUBlishers in reply.

Or, something like that. That'll keep the messages flowing from PUBlishers to the SUBscriber, thus the SUBscriber always has a message as close to possible as being "the latest" and is processing that as and when it can, disposing of messages it can't deal with.

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM