
How to make ZMQ pub client socket buffer messages while sub server socket is down

Given 2 applications, where application A uses a publisher client to continuously stream data to application B, which has a sub server socket to accept that data, how can we configure the pub client socket in application A such that when B is unavailable (e.g. it is being redeployed or restarted) A buffers all the pending messages, and when B becomes available again the buffered messages go through and the socket catches up with the real-time stream?

In a nutshell, how do we make the PUB client socket buffer messages, up to some limit, while the SUB server is unavailable?

The default behaviour for the PUB client is to drop messages once in the mute state, but it would be great if we could change that to a size-limited buffer. Is this possible with zmq, or do I need to do it at the application level...?

I've tried setting HWM and LINGER on my sockets, but if I'm not wrong they only cover the slow-consumer case, where my publisher is connected to the subscriber but the subscriber is so slow that the publisher starts to buffer messages (HWM will limit the number of those messages)...

I'm using jeromq since I'm targeting the JVM platform.

First of all, welcome to the world of Zen-of-Zero, where latency matters most.

PROLOGUE:

ZeroMQ was designed by Pieter HINTJENS' team of supremely experienced masters, Martin SUSTRIK to be named first. The design was professionally crafted so as to avoid any unnecessary latency. So, asking about having a (limited) persistence? No, sir, not confirmed: the PUB/SUB Scalable Formal Communication Pattern Archetype will not have it built in, precisely because of the added problems and the decreased performance and scalability (add-on latency, add-on processing, add-on memory-management).

If one needs a (limited) persistence (for absent remote-SUB-side agents' connections), feel free to implement it on the app-side, or one may design and implement a new ZMTP-compliant behaviour-pattern Archetype of this kind, extending the ZeroMQ framework, if such work reaches a stable and publicly accepted state, but do not request the high-performance, latency-shaved standard PUB/SUB, having polished its almost linear scalability ad astra, to get modified in this direction. It is definitely not the way to go.

Solution?

The app-side may easily implement your added logic, using dual-pointer circular buffers, working as a sort of (app-side-managed) Persistence-PROXY, yet in front of the PUB-sender.
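
As a hedged illustration only (not part of the original answer), a minimal app-side sketch of such a buffering front-end might look as follows in Kotlin/jeromq, using a simple bounded deque instead of a dual-pointer circular buffer; the BoundedPublisher name, the capacity parameter and the drop-oldest policy are hypothetical design choices, not ZeroMQ features:

import org.zeromq.SocketType
import org.zeromq.ZContext
import java.util.ArrayDeque

// Hypothetical app-side "Persistence-PROXY": keeps at most `capacity` pending
// (topic, payload) pairs in front of the PUB socket, dropping the oldest when full.
class BoundedPublisher(context: ZContext, uri: String, private val capacity: Int = 10_000) {
   private val socket = context.createSocket(SocketType.PUB).apply { connect(uri) }
   private val pending = ArrayDeque<Pair<String, ByteArray>>()

   // Called by the application for every outgoing message.
   fun offer(topic: String, payload: ByteArray) {
      if (pending.size >= capacity) pending.pollFirst() // drop-oldest policy
      pending.addLast(topic to payload)
   }

   // Called when the app-side logic decides the remote SUB is reachable again
   // (e.g. driven by socket_monitor events or an application-level heartbeat).
   fun flush() {
      while (true) {
         val (topic, payload) = pending.pollFirst() ?: return
         socket.sendMore(topic)
         socket.send(payload)
      }
   }
}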

Your design may be successful in squeezing some additional sauce out of the ZeroMQ internal details, in case it also enjoys using the recently made available built-in ZeroMQ socket_monitor component to set up an additional control-layer and receive there a stream of events as seen from "inside" the PUB-side Context-instance, where some additional network- and connection-management-related events may bring more light into your (app-side-managed) Persistence-PROXY.

Yet, be warned that

The zmq_socket_monitor() method supports only connection-oriented transports, that is, TCP, IPC, and TIPC.

so one may straight away forget about this in case any of the ultimately interesting transport-classes { inproc:// | norm:// | pgm:// | epgm:// | vmci:// } was planned to be used.


Heads up!

There are inaccurate, if not wrong, pieces of information from our Community's honourable member smac89, who tried his best to address your additional interest expressed in the comment:

"...zmq optimizes publishing on topics? like if you keep publishing on some 100char long topic rapidly, is it actually sending the topic every time or it maps to some int and sends the int subsequently...?" “...zmq 优化了主题的发布?例如,如果您继续快速发布一些 100char 长的topic它实际上是每次都发送topic还是映射到某个 int 并随后发送 int ......?”

telling you:

"It will always publish the topic. When I use the pub-sub pattern, I usually publish the topic first and then the actual message, so in the subscriber I just read the first frame and ignore it and then read the actual message" “它总是会发布topic.当我使用pub-sub模式时,我通常会先发布topic ,然后再发布实际消息,因此在订阅者中我只是读取第一帧并忽略它,然后再读取实际消息”

ZeroMQ does not work this way. There is no such thing as a "separate" <topic> followed by a <message-body>; rather the opposite.

The TOPIC and the mechanisation of topic-filtering work in a very different way.

1) you never know who .connect()-s:
i.e. one can be almost sure that versions 2.x up to 4.2+ will handle the topic-filtering in a different manner (ZMTP:RFC defines an initial capability-version handshake, to let the Context-instance decide which version of topic-filtering will have to be used):
ver 2.x used to move all messages to all peers, and let all the SUB-sides (of ver 2.x+) be delivered the message (and let the SUB-side Context-instance process the local topic-list filter processing),

whereas
ver 4.2+ is sure to perform the topic-list filter processing on the PUB-side Context-instance (CPU-usage grows, network-transport load does the opposite), so your SUB-side will never be delivered a byte of "useless", read "not-subscribed-to", messages.

2) (you may, but) there is no need to separate a "topic" into the first frame of a thus-implied multi-frame message. Perhaps just the opposite (it is rather an anti-pattern to do this in high-performance, low-latency distributed-system design).

The topic-filtering process is defined and works byte-wise, from left to right, pattern matching each of the topic-list member values against the delivered message payload.
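
To make that byte-wise, left-to-right rule concrete, here is a small illustrative Kotlin sketch of the prefix check the filter conceptually performs (an illustration of the rule only, not the library's internal code):

// Conceptual illustration: a subscription matches when it is a byte-wise
// prefix of the delivered message payload, compared left-to-right.
fun topicMatches(subscription: ByteArray, payload: ByteArray): Boolean {
   if (subscription.size > payload.size) return false
   for (i in subscription.indices) {
      if (subscription[i] != payload[i]) return false
   }
   return true
}

fun main() {
   val sub = "/some/feed".toByteArray()
   println(topicMatches(sub, "/some/feed/prices update".toByteArray())) // true
   println(topicMatches(sub, "/other/feed".toByteArray()))              // false
}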

Adding extra data and extra frame-management processing just and only increases the end-to-end latency and the processing overhead. It is never a good idea to do this instead of proper design work.


EPILOGUE:

There are no easy wins nor any low-hanging fruit in professional design, the less so if performance or ultra-low latency are the design targets.

On the other hand, be sure that the ZeroMQ framework was made with this in mind, and that these efforts were crowned with a stable, ultimately performant and well-balanced set of tools for smart (by design), fast (in operation) and scalable (as hell may envy) signaling/messaging services, which people love to use precisely because of this design wisdom.

Wishing you a happy life with ZeroMQ as it is, and feel free to add any additional set of features "in front of" the ZeroMQ layer, inside your application suite of choice.

As we've discussed in the comments, there is no way for the publisher to buffer messages while it has nothing connected to it; it will simply drop any new messages:

From the docs:

If a publisher has no connected subscribers, then it will simply drop all messages.

This means your buffer needs to live outside of zeromq's care. Your buffer could then be a list, a database, or any other method of storage of your choosing, but you cannot use your publisher for doing that.

The next problem is how to detect that a subscriber has connected/disconnected. This is needed to tell us when we need to start reading from the buffer/filling the buffer.

I suggest using Socket.monitor and listening for ZMQ_EVENT_CONNECTED and ZMQ_EVENT_DISCONNECTED, as these will tell you when a client has connected/disconnected and thus enable you to switch between filling and draining your buffer of choice. Of course, there might be other ways of doing this that do not directly involve zeromq, but that's up to you to decide.
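
As a hedged sketch of that idea with jeromq (assuming the same tcp endpoint as in the question's code; the inproc monitor address and the isSubscriberUp flag are illustrative names, and event-handling details may differ by jeromq version):

import org.zeromq.SocketType
import org.zeromq.ZContext
import org.zeromq.ZMQ

fun main() {
   val context = ZContext(1)
   val pub = context.createSocket(SocketType.PUB)

   // Ask the library to report connect/disconnect events on an inproc endpoint.
   pub.monitor("inproc://pub.monitor", ZMQ.EVENT_CONNECTED or ZMQ.EVENT_DISCONNECTED)
   pub.connect("tcp://localhost:3006")

   // A PAIR socket reads the monitor event stream.
   val monitor = context.createSocket(SocketType.PAIR)
   monitor.connect("inproc://pub.monitor")

   var isSubscriberUp = false
   while (!Thread.currentThread().isInterrupted) {
      val event = ZMQ.Event.recv(monitor) ?: continue
      when (event.event) {
         ZMQ.EVENT_CONNECTED -> isSubscriberUp = true
         ZMQ.EVENT_DISCONNECTED -> isSubscriberUp = false
      }
      // Here the application would switch between filling and draining its
      // own buffer, based on the current value of isSubscriberUp.
      println("${event.address} -> subscriber reachable: $isSubscriberUp")
   }
}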

I'm posting a quick update, since the other two answers (though very informative) were actually wrong, and I don't want others to be misinformed by my accepted answer. Not only can you do this with zmq, it is actually the default behaviour.

The trick is that if your publisher client has never connected to the subscriber server before, it keeps dropping messages (and that is why I was thinking it does not buffer messages). But if your publisher has connected to the subscriber and you restart the subscriber, the publisher will buffer messages until HWM is reached, which is exactly what I asked for... So, in short, the publisher wants to know there is someone on the other end accepting messages; only after that will it buffer messages...

Here is some sample code which demonstrates this (you might need to do some basic edits to compile it).

I used only this dependency: org.zeromq:jeromq:0.5.1.
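
For reference, a hedged sketch of how that dependency might be declared in a Gradle Kotlin DSL build (assuming a standard build.gradle.kts):

dependencies {
   implementation("org.zeromq:jeromq:0.5.1")
}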

zmq-publisher.kt

import org.zeromq.SocketType
import org.zeromq.ZContext

// Note: log(), Msg, telegramMessage(), notInterrupted() and sleepInterruptible()
// are helper utilities from my own codebase, not part of jeromq.
fun main() {
   val uri = "tcp://localhost:3006"
   val context = ZContext(1)
   val socket = context.createSocket(SocketType.PUB)

   socket.hwm = 10000   // high-water mark: cap on messages queued per connected peer
   socket.linger = 0    // discard any pending messages immediately on close
   "connecting to $uri".log()
   socket.connect(uri)

   fun publish(path: String, msg: Msg) {
      ">> $path | ${msg.json()}".log()
      socket.sendMore(path)          // topic / path frame
      socket.send(msg.toByteArray()) // payload frame
   }

   var count = 0

   while (notInterrupted()) {
      val msg = telegramMessage("message : ${++count}")
      publish("/some/feed", msg)
      println()

      sleepInterruptible(1.second)
   }
}

and of course zmq-subscriber.kt


import org.zeromq.SocketType
import org.zeromq.ZContext

fun main() {
   val uri = "tcp://localhost:3006"
   val context = ZContext(1)
   val socket = context.createSocket(SocketType.SUB)

   socket.hwm = 10000
   socket.receiveTimeOut = 250   // recvStr()/recv() return null after 250 ms instead of blocking

   "connecting to $uri".log()
   socket.bind(uri)

   // byte-wise prefix subscription: receives everything published under "/some/feed..."
   socket.subscribe("/some/feed")

   while (true) {
      val path = socket.recvStr() ?: continue   // topic / path frame
      val bytes = socket.recv()                 // payload frame
      val msg = Msg.parseFrom(bytes)
      "<< $path | ${msg.json()}".log()
   }
}

Try running the publisher first without the subscriber; when you then launch the subscriber, you have missed all the messages sent so far... Now, without restarting the publisher, stop the subscriber, wait for some time and start it again.

Here is an example of one of my services actually benefiting from this... This is the structure: [current service]sub:server <= pub:client[service being restarted]sub:server <=* pub:client[multiple publishers]

Because I restart the service in the middle, all the publishers start buffering their messages. The final service, which was observing ~200 messages per second, sees the rate drop to 0 (the remaining 1 or 2 are heartbeats), and then a sudden burst of 1000+ messages comes in, because all the publishers flushed their buffers (the restart took about 5 seconds)... I am actually not losing a single message here...


Note that you must have the subscriber:server <= publisher:client pairing (this way the publisher knows "there is only one place I need to deliver these messages to"). You can try binding on the publisher and connecting on the subscriber, but you will no longer see the publisher buffering messages, simply because it is questionable whether a subscriber that just disconnected did so because it no longer needs the data or because it failed.
