
Multiple Processes Logging to ZeroMQ Appender

I have multiple processes running on multiple machines using log4j (2.11). I need to consolidate the logging messages to be displayed on the front-end, and would like each process to use a ZeroMQ Appender to publish the log messages to a single connection. I will then have one subscriber receiving the messages, performing the consolidation, and then displaying the log messages.

I have a toy application working with one publisher (process logging); however, when multiple processes try to connect to the same endpoint, I receive the "Address already in use" error message. This (most likely) means the log4j ZeroMQ (JeroMQ) appender is doing the "binding," since only one process can bind a zmq socket.
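The "only one bind per endpoint" rule is easy to reproduce with plain JeroMQ, independent of log4j; below is a minimal sketch, assuming a recent JeroMQ version (the endpoint tcp://127.0.0.1:5556 and the class name are placeholders, not anything taken from the appender):

    import org.zeromq.SocketType;
    import org.zeromq.ZContext;
    import org.zeromq.ZMQ;

    // Two PUB sockets trying to bind the same endpoint: the second bind()
    // throws ZMQException ("Address already in use"), which is exactly what
    // happens when several processes each try to bind the appender endpoint.
    public class DoubleBindSketch {
        public static void main(String[] args) {
            try (ZContext ctx = new ZContext()) {
                ZMQ.Socket first = ctx.createSocket(SocketType.PUB);
                first.bind("tcp://127.0.0.1:5556");   // succeeds

                ZMQ.Socket second = ctx.createSocket(SocketType.PUB);
                second.bind("tcp://127.0.0.1:5556");  // fails: Address already in use
            }
        }
    }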

Is there a configuration option to have the log4j ZeroMQ Appender perform a "connect" instead of a "bind", or is there another option available to accomplish the same goal?

There are two possible reasons for "hardware"-rejected .bind()-s

One was already in your post -- under the assumption that the JeroMQ services are indeed trying to rigidly .bind() in places where you would like to .connect() instead. Review the code-base to either confirm that (with a possible fork/mod circumventing any such discomfort in a refactored-service fashion, as sketched below) or reject this alternative, if the code is not the one to blame.
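A minimal sketch of what such a fork/mod could aim for, assuming a modified appender that calls .connect(): the single consolidating subscriber performs the only .bind() in the system, and ZeroMQ does not care which side binds, so any number of PUB sockets may connect to one bound SUB. The class name, endpoint and the in-process publisher are illustrative only:

    import org.zeromq.SocketType;
    import org.zeromq.ZContext;
    import org.zeromq.ZMQ;

    // Reversed topology: the collector SUB binds once, every logging process
    // connects a PUB socket to it (here both sides live in one process only
    // to keep the sketch self-contained and runnable).
    public class ReversedBindSketch {
        public static void main(String[] args) throws Exception {
            try (ZContext ctx = new ZContext()) {
                ZMQ.Socket sub = ctx.createSocket(SocketType.SUB);
                sub.bind("tcp://*:5556");            // the only bind in the whole system
                sub.subscribe(new byte[0]);          // subscribe to all messages

                ZMQ.Socket pub = ctx.createSocket(SocketType.PUB);
                pub.connect("tcp://127.0.0.1:5556"); // what each appender would do instead of bind()

                Thread.sleep(500);                   // let the PUB/SUB subscription propagate
                pub.send("a consolidated log line");
                System.out.println(sub.recvStr());   // prints: a consolidated log line
            }
        }
    }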

But let me mention another possible reason, which was quite often a problem during PoC / prototyping.

Early-stage mock-up tools are prone to crashes (and yes, sometimes they crash peacefully, but sometimes they crash more often or harder than the PoC team is willing to live with).

If, during this phase, the code-base is not (yet) well designed to also handle resource management cleanly (proper use of the explicit aSocket.setsockopt( LINGER, 0 ) (a lethal must in native API versions pre-4.x), {aSocket|aMessage}.close() and Context.term() methods) so that it exits cleanly even after a crash, there are situations where your code has left one or more still-active (not dismantled) Context()-instances, which (still) block the hardware resources, as they have not been allowed to release them yet.

This has sometimes led to a need to reboot the platform, as a deadlocked Context()-instance has no other way to get "vacuum-cleaned" (shame on those who do not fuse/protect the code so that it survives to a final try: except: finally: stage, to enforce graceful termination and a clean release of all allocated resources).
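A sketch of that cleanup discipline in JeroMQ terms, assuming the same placeholder endpoint as above: LINGER set to 0, the socket closed and the context terminated in a finally block, so even an exception mid-run does not leave a live Context() holding the port:

    import org.zeromq.SocketType;
    import org.zeromq.ZContext;
    import org.zeromq.ZMQ;

    // Graceful-termination sketch: whatever happens in the try block, the
    // finally block closes the socket and terminates the context, so no
    // half-dead Context() keeps the hardware resources blocked.
    public class CleanShutdownSketch {
        public static void main(String[] args) {
            ZContext ctx = new ZContext();
            ZMQ.Socket pub = null;
            try {
                pub = ctx.createSocket(SocketType.PUB);
                pub.setLinger(0);                     // do not block on undelivered messages at close
                pub.connect("tcp://127.0.0.1:5556");  // placeholder endpoint
                pub.send("log message");
            } finally {
                if (pub != null) {
                    pub.close();                      // explicit socket close
                }
                ctx.close();                          // destroys remaining sockets, terminates the context
            }
        }
    }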
