
Logstash aggregate filter plugin on multiple logstash nodes

I'm interested in using the Logstash aggregate filter plugin, but I'm wondering how it would work in my case, where I have multiple Logstash nodes.

Also, do I need to set the filter workers to 1 (the -w 1 flag) if I don't mind the events being processed out of sequence?

Update

My use case demands aggregating logs generated by multiple services by a unique trace id. I have no end event; instead I use a fixed window of, say, 3 seconds.

For the aggregate filter to work correctly, you need to send the logs with the information you want to combine to the same Logstash node.

If you are using a shipper like Filebeat, or any other that sends events to multiple Logstash nodes, each node will aggregate only the events that it receives; you can't aggregate events across different Logstash nodes.

It is also recommended to set the number of workers to 1, because the main use of the aggregate filter is to collect information from events belonging to the same unique id (task id, job id, process id) and enrich a final event. If you use more than one worker, your end event could be processed before your start event, for example.
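
For the use case in the update (no end event, a fixed window), the aggregate filter's push_map_as_event_on_timeout option is the usual approach. A minimal sketch, assuming your events carry a trace_id field (the field names here are placeholders; adjust them to your logs):

    filter {
      aggregate {
        task_id => "%{trace_id}"               # group all events sharing the same trace id
        code => "
          map['trace_id'] ||= event.get('trace_id')
          map['messages'] ||= []
          map['messages'] << event.get('message')
          event.cancel()                        # optional: drop the originals, keep only the aggregate
        "
        push_map_as_event_on_timeout => true   # emit the map as a new event when the window closes
        timeout => 3                           # the 3-second window from the question
        timeout_tags => ['aggregated']         # tag the synthetic event so it can be routed
      }
    }

Each timeout flush then emits one synthetic event per trace_id, carrying whatever the code block stored in map, tagged so it can be told apart downstream.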

I feel this can be tackled by a combination of design and config. Thought I'd drop in a few thoughts on the topic, for what it's worth.

  1. Aggregation is limited to a single worker, for obvious reasons. This forces your design to restrict aggregation to a single point in your information flow. Sounds like a hub-and-spoke pattern to me.
  2. If you have multiple logstashes, you will need to conditionally route the events that DO need aggregation to this central logstash. I'd go with adding a "to-be-aggregated" tag (for example) which can be used for flow control (see the sketch after this list).
  3. On the aggregator logstash itself, either implicitly aggregate everything that comes in, or use the above tag to decide what to aggregate and what not to. Obviously, this instance will need to run with pipeline.workers=1 AND pipeline.java_execution=false (see here).
  4. That leaves the rest down to config, which I think leads you back to the specific filter plugin's docs page.
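
To make points 2 and 3 concrete, here is a rough sketch under assumed names (the host, port, and trace_id field are placeholders, not anything from this thread):

    # Edge Logstash: forward only the events tagged for aggregation
    output {
      if "to-be-aggregated" in [tags] {
        tcp {
          host => "central-logstash.example.com"  # placeholder for the central aggregator node
          port => 5000                            # placeholder port
          codec => json_lines
        }
      } else {
        elasticsearch {
          hosts => ["http://elasticsearch:9200"]  # placeholder; wherever untagged events go
        }
      }
    }

    # Central (aggregator) Logstash, run with pipeline.workers: 1
    input {
      tcp {
        port => 5000
        codec => json_lines
      }
    }
    filter {
      aggregate {
        task_id => "%{trace_id}"
        code => "map['count'] ||= 0; map['count'] += 1"
        push_map_as_event_on_timeout => true
        timeout => 3
      }
    }

The tcp output/input pair is just one way to ship events between Logstash nodes; a broker such as Kafka or Redis between the edge nodes and the aggregator would serve the same purpose.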

This is what I am doing. Hope this helps.
