
Apache Flume - Ingesting data from a single message queue with multiple consumers

I am currently developing an Apache Flume agent that ingests data from a single message queue (Solace). Because the messages are large, processing is slow, and there will be a lot of them to ingest, so I am thinking of running multiple agents to consume them. The challenge is that multiple agents might take the same message, resulting in duplicates in the sink (the landing bucket). This could happen if one agent is still processing a message (not yet acknowledged) and another agent takes that same message from the queue. Please share any similar experience or ideas for solving this. Thanks.

You can use a non-exclusive queue to distribute messages among your multiple agents (consumers) in a round-robin fashion.

There won't be any duplicates unless some underlying error occurs (such as one of the consumers disconnecting), which causes the Solace message broker to redeliver messages that were delivered but not acknowledged to another consumer.
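To make the delivery behaviour described above concrete, here is a toy in-memory model of it. This is only a sketch, not the Solace API: `ToyQueue`, `Message`, and the agent names are hypothetical, and a real broker's round-robin also depends on things like consumer flow control that are ignored here.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Message:
    msg_id: str
    redelivered: bool = False  # stand-in for the JMS redelivered flag


@dataclass
class ToyQueue:
    """Toy non-exclusive queue: round-robin delivery, redelivery on disconnect."""
    consumers: list = field(default_factory=list)
    pending: deque = field(default_factory=deque)
    unacked: dict = field(default_factory=dict)  # msg_id -> (consumer, message)
    next_idx: int = 0

    def publish(self, msg_id):
        self.pending.append(Message(msg_id))

    def deliver_all(self):
        """Hand each pending message to exactly one consumer, in rotation."""
        deliveries = []
        while self.pending and self.consumers:
            msg = self.pending.popleft()
            consumer = self.consumers[self.next_idx % len(self.consumers)]
            self.next_idx += 1
            self.unacked[msg.msg_id] = (consumer, msg)
            deliveries.append((consumer, msg.msg_id, msg.redelivered))
        return deliveries

    def ack(self, msg_id):
        self.unacked.pop(msg_id, None)

    def disconnect(self, consumer):
        """Requeue the consumer's unacknowledged messages, flagged as redelivered."""
        self.consumers.remove(consumer)
        for msg_id, (owner, msg) in list(self.unacked.items()):
            if owner == consumer:
                del self.unacked[msg_id]
                msg.redelivered = True
                self.pending.append(msg)


queue = ToyQueue(consumers=["agent-1", "agent-2"])
for i in range(4):
    queue.publish(f"m{i}")

first_round = queue.deliver_all()   # m0, m2 -> agent-1; m1, m3 -> agent-2
queue.ack("m1")
queue.ack("m3")                     # agent-2 acknowledges its messages
queue.disconnect("agent-1")         # agent-1 drops while holding m0 and m2
second_round = queue.deliver_all()  # m0 and m2 go to agent-2, flagged redelivered
```

In the normal path every message has exactly one owner, so no duplicates arise. Only the messages that `agent-1` held without acknowledging are handed out again, and they arrive with the redelivered flag set.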

In that case, the JMS message is marked as redelivered, and your application has to perform some logic to handle the possibly duplicate message.
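One possible shape for that logic is an idempotent write that checks for an existing copy only when the message is flagged as redelivered. In JMS the flag is read with `Message.getJMSRedelivered()`; the sketch below models it as a plain boolean so that it runs without a broker. `DedupingSink`, `Message`, and `msg_id` are hypothetical names, and the sketch assumes each message carries a stable unique ID (for example an application-assigned key) that the landing bucket can be looked up by.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Message:
    msg_id: str        # assumed stable, unique ID (e.g. an application key)
    body: str
    redelivered: bool  # stand-in for the JMS redelivered flag


class DedupingSink:
    """Writes each message once, checking for duplicates only on redelivery."""

    def __init__(self):
        # Stand-in for the landing bucket, keyed by message ID.
        self.written = {}

    def handle(self, msg):
        # A first-time delivery cannot be a duplicate, so the lookup is
        # only paid for messages the broker flagged as redelivered.
        if msg.redelivered and msg.msg_id in self.written:
            return "skipped"
        self.written[msg.msg_id] = msg.body
        return "written"


sink = DedupingSink()
results = [
    sink.handle(Message("m0", "payload", redelivered=False)),
    sink.handle(Message("m0", "payload", redelivered=True)),   # already landed
    sink.handle(Message("m1", "other", redelivered=True)),     # never landed
]
```

With several agents, the lookup has to go against storage that all agents share (the bucket itself or an external index), since an in-process dictionary like the one above is only visible to a single agent.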


While not strictly necessary, it's probably also a good idea to think about handling "poison" messages: messages that the application cannot process successfully and that could cause a redeliver-fail, redeliver-fail, redeliver-fail loop.

You can do this by ensuring that your messages are dead-message-queue eligible, configuring a dead-message-queue, and adjusting the "Max Redelivery" setting on the queue.

This ensures that "poison" messages are eventually moved to the dead-message-queue instead of being redelivered to your application indefinitely.
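The effect of a redelivery limit can be illustrated with a small simulation. This is a toy model rather than broker configuration: `process_with_limit` and its parameters are hypothetical, and it assumes the limit counts redeliveries after the first delivery attempt, which may not match the broker's exact accounting.

```python
from collections import Counter, deque


def process_with_limit(messages, handler, max_redelivery):
    """Deliver messages to handler; dead-letter those that keep failing."""
    queue = deque((msg, 0) for msg in messages)  # (message, redeliveries so far)
    processed, dead = [], []
    while queue:
        msg, redeliveries = queue.popleft()
        try:
            handler(msg)
            processed.append(msg)
        except Exception:
            if redeliveries >= max_redelivery:
                dead.append(msg)  # limit reached: move to the dead-message-queue
            else:
                queue.append((msg, redeliveries + 1))  # redeliver later
    return processed, dead


attempts = Counter()


def handler(msg):
    attempts[msg] += 1
    if msg == "poison":
        raise ValueError("cannot process this message")


processed, dead = process_with_limit(["ok-1", "poison", "ok-2"], handler, 3)
```

Without the limit, the `poison` message would cycle through the queue forever. With it, the healthy messages are processed and the failing one is set aside after a bounded number of attempts, where it can be inspected separately.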
