
Apache Storm: Track tuples by unique ID from Source Spout to Final Bolt

I want a method of uniquely identifying tuples throughout a whole Storm topology, so that each tuple can be tracked from Spout to the final Bolt.

The way I understand it is when passing a unique message id with an emit from a spout for example:

String msgID = UUID.randomUUID().toString();
// emits a line from user tasks with msg id
outputCollector.emit(new Values(task), msgID);

As I understand it, this ID is returned to the Spout when the tuple is acked. (Can this be simulated earlier, to get the passed-in ID back at any point?) However, calling getMessageId() on a tuple, for example:

inputTuple.getMessageId()

returns a new message ID generated for the tuple, not the one passed in at the Spout. Reference: https://groups.google.com/forum/#!topic/storm-user/xBEqMDa-RZs

Questions

1) Is there a way to get the value of tuple.getMessageId() at the point where the collector emits the tuple?

2) Alternatively, can the message ID passed in at the spout be retrieved from the tuple at any spout or bolt in the topology?

End solution: I want to be able to set an ID on a tuple when it is emitted, and then be able to identify that tuple again at any point in the Storm topology.

Or will the unique message ID that my system tracks have to be passed as a field value on each output of every spout and bolt?

Thanks

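The system-generated ID cannot be accessed at the producer; it is only available at the consumer, via tuple.getMessageId(). To track tuples the way you want, you need to (as you suggested yourself) add the ID as a regular field value of the tuple and copy it to the corresponding output tuple in every bolt.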
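As a rough illustration of carrying the ID as a regular field, here is a minimal sketch in plain Java. It deliberately does not use the Storm API: TrackedTuple, spoutEmit, upperCaseBolt and splitBolt are hypothetical stand-ins for a tuple, a spout and two bolts, and only show how the ID is copied from each input to each output.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.UUID;

// Plain-Java stand-in for a topology; none of these types are Storm classes.
class TrackedTupleSketch {

    // Hypothetical tuple that carries the tracking ID as an ordinary field.
    static final class TrackedTuple {
        final String trackingId;
        final String payload;

        TrackedTuple(String trackingId, String payload) {
            this.trackingId = trackingId;
            this.payload = payload;
        }
    }

    // "Spout": generates the ID once and stores it in the tuple itself.
    static TrackedTuple spoutEmit(String task) {
        return new TrackedTuple(UUID.randomUUID().toString(), task);
    }

    // "Bolt": transforms the payload and copies the ID to the output tuple.
    static TrackedTuple upperCaseBolt(TrackedTuple in) {
        return new TrackedTuple(in.trackingId, in.payload.toUpperCase());
    }

    // "Bolt" that fans out: every output tuple inherits the input's ID.
    static List<TrackedTuple> splitBolt(TrackedTuple in) {
        List<TrackedTuple> out = new ArrayList<>();
        for (String word : in.payload.split(" ")) {
            out.add(new TrackedTuple(in.trackingId, word));
        }
        return out;
    }

    public static void main(String[] args) {
        TrackedTuple source = spoutEmit("process user tasks");
        for (TrackedTuple t : splitBolt(upperCaseBolt(source))) {
            System.out.println(t.trackingId + " -> " + t.payload);
        }
    }
}
```

In a real topology the same idea would mean declaring the ID as one of the output fields of every spout and bolt and reading it from each input tuple before emitting.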

Several parts to this answer. First, as you correctly point out, it's up to you to come up with a unique ID in your spout for each tuple you emit. Second, if you want to access that ID anywhere in your topology then add that ID to the composite Tuple emitted by the Spout. Third (just for completeness), if there's anything in your emitted tuple that you'll need to know when handling an ack or a fail in your Spout then add that information as part of a composite value that makes up your message ID.

Just as an example, I usually use the Tuple itself as the message ID, too, when emitting a tuple from a spout:

outputCollector.emit(myTuple, myTuple);

This might be overkill, but at least I have access to all of the information in the tuple everywhere.
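The third point, putting whatever the spout needs on ack or fail into the message ID itself, might look roughly like the sketch below. It is plain Java, not the Storm API: MessageId, the attempt counter, MAX_ATTEMPTS and replayQueue are all hypothetical names chosen for illustration, and the retry policy is just one possible choice.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.UUID;

// Plain-Java stand-in for a spout's fail handling; not Storm classes.
class CompositeMessageIdSketch {

    // Hypothetical composite message ID: tracking ID plus what a replay needs.
    static final class MessageId {
        final String trackingId;
        final String task;
        final int attempt;

        MessageId(String trackingId, String task, int attempt) {
            this.trackingId = trackingId;
            this.task = task;
            this.attempt = attempt;
        }
    }

    static final int MAX_ATTEMPTS = 3; // assumed retry limit

    // Tuples waiting to be emitted again after a failure.
    final Deque<MessageId> replayQueue = new ArrayDeque<>();

    // Builds the message ID that would accompany the first emit of a task.
    MessageId firstEmit(String task) {
        return new MessageId(UUID.randomUUID().toString(), task, 1);
    }

    // On fail, the message ID alone is enough to schedule a retry:
    // no separate lookup table of pending tuples is required.
    void fail(MessageId id) {
        if (id.attempt < MAX_ATTEMPTS) {
            replayQueue.add(new MessageId(id.trackingId, id.task, id.attempt + 1));
        }
    }
}
```

The trade-off is the one mentioned above: the message ID gets larger, but the spout does not have to remember anything else about tuples that are still in flight.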
