Can Apache Storm spouts communicate with each other?

I have a directory that another process drops files into.

Our current Storm implementation reads this directory, selects the oldest file, and opens a reader on it. The reader is held as a field within the spout, so each call to nextTuple() emits a single line from the file. Once the spout has finished reading the file, it closes the reader and opens a new reader on the next file.

To increase throughput, one idea is to have multiple spouts reading multiple files at once. Since these spouts would be competing for the same files in the same directory, is there a way for spouts to communicate so they can negotiate which files each one reads? (Or to have an overall manager that allocates files to spouts?)

The directory and files are stored in, and read from, HDFS.

I don't think there is an out-of-the-box way to make two spouts communicate with each other. However, you could try storm-signals: https://github.com/ptgoetz/storm-signals

It provides a BaseSignalSpout that relies on ZooKeeper to send messages between Storm components.
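
If all the spouts need is to agree on who reads which file, one option that avoids messaging altogether is to split the files deterministically by name. Below is a minimal sketch of that idea, not something taken from storm-signals or from Storm itself: FilePartitioner, ownsFile and filesFor are hypothetical names made up for illustration. It assumes each spout instance knows its own task index and the total number of tasks for the spout component; in Storm these should be available in open() through TopologyContext.getThisTaskIndex() and TopologyContext.getComponentTasks(context.getThisComponentId()).size().

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper (not part of Storm): decides which spout task owns a file. */
final class FilePartitioner {
    private FilePartitioner() {}

    /** Returns true if fileName belongs to the task at taskIndex out of totalTasks. */
    static boolean ownsFile(String fileName, int taskIndex, int totalTasks) {
        if (totalTasks <= 0 || taskIndex < 0 || taskIndex >= totalTasks) {
            throw new IllegalArgumentException("invalid task index or task count");
        }
        // floorMod keeps the bucket non-negative even when hashCode() is negative.
        return Math.floorMod(fileName.hashCode(), totalTasks) == taskIndex;
    }

    /** Filters a directory listing down to the files this task should read, keeping order. */
    static List<String> filesFor(List<String> listing, int taskIndex, int totalTasks) {
        List<String> mine = new ArrayList<>();
        for (String name : listing) {
            if (ownsFile(name, taskIndex, totalTasks)) {
                mine.add(name);
            }
        }
        return mine;
    }
}
```

Each spout would then list the directory as it does today, keep only the names for which ownsFile(...) is true, and pick the oldest of those. Since every task applies the same rule, no two tasks pick the same file. The catch is that the assignment shifts if the number of spout tasks changes, and an uneven spread of file names gives an uneven load, so signalling or a central allocator may still be the better fit if you need dynamic balancing.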

Hope this helps!
