Storm: Spout for reading data from a port

I need to write a Storm spout that reads data from a port, and I wanted to know whether that is logically possible.

With that in mind, I designed a simple topology with one spout and one bolt. The spout gathers HTTP requests sent using wget, and the bolt displays each request. Nothing more.

My spout structure is as follows:

public class ProxySpout extends BaseRichSpout{
         //The O/P collector
         SpoutOutputCollector sc;
         //The socket
         Socket clientSocket;
         //The server socket
         ServerSocket sc;

         public ProxySpout(int port){
            try{
                this.sc=new ServerSocket(port);
            }catch(IOException ex){
                //Handle it
            }
         }

         public void nextTuple(){
            try{
                InputStream ic=clientSocket.getInputStream();
                byte[] b=new byte[8196];
                int len=ic.read(b);

                sc.emit(new Values(b));
            }catch(IOException ex){
                //Handle it
            }
         }
}

I have implemented the rest of the methods too.

When I turn this into a topology and run it, I get an error when I send the first request:

java.lang.RuntimeException: java.io.NotSerializableException: java.net.Socket

I just need to know whether there is something wrong with the way I am implementing this spout. Is it even possible for a spout to collect data from a port? Or for a spout to act as an instance of a proxy?

Edit

Got it working.

The code is:

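   public class ProxySpout extends BaseRichSpout{
         //The O/P collector
         static SpoutOutputCollector _collector;
         //The socket
         static Socket _clientSocket;
         static ServerSocket _serverSocket;
         //Not static: the port has to be serialized with the spout to reach the worker
         int _port;

         public ProxySpout(int port){
            _port=port;
         }

         public void open(Map conf,TopologyContext context, SpoutOutputCollector collector){
            _collector=collector;
            try{
                _serverSocket=new ServerSocket(_port);
            }catch(IOException ex){
                //Handle it
            }
         }

         public void nextTuple(){
            try{
                //Wait for the next incoming request
                _clientSocket=_serverSocket.accept();
                InputStream incomingIS=_clientSocket.getInputStream();
                byte[] b=new byte[8196];
                int len=incomingIS.read(b);
                _collector.emit(new Values(b));
            }catch(IOException ex){
                //Handle it
            }
         }
   }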

As per @Shaw's suggestion, I initialized _serverSocket in the open() method, and _clientSocket is obtained in the nextTuple() method, which listens for requests.

I don't know the performance metrics of this one, but it works. :-)
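
One thing to watch with the version above: accept() and read() both block, while Storm's ISpout documentation asks for nextTuple() to be non-blocking. A possible way around that is to accept connections on a background thread and hand the bytes over through a queue. The sketch below is plain Java with no Storm dependency; QueuedPortReader and its methods are hypothetical names for illustration, not part of the original code or of Storm.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical helper: reads from a port on a background thread.
public class QueuedPortReader implements AutoCloseable {
    private final BlockingQueue<byte[]> pending = new LinkedBlockingQueue<>();
    private final ServerSocket serverSocket;

    public QueuedPortReader(int port) throws IOException {
        serverSocket = new ServerSocket(port); // port 0 picks any free port
        Thread acceptor = new Thread(this::acceptLoop, "port-acceptor");
        acceptor.setDaemon(true);
        acceptor.start();
    }

    // Blocks in accept()/read() here, so the caller of poll() never has to.
    private void acceptLoop() {
        while (!serverSocket.isClosed()) {
            try (Socket client = serverSocket.accept()) {
                byte[] buffer = new byte[8196];
                int len = client.getInputStream().read(buffer);
                if (len > 0) {
                    pending.add(Arrays.copyOf(buffer, len)); // keep only the bytes read
                }
            } catch (IOException e) {
                // accept() fails once the server socket is closed;
                // the loop condition then ends the thread.
            }
        }
    }

    public int localPort() {
        return serverSocket.getLocalPort();
    }

    // Non-blocking: returns null when nothing has arrived yet.
    public byte[] poll() {
        return pending.poll();
    }

    @Override
    public void close() throws IOException {
        serverSocket.close();
    }

    public static void main(String[] args) throws Exception {
        try (QueuedPortReader reader = new QueuedPortReader(0)) {
            try (Socket sender = new Socket(InetAddress.getLoopbackAddress(), reader.localPort())) {
                sender.getOutputStream().write("ping".getBytes(StandardCharsets.UTF_8));
            }
            byte[] received = null;
            for (int i = 0; i < 200 && received == null; i++) {
                received = reader.poll();
                if (received == null) {
                    Thread.sleep(10); // give the acceptor thread time to read
                }
            }
            System.out.println(received == null ? "nothing received"
                    : new String(received, StandardCharsets.UTF_8));
        }
    }
}
```

In a spout, open() could create such a reader and nextTuple() could emit whatever poll() returns, returning immediately when it gives null.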

In the constructor, just assign the variables. Instantiate the ServerSocket in the open() method (the spout counterpart of a bolt's prepare()), and do not write any new ... in the constructor. Also rename your variables: you have two variables named sc.

public class ProxySpout extends BaseRichSpout{

    int port;

    public ProxySpout(int port){
        this.port = port;
    }

    public void open(Map conf, TopologyContext context, SpoutOutputCollector collector)  {
        //new ServerSocket
    }

    public void nextTuple() {

    }

    public void declareOutputFields(OutputFieldsDeclarer declarer) {

    }
}

If you put it in the open() method (prepare() for a bolt), it is only called once the spout has already been deployed, so the socket never needs to be serialized. It is also called only once per lifetime of the spout, so it is not inefficient.
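
The lifecycle described here can be imitated in plain Java: build the object, serialize and deserialize it (standing in for deployment), and only then open the socket. The sketch below assumes nothing from Storm; PortSource, its open() and next() methods, and the demo helper are hypothetical stand-ins for the spout, open() and nextTuple().

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class PortSourceDemo {

    // Hypothetical stand-in for the spout; not a Storm class.
    static class PortSource implements Serializable {
        private final int port;                      // plain configuration, serialized
        private transient ServerSocket serverSocket; // skipped by serialization

        PortSource(int port) {
            this.port = port;
        }

        // Plays the role of open(): runs once, after deserialization.
        void open() throws IOException {
            serverSocket = new ServerSocket(port);
        }

        int localPort() {
            return serverSocket.getLocalPort();
        }

        // Plays the role of nextTuple(): one connection, one read, as in the question.
        byte[] next() throws IOException {
            try (Socket client = serverSocket.accept();
                 InputStream in = client.getInputStream()) {
                byte[] buffer = new byte[8196];
                int len = in.read(buffer);
                return len <= 0 ? new byte[0] : Arrays.copyOf(buffer, len);
            }
        }

        void close() throws IOException {
            serverSocket.close();
        }
    }

    // Serialize and deserialize, standing in for shipping the spout to a worker.
    static PortSource roundTrip(PortSource source) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(source);
        }
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            return (PortSource) in.readObject();
        }
    }

    // Sends a message to the opened source over loopback and returns what it read.
    static String demo(String message) throws Exception {
        PortSource deployed = roundTrip(new PortSource(0)); // port 0 = any free port
        deployed.open();
        try {
            try (Socket sender = new Socket(InetAddress.getLoopbackAddress(), deployed.localPort())) {
                sender.getOutputStream().write(message.getBytes(StandardCharsets.UTF_8));
                sender.getOutputStream().flush();
            }
            return new String(deployed.next(), StandardCharsets.UTF_8);
        } finally {
            deployed.close();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(demo("GET / HTTP/1.1"));
    }
}
```

Marking the socket field transient is one option; leaving it unassigned (null) until open() runs achieves the same thing, since a null field serializes without trouble.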
