
How do I integrate a Python bolt into a Java topology for Apache Storm?

I was trying to integrate a simple Python bolt into an already configured topology built with Apache Storm and the Storm Crawler SDK. I was following the instructions provided here.

But I keep getting this error:

java.lang.Exception: Shell Process Exception: Traceback (most recent call last):
  File "D:\<PATH>\storm.py", line 217, in run
    tup = readTuple()
  File "D:\<PATH>\storm.py", line 74, in readTuple
    cmd = readCommand()
  File "D:\<PATH>\storm.py", line 67, in readCommand
    msg = readMsg()
  File "D:\<PATH>\storm.py", line 42, in readMsg
    return json_decode(msg[0:-1])
  File "D:\<PATH>\storm.py", line 30, in <lambda>
    json_decode = lambda x: json.loads(x)
  File "C:\Users\akumar\AppData\Local\Continuum\Anaconda2\lib\json\__init__.py", line 339, in loads
    return _default_decoder.decode(s)
  File "C:\Users\akumar\AppData\Local\Continuum\Anaconda2\lib\json\decoder.py", line 364, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "C:\Users\akumar\AppData\Local\Continuum\Anaconda2\lib\json\decoder.py", line 382, in raw_decode
    raise ValueError("No JSON object could be decoded")
ValueError: No JSON object could be decoded

    at org.apache.storm.task.ShellBolt.handleError(ShellBolt.java:227) [storm-core-1.2.1.jar:1.2.1]
    at org.apache.storm.task.ShellBolt.access$1100(ShellBolt.java:72) [storm-core-1.2.1.jar:1.2.1]
    at org.apache.storm.task.ShellBolt$BoltReaderRunnable.run(ShellBolt.java:348) [storm-core-1.2.1.jar:1.2.1]
    at java.lang.Thread.run(Unknown Source) [?:1.8.0_171]
96398 [Thread-40] ERROR o.a.s.t.ShellBolt - Halting process: ShellBolt died. Command: [python, D:/<PATH>/ClassifyBolt.py], ProcessInfo pid:12708, name:classify exitCode:0, errorString: 
java.lang.RuntimeException: org.apache.storm.multilang.NoOutputException: Pipe to subprocess seems to be broken! No output read.
Serializer Exception:


    at org.apache.storm.utils.ShellProcess.readShellMsg(ShellProcess.java:127) ~[storm-core-1.2.1.jar:1.2.1]
    at org.apache.storm.task.ShellBolt$BoltReaderRunnable.run(ShellBolt.java:330) [storm-core-1.2.1.jar:1.2.1]
    at java.lang.Thread.run(Unknown Source) [?:1.8.0_171]
96399 [Thread-40] ERROR o.a.s.d.executor - 
java.lang.RuntimeException: org.apache.storm.multilang.NoOutputException: Pipe to subprocess seems to be broken! No output read.
Serializer Exception:


    at org.apache.storm.utils.ShellProcess.readShellMsg(ShellProcess.java:127) ~[storm-core-1.2.1.jar:1.2.1]
    at org.apache.storm.task.ShellBolt$BoltReaderRunnable.run(ShellBolt.java:330) [storm-core-1.2.1.jar:1.2.1]
    at java.lang.Thread.run(Unknown Source) [?:1.8.0_171]

I was trying to add the bolt to the sample crawler example provided by the storm-crawler website. It looks to me as if the Python bolt is not receiving the stream from the previous component in the topology.

Can anyone help?

This is probably caused by your bolt's Python process writing log output to stdout. By default, multilang Storm bolts use the process's stdin/stdout to communicate with the Java process, so any stray text on stdout ends up inside the message stream and can no longer be parsed as JSON.
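To illustrate why stray output matters, here is a minimal sketch of the multilang framing (a JSON document followed by a line containing `end`) and one way to keep helper output off stdout. The names `send_msg`, `read_msg`, `stdout_to_stderr` and `noisy_helper` are hypothetical and only mimic what `storm.py` does; this is not the code shipped with Storm, and it assumes the offending output comes from a `print` somewhere in the bolt's own code path.

```python
import contextlib
import json
import sys


def send_msg(obj, out):
    """Write one multilang-style message: a JSON document, then an 'end' line."""
    out.write(json.dumps(obj) + "\nend\n")
    out.flush()


def read_msg(inp):
    """Collect lines up to the 'end' marker and decode them as one JSON document."""
    lines = []
    for line in inp:
        line = line.rstrip("\r\n")
        if line == "end":
            break
        lines.append(line)
    # Any extra text that slipped into the stream makes this raise ValueError.
    return json.loads("\n".join(lines))


@contextlib.contextmanager
def stdout_to_stderr():
    """Temporarily send print() output to stderr so stdout stays protocol-only."""
    saved = sys.stdout
    sys.stdout = sys.stderr
    try:
        yield
    finally:
        sys.stdout = saved


def noisy_helper():
    # Hypothetical stand-in for library code that prints diagnostics.
    print("loading model...")
    return "spam"


def classify_quietly():
    # Whatever the helper prints goes to stderr, not into the message stream.
    with stdout_to_stderr():
        return noisy_helper()
```

In a real bolt, the same idea applies: wrap calls that may print (or replace the prints with stderr or file logging) so that only protocol messages reach stdout.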

Using pystorm as your Python bolt infrastructure would allow you to use different streams for this communication, such as UNIX named pipes or sockets, and free up stdout for the usual logging.

