
Using Python to send message to Flume Avro Source

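I would like to write a Python program that converts a stream of JSON documents to Avro and streams them into Flume (so that I can send them on to Solr and Parquet).

I am looking at an example that uses the Python avro library, which claims to implement the Avro RPC protocol: https://github.com/phunt/avro-rpc-quickstart/blob/master/src/main/python/send_message.py

However, when I try to send the example to the Flume Avro server, it just seems to close the connection. For example: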

$ ./atest.py jnkjn kjnkjn e3e3
Have requester
About to request... REQUEST>Ú­òs±3ô8RÍsÊT¿ÌQÚ­òs±3ô8RÍsÊT¿Ìsend
jnkjn
     kjnkje3e3<
RESPONSE><
Traceback (most recent call last):
  File "atest.py", line 35, in <module>
    print("Result: " + requestor.request('send', params))
  ...
  File "/usr/lib64/python2.6/httplib.py", line 991, in getresponse
    response.begin()
  File "/usr/lib64/python2.6/httplib.py", line 392, in begin
    version, status, reason = self._read_status()
  File "/usr/lib64/python2.6/httplib.py", line 356, in _read_status
    raise BadStatusLine(line)
httplib.BadStatusLine

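I had to add a debug print statement to httplib to see that the response is simply empty, or that the connection has been closed.

Am I even on the right track by pointing the avro Python library at the Flume Avro Source? Do they even use the same protocol?

I am running the latest CDH 5.1 stack.

On checking the Flume logs, I noticed that a very specific error is thrown every time I try to connect: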

2014-09-16 16:35:15,745 INFO org.apache.avro.ipc.NettyServer: [id: 0x0633c6d1, /192.168.150.84:38516 => /172.31.1.204:19999] OPEN
2014-09-16 16:35:15,745 INFO org.apache.avro.ipc.NettyServer: [id: 0x0633c6d1, /192.168.150.84:38516 => /172.31.1.204:19999] BOUND: /172.31.1.204:19999
2014-09-16 16:35:15,746 INFO org.apache.avro.ipc.NettyServer: [id: 0x0633c6d1, /192.168.150.84:38516 => /172.31.1.204:19999] CONNECTED: /192.168.150.84:38516
2014-09-16 16:35:15,747 INFO org.apache.avro.ipc.NettyServer: [id: 0x0633c6d1, /192.168.150.84:38516 :> /172.31.1.204:19999] DISCONNECTED
2014-09-16 16:35:15,747 INFO org.apache.avro.ipc.NettyServer: [id: 0x0633c6d1, /192.168.150.84:38516 :> /172.31.1.204:19999] UNBOUND
2014-09-16 16:35:15,747 INFO org.apache.avro.ipc.NettyServer: [id: 0x0633c6d1, /192.168.150.84:38516 :> /172.31.1.204:19999] CLOSED
2014-09-16 16:35:15,747 INFO org.apache.avro.ipc.NettyServer: Connection to /192.168.150.84:38516 disconnected.
2014-09-16 16:35:15,747 WARN org.apache.avro.ipc.NettyServer: Unexpected exception from downstream.
org.apache.avro.AvroRuntimeException: Excessively large list allocation request detected: 539959368 items! Connection closed.
    at org.apache.avro.ipc.NettyTransportCodec$NettyFrameDecoder.decodePackHeader(NettyTransportCodec.java:167)
    at org.apache.avro.ipc.NettyTransportCodec$NettyFrameDecoder.decode(NettyTransportCodec.java:139)
    at org.jboss.netty.handler.codec.frame.FrameDecoder.callDecode(FrameDecoder.java:422)
    at org.jboss.netty.handler.codec.frame.FrameDecoder.messageReceived(FrameDecoder.java:303)
    at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:268)
    at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:255)
    at org.jboss.netty.channel.socket.nio.NioWorker.read(NioWorker.java:84)
    at org.jboss.netty.channel.socket.nio.AbstractNioWorker.processSelectedKeys(AbstractNioWorker.java:471)
    at org.jboss.netty.channel.socket.nio.AbstractNioWorker.run(AbstractNioWorker.java:332)
    at org.jboss.netty.channel.socket.nio.NioWorker.run(NioWorker.java:35)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
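
A possible reading of the number in that exception, offered as a sketch rather than a confirmed diagnosis: the traceback shows the Python client going through httplib, so it is speaking HTTP, while the Flume side is an org.apache.avro.ipc.NettyServer. If the Netty frame decoder treats the first 8 bytes it receives as two big-endian 32-bit integers (a serial number followed by a list size; this layout is an assumption about NettyTransportCodec, not something stated in the post), then the start of an HTTP request line would be misread as a huge list size. The request line `POST / HTTP/1.1` used below is likewise an assumption about what the client sends.

```python
import struct

# List size reported in the AvroRuntimeException in the Flume log.
reported = 539959368

# Re-encode the number as a big-endian unsigned 32-bit integer and
# look at the raw bytes it corresponds to.
raw = struct.pack(">I", reported)
print(raw)

# Assumed first bytes of the HTTP request sent by the Python client.
request_line = b"POST / HTTP/1.1\r\n"

# Assumed pack header layout: 4-byte serial, then 4-byte list size.
serial, list_size = struct.unpack(">II", request_line[:8])
print(list_size == reported)
```

The reported size is exactly the four bytes ` / H` read as an integer, which is what follows `POST` in such a request line. That would be consistent with the two ends using different transports (HTTP on the client, Netty framing on the server), though the post itself does not confirm this.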

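When I ran into this problem, it was caused by the Flume channel's buffer size being too small. Adjusting the channel's buffer size resolved it.

For the memory channel, the property is: byteCapacityBufferPercentage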
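
For reference, a minimal sketch of where that property would go in a Flume agent configuration. The agent name `a1`, the channel name `c1`, and all values are hypothetical and only illustrate the placement; the answer does not say which value it used.

```properties
# Hypothetical agent "a1" with a memory channel "c1"; values are illustrative.
a1.channels = c1
a1.channels.c1.type = memory
a1.channels.c1.capacity = 10000
a1.channels.c1.transactionCapacity = 1000
# Percentage of headroom kept between byteCapacity and the estimated total
# size of events in the channel (Flume documents a default of 20).
a1.channels.c1.byteCapacityBufferPercentage = 40
```

The agent needs to pick up the changed configuration before the new value takes effect.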
