Deserializing Avro is slow
I am running a performance test in Java that compares several serialization formats, including Avro, Protobuf, and Thrift.
The test deserializes a byte-array message with 30 fields of type long, 1,000,000 times. The result for Avro is not good.
Protobuf and Thrift take around 2000 milliseconds on average, but Avro takes 9000 milliseconds.
The documentation advises reusing the decoder, so I wrote the code as follows.
byte[] bytes = readFromFile("market.avro");
long begin = System.nanoTime();
DatumReader<Market> userDatumReader = new ReflectDatumReader<>(Market.class);
InputStream inputStream = new SeekableByteArrayInput(bytes);
BinaryDecoder reuse = DecoderFactory.get().binaryDecoder(inputStream, null);
Market marketReuse = new Market();
for (int i = 0; i < loopCount; i++) {
    inputStream = new SeekableByteArrayInput(bytes);
    BinaryDecoder decoder = DecoderFactory.get().binaryDecoder(inputStream, reuse);
    userDatumReader.read(marketReuse, decoder);
}
long end = System.nanoTime() - begin;
System.out.println("avro loop " + loopCount + " times: " + (end * 1d / 1000 / 1000));
I don't think Avro should be that slow, so I believe I am doing something wrong, but I am not sure what. Am I doing the 'reuse' the wrong way?
Is there any advice for Avro performance testing? Thanks in advance.
Took me a while to figure this one out. But apparently
DecoderFactory.get().binaryDecoder
is the culprit: it creates a buffer of 8 KB every time it is invoked. And this buffer is not re-used, but reallocated on every invocation. I don't see any reason why there is a buffer involved in the first place.
The saner alternative is to use DecoderFactory.get().directBinaryDecoder
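As a rough illustration of that suggestion, here is a minimal sketch of the loop rewritten around directBinaryDecoder. It is not the asker's exact setup: the Market class is not shown in the question, so the sketch substitutes a generic record with 30 long fields (the schema, the field names f0 to f29, and the class name DirectDecoderSketch are all assumptions), and it uses GenericDatumReader rather than ReflectDatumReader. It assumes Avro is on the classpath.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;

import org.apache.avro.Schema;
import org.apache.avro.SchemaBuilder;
import org.apache.avro.generic.GenericData;
import org.apache.avro.generic.GenericDatumReader;
import org.apache.avro.generic.GenericDatumWriter;
import org.apache.avro.generic.GenericRecord;
import org.apache.avro.io.BinaryDecoder;
import org.apache.avro.io.BinaryEncoder;
import org.apache.avro.io.DecoderFactory;
import org.apache.avro.io.EncoderFactory;

public class DirectDecoderSketch {

    static final int FIELD_COUNT = 30;

    // Hypothetical stand-in for the Market class: a record with 30 long fields.
    static Schema buildSchema() {
        SchemaBuilder.FieldAssembler<Schema> fields =
                SchemaBuilder.record("Market").fields();
        for (int i = 0; i < FIELD_COUNT; i++) {
            fields = fields.requiredLong("f" + i);
        }
        return fields.endRecord();
    }

    // Produces one binary-encoded message, replacing readFromFile("market.avro").
    static byte[] encodeSample(Schema schema) throws IOException {
        GenericRecord record = new GenericData.Record(schema);
        for (int i = 0; i < FIELD_COUNT; i++) {
            record.put("f" + i, Long.valueOf(i));
        }
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        BinaryEncoder encoder = EncoderFactory.get().binaryEncoder(out, null);
        new GenericDatumWriter<GenericRecord>(schema).write(record, encoder);
        encoder.flush();
        return out.toByteArray();
    }

    // Decodes the same message loopCount times through a direct (unbuffered) decoder.
    static GenericRecord decodeLoop(Schema schema, byte[] bytes, int loopCount)
            throws IOException {
        GenericDatumReader<GenericRecord> reader = new GenericDatumReader<>(schema);
        ByteArrayInputStream in = new ByteArrayInputStream(bytes);
        BinaryDecoder decoder = null;
        GenericRecord reuse = null;
        for (int i = 0; i < loopCount; i++) {
            in.reset(); // rewind the stream to the first byte
            // Passing the previous decoder lets the factory reconfigure it
            // instead of constructing a new one.
            decoder = DecoderFactory.get().directBinaryDecoder(in, decoder);
            reuse = reader.read(reuse, decoder);
        }
        return reuse;
    }

    public static void main(String[] args) throws IOException {
        Schema schema = buildSchema();
        byte[] bytes = encodeSample(schema);
        int loopCount = 1_000_000;
        // Reader and schema setup happen before the clock starts.
        long begin = System.nanoTime();
        decodeLoop(schema, bytes, loopCount);
        long elapsedNanos = System.nanoTime() - begin;
        System.out.println("avro loop " + loopCount + " times: "
                + (elapsedNanos / 1_000_000.0) + " ms");
    }
}
```

Since the input here is already a byte array, the binaryDecoder(byte[] bytes, BinaryDecoder reuse) overload of DecoderFactory may also be worth measuring. Timings from a single unwarmed run like this are only indicative.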