I am running a performance test in Java comparing several serialization formats, including Avro, Protobuf, and Thrift.
The test deserializes a byte-array message containing 30 long fields 1,000,000 times. The result for Avro is not good:
Protobuf and Thrift take around 2000 milliseconds on average, but Avro takes 9000 milliseconds.
The documentation advises reusing the decoder, so I wrote the code as follows.
byte[] bytes = readFromFile("market.avro");
long begin = System.nanoTime();
DatumReader<Market> userDatumReader = new ReflectDatumReader<>(Market.class);
InputStream inputStream = new SeekableByteArrayInput(bytes);
// Created once so it can be passed as the 'reuse' argument below.
BinaryDecoder reuse = DecoderFactory.get().binaryDecoder(inputStream, null);
Market marketReuse = new Market();
for (int i = 0; i < loopCount; i++) {
    inputStream = new SeekableByteArrayInput(bytes);
    BinaryDecoder decoder = DecoderFactory.get().binaryDecoder(inputStream, reuse);
    userDatumReader.read(marketReuse, decoder);
}
long end = System.nanoTime() - begin;
System.out.println("avro loop " + loopCount + " times: " + (end * 1d / 1000 / 1000));
I don't think Avro should be that slow, so I suspect I am doing something wrong, but I am not sure what. Am I using 'reuse' incorrectly?
Is there any advice for Avro performance testing? Thanks in advance.
Took me a while to figure this one out, but apparently
DecoderFactory.get().binaryDecoder
is the culprit: it allocates an 8 KB internal buffer every time it is invoked, and that buffer is reallocated rather than reused on every invocation. I don't see any reason why a buffer is involved in the first place.
The saner alternative is to use DecoderFactory.get().directBinaryDecoder, which reads directly from the underlying stream without that buffer.
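A sketch of the fix, with binaryDecoder swapped for directBinaryDecoder in the questioner's loop. To keep the example self-contained, Market is reduced to a single long field and the bytes are produced in memory with the reflect writer instead of being read from market.avro; DirectDecoderDemo and its helper methods are illustrative names, not part of the original code.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;

import org.apache.avro.io.BinaryEncoder;
import org.apache.avro.io.Decoder;
import org.apache.avro.io.DecoderFactory;
import org.apache.avro.io.EncoderFactory;
import org.apache.avro.reflect.ReflectDatumReader;
import org.apache.avro.reflect.ReflectDatumWriter;

public class DirectDecoderDemo {

    // Stand-in for the Market class from the question (one long field
    // instead of 30, just to keep the sketch short).
    public static class Market {
        public long price;
    }

    // Serialize one record to a byte array with the reflect writer.
    static byte[] serialize(Market m) throws IOException {
        ReflectDatumWriter<Market> writer = new ReflectDatumWriter<>(Market.class);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        BinaryEncoder encoder = EncoderFactory.get().binaryEncoder(out, null);
        writer.write(m, encoder);
        encoder.flush();
        return out.toByteArray();
    }

    // Decode the same bytes loopCount times. directBinaryDecoder reads
    // straight from the stream and skips the internal buffer that
    // binaryDecoder allocates on each call.
    static Market decodeLoop(byte[] bytes, int loopCount) throws IOException {
        ReflectDatumReader<Market> reader = new ReflectDatumReader<>(Market.class);
        Market reuse = new Market();
        for (int i = 0; i < loopCount; i++) {
            Decoder decoder = DecoderFactory.get()
                    .directBinaryDecoder(new ByteArrayInputStream(bytes), null);
            reuse = reader.read(reuse, decoder);
        }
        return reuse;
    }

    public static void main(String[] args) throws IOException {
        Market m = new Market();
        m.price = 42L;
        byte[] bytes = serialize(m);

        long begin = System.nanoTime();
        Market decoded = decodeLoop(bytes, 1_000_000);
        long elapsedMs = (System.nanoTime() - begin) / 1_000_000;
        System.out.println("direct decoder, 1000000 loops: " + elapsedMs
                + " ms, price=" + decoded.price);
    }
}
```

Passing the previous decoder as the second argument to directBinaryDecoder would additionally let Avro reuse the decoder object itself, but even with null this avoids the per-iteration buffer allocation.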