Fastest possible Javascript object serialization with Google V8

I need to serialize moderately complex objects with anywhere from one to hundreds of mixed-type properties.

JSON was used originally, then I switched to BSON which is marginally faster.

Encoding 10000 sample objects

JSON:        1807 ms
BSON:        1687 ms
MessagePack: 2644 ms (JS, modified for BinaryF)
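
For anyone wanting to reproduce this kind of measurement, a minimal timing harness might look like the sketch below. The sample object shape and the names `makeSample` and `timeEncode` are hypothetical (the original benchmark objects are not shown), and only the built-in JSON encoder is timed; another encoder could be passed to `timeEncode` in the same way.

```javascript
// Hypothetical sample object; the objects used in the original benchmark are not shown.
function makeSample(i) {
  return {
    id: i,
    name: 'object-' + i,
    active: i % 2 === 0,
    score: i * 1.5,
    tags: ['a', 'b', 'c'],
    nested: { created: 1300000000000 + i, note: 'h\u00e9llo w\u00f6rld' },
  };
}

// Run `encode` over every sample; return elapsed milliseconds and total output size.
function timeEncode(encode, samples) {
  const start = process.hrtime.bigint();
  let size = 0;
  for (const s of samples) {
    size += encode(s).length; // consume the result so the work is observable
  }
  const elapsedMs = Number(process.hrtime.bigint() - start) / 1e6;
  return { elapsedMs, size };
}

const samples = Array.from({ length: 10000 }, (_, i) => makeSample(i));
const jsonResult = timeEncode((o) => JSON.stringify(o), samples);
console.log('JSON: ' + jsonResult.elapsedMs.toFixed(1) + ' ms, ' + jsonResult.size + ' chars');
```

A single pass like this is noisy; repeating the run a few times after a warm-up pass gives the JIT a chance to settle before numbers are compared.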

I want an order-of-magnitude speedup; serialization is having a ridiculously bad impact on the rest of the system.

Part of the motivation to move to BSON is the requirement to encode binary data, so JSON is (now) unsuitable. And because JSON simply skips the binary data present in the objects, it is "cheating" in those benchmarks.

Profiled BSON performance hot-spots

  • (unavoidable?) conversion of UTF16 V8 JS strings to UTF8.
  • malloc and string ops inside the BSON library
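
To make the first hot-spot concrete, here is a small sketch using Node's `Buffer` (an assumption for illustration; the BSON encoder in question does this work in C++). It only shows the size trade-off between the two encodings: UTF-8 output is variable-width and needs a transcoding pass, while UTF-16LE output is a fixed two bytes per code unit. V8 may also store pure Latin-1 strings in a one-byte form internally, so UTF-16 output is not always a plain copy either.

```javascript
const text = 'h\u00e9llo \u4e16\u754c'; // "héllo 世界": 8 UTF-16 code units

// UTF-8: 1 byte per ASCII character, 2 for é, 3 for each CJK character.
const utf8 = Buffer.from(text, 'utf8');

// UTF-16LE: always 2 bytes per code unit.
const utf16 = Buffer.from(text, 'utf16le');

console.log(utf8.length, utf16.length); // 13 16
console.log(utf16.toString('utf16le') === text); // true
```

For mostly-ASCII data, UTF-16 output is about twice the size of UTF-8, so any encoding-time saving has to be weighed against the larger payload.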

The BSON encoder is based on the Mongo BSON library.

A native V8 binary serializer might be wonderful, yet as JSON is native and quick to serialize, I fear even that might not provide the answer. Perhaps my best bet is to optimize the heck out of the BSON library, or write my own, and figure out a far more efficient way to pull strings out of V8. One tactic might be to add UTF16 support to BSON.
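
On the native-serializer idea: newer Node.js releases expose V8's own value serializer through the built-in `v8` module. The sketch below assumes such a Node.js version, and the object shape is hypothetical. It shows only that binary data survives the round trip. Whether it beats BSON for these objects would have to be measured, and the output format is V8-specific rather than a cross-language interchange format.

```javascript
const v8 = require('v8');

// Hypothetical object mixing plain values with binary data.
const original = {
  id: 42,
  name: 'sample',
  blob: new Uint8Array([1, 2, 3]),
};

const encoded = v8.serialize(original); // returns a Buffer
const decoded = v8.deserialize(encoded);

console.log(encoded.length, decoded.name, Array.from(decoded.blob));
```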

So I'm here for ideas, and perhaps a sanity check.

Edit

Added MessagePack benchmark. This was modified from the original JS to use BinaryF.

The C++ MessagePack library may offer further improvements; I may benchmark it in isolation to compare directly with the BSON library.

I wrote an article (2020) and a benchmark comparing binary serialization libraries in JavaScript.

The following formats and libraries are compared:

  • Protocol Buffer: protobuf-js, pbf, protons, google-protobuf
  • Avro: avsc
  • BSON: bson
  • BSER: bser
  • JSBinary: js-binary

Based on the current benchmark results I would rank the top libraries in the following order (higher values are better, measurements are given as x times faster than JSON):

  1. avsc: 10x encoding, 3-10x decoding
  2. js-binary: 2x encoding, 2-8x decoding
  3. protobuf-js: 0.5-1x encoding, 2-6x decoding
  4. pbf: 1.2x encoding, 1.0x decoding
  5. bser: 0.5x encoding, 0.5x decoding
  6. bson: 0.5x encoding, 0.7x decoding

I did not include msgpack in the benchmark, as it is currently slower than the built-in JSON library according to its NPM description.

For details, see the full article.

For serialization/deserialization, protobuf is pretty tough to beat. I don't know if you can switch out the transport protocol, but if you can, protobuf should definitely be considered.

Take a look at all the answers to Protocol Buffers versus JSON or BSON.

The accepted answer chooses thrift. It is, however, slower than protobuf. I suspect it was chosen for ease of use (with Java), not speed. These Java benchmarks are very telling. Of note:

  • MongoDB-BSON 45042
  • protobuf 6539
  • protostuff/protobuf 3318

The benchmarks are in Java, but I'd imagine that you can achieve speeds near the protostuff implementation of protobuf, i.e. roughly 13.6 times faster than BSON. In the worst case (if for some reason Java is just better at serialization), you should do no worse than the plain unoptimized protobuf implementation, which runs roughly 6.9 times faster.
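
The multipliers are simply ratios of the three figures listed above, assuming those figures are timings where lower is better (the units are not given here). A quick check:

```javascript
// Figures quoted from the Java benchmarks (assumed: lower is better).
const timings = { bson: 45042, protobuf: 6539, protostuff: 3318 };

const vsProtostuff = timings.bson / timings.protostuff;
const vsProtobuf = timings.bson / timings.protobuf;

console.log(vsProtostuff.toFixed(2), vsProtobuf.toFixed(2)); // 13.58 6.89
```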

Take a look at MessagePack. It's compatible with JSON. From the docs:

Fast and Compact Serialization

MessagePack is a binary-based efficient object serialization library. It enables to exchange structured objects between many languages like JSON. But unlike JSON, it is very fast and small.

Typical small integer (like flags or error code) is saved only in 1 byte, and typical short string only needs 1 byte except the length of the string itself. [1,2,3] (3 elements array) is serialized in 4 bytes using MessagePack as follows: 93 01 02 03 (hexadecimal).
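
A minimal sketch of how those four bytes arise, covering only two MessagePack types: "fixarray" (arrays of up to 15 elements) and "positive fixint" (integers 0 to 127). The function name `encodeSmallIntArray` is hypothetical, and this is an illustration of the byte layout, not a general encoder.

```javascript
// Encode an array of at most 15 integers in the range 0..127 using only
// MessagePack's "fixarray" and "positive fixint" types.
function encodeSmallIntArray(values) {
  if (values.length > 15) {
    throw new RangeError('fixarray holds at most 15 elements');
  }
  const out = [0x90 | values.length]; // fixarray header: 1001xxxx, xxxx = length
  for (const v of values) {
    if (!Number.isInteger(v) || v < 0 || v > 127) {
      throw new RangeError('positive fixint covers 0..127 only');
    }
    out.push(v); // positive fixint: the value is its own single byte
  }
  return Buffer.from(out);
}

console.log(encodeSmallIntArray([1, 2, 3]).toString('hex')); // 93010203
```

For comparison, `JSON.stringify([1, 2, 3])` produces the 7-character string `[1,2,3]`.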

If you are more interested in deserialization speed, take a look at the JBB (JavaScript Binary Bundles) library. It is faster than BSON or MsgPack.

From the wiki page JBB vs BSON vs MsgPack:

...

  • JBB is about 70% faster than Binary-JSON (BSON) and about 30% faster than MsgPack on decoding speed, even with one negative test-case (#3).
  • JBB creates files that (even their compressed versions) are about 61% smaller than Binary-JSON (BSON) and about 55% smaller than MsgPack.

...

Unfortunately, it's not a streaming format, meaning that you must pre-process your data offline. However, there is a plan to convert it into a streaming format (check the milestones).
