Fastest way to parse JSON from String when format is known

I want to parse a String into an internal JSON object (or equivalent) in Java. The usual libraries, Gson and Jackson, are way too slow for my needs (> 100 µs for each String-to-JSON parse, according to my benchmarks). I know there are slightly faster libraries, but judging by the benchmarks online, the available gains are small (less than an order of magnitude).

If I know the format of the JSON in advance, is there a way I can parse it much faster? For example, I know the String will be JSON of the format:

{
   "A" : 1.0 ,
   "B" : "X"
}

That is, I know the two keys will be "A" and "B", and the values will be a double and a string, respectively. Given this advance knowledge of the format, is there a library or some approach to parse the JSON much faster than usual?

If you know the JSON payload structure, you can use a streaming API to read the data. I created 4 different methods to read the given JSON payload:

  1. Default Gson - uses the Gson class.
  2. Gson Adapter - uses JsonReader from the Gson library.
  3. Default Jackson - uses ObjectMapper from Jackson.
  4. Jackson streaming API - uses the JsonParser class.

To make them comparable, all these methods take the JSON payload as a String and return a Pojo object that holds the A and B properties. The graph below shows the differences: [graph: run time of the four approaches]

As you can see, Jackson's streaming API is the fastest of these 4 approaches at deserialising the JSON payload.
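If the layout really is fixed, one could in principle go a step further than any of the four approaches and skip general JSON parsing altogether. The following is only a minimal sketch of that idea and was not part of the benchmark here. `FixedFormatParser` and `ParsedPair` are hypothetical names, and the code assumes that the keys always arrive in the order A, B and that the value of B contains no escaped characters such as `\"`.

```java
/** Hypothetical holder for the two known fields; plays the role of Pojo. */
final class ParsedPair {
    final double a;
    final String b;

    ParsedPair(double a, String b) {
        this.a = a;
        this.b = b;
    }
}

final class FixedFormatParser {

    /**
     * Parses {"A": <number>, "B": "<string>"} without a JSON library.
     * Assumptions: keys appear in the order A, B, and the value of B
     * contains no escaped quotes or backslashes. Nothing else is validated.
     */
    static ParsedPair parse(String json) {
        // Value of A: the text between the first ':' and the first ','.
        int colonA = json.indexOf(':');
        int comma = json.indexOf(',', colonA + 1);
        if (colonA < 0 || comma < 0) {
            throw new IllegalArgumentException("Unexpected layout: " + json);
        }
        // Double.parseDouble ignores leading and trailing whitespace.
        double a = Double.parseDouble(json.substring(colonA + 1, comma));

        // Value of B: the quoted text after the second ':'.
        int colonB = json.indexOf(':', comma + 1);
        if (colonB < 0) {
            throw new IllegalArgumentException("Unexpected layout: " + json);
        }
        int open = json.indexOf('"', colonB + 1);
        int close = json.indexOf('"', open + 1);
        if (open < 0 || close < 0) {
            throw new IllegalArgumentException("Unexpected layout: " + json);
        }
        return new ParsedPair(a, json.substring(open + 1, close));
    }
}
```

Whether this actually beats Jackson's streaming parser would have to be measured. It also gives up almost all validation, so malformed input may be accepted silently.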

To generate the graph above, the following data were used. Each row lists 10 runs in milliseconds (one run is 1,000,000 parses), followed by the average, in the same order as the 4 methods:

Default Gson:           1113 547 540 546 544 552 547 549 547 548  avg 603.3
Gson Adapter:            940 455 452 456 465 459 457 458 455 455  avg 505.2
Default Jackson:         422 266 257 262 260 267 259 262 257 259  avg 277.1
Jackson streaming API:   202 186 184 189 185 188 182 186 187 183  avg 187.2
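Since each run in the benchmark performs 1,000,000 parses, a run time in milliseconds converts directly to nanoseconds per parse: 603.3 ms is roughly 0.6 µs per parse and 187.2 ms is roughly 0.19 µs. The first run of every row is also noticeably slower than the rest, which is probably JVM warm-up. The small sketch below redoes that arithmetic for the first and last rows (Default Gson and Jackson streaming API, in the benchmark's order). `BenchmarkStats` is a hypothetical helper, not part of the benchmark code.

```java
import java.util.Arrays;

final class BenchmarkStats {

    // Milliseconds per run, copied from the first and last data rows above.
    static final long[] DEFAULT_GSON_MS = {1113, 547, 540, 546, 544, 552, 547, 549, 547, 548};
    static final long[] JACKSON_STREAMING_MS = {202, 186, 184, 189, 185, 188, 182, 186, 187, 183};

    // Each run in the benchmark parses the payload this many times (MAX).
    static final int PARSES_PER_RUN = 1_000_000;

    /** Mean run time in ms, ignoring the first {@code skip} runs. */
    static double mean(long[] runsMs, int skip) {
        return Arrays.stream(runsMs).skip(skip).average().orElse(Double.NaN);
    }

    /** Converts the duration of one run (ms) to nanoseconds per single parse. */
    static double nanosPerParse(double runMs) {
        return runMs * 1_000_000.0 / PARSES_PER_RUN;
    }

    public static void main(String[] args) {
        for (long[] row : new long[][] {DEFAULT_GSON_MS, JACKSON_STREAMING_MS}) {
            double all = mean(row, 0);
            double withoutFirst = mean(row, 1); // drops the slow first run
            System.out.println("all runs: " + all + " ms, without first run: "
                    + withoutFirst + " ms, per parse: " + nanosPerParse(all) + " ns");
        }
    }
}
```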

Benchmark code:

import com.fasterxml.jackson.annotation.JsonAutoDetect;
import com.fasterxml.jackson.annotation.PropertyAccessor;
import com.fasterxml.jackson.core.JsonFactory;
import com.fasterxml.jackson.core.JsonParser;
import com.fasterxml.jackson.databind.ObjectMapper;
import com.google.gson.Gson;
import com.google.gson.TypeAdapter;
import com.google.gson.stream.JsonReader;
import com.google.gson.stream.JsonWriter;

import java.io.IOException;
import java.time.Duration;
import java.util.ArrayList;
import java.util.List;
import java.util.stream.IntStream;

public class JsonApp {

    private static final String json = "{\"A\" : 1.0 ,\"B\" : \"X\"}";

    private static final int MAX = 1_000_000;

    private static List<List<Duration>> values = new ArrayList<>();

    static {
        IntStream.range(0, 4).forEach(i -> values.add(new ArrayList<>()));
    }

    public static void main(String[] args) throws Exception {
        for (int i = 0; i < 10; i++) {
            int v = 0;
            values.get(v++).add(defaultGson());
            values.get(v++).add(gsonAdapter());
            values.get(v++).add(defaultJackson());
            values.get(v).add(jacksonJsonFactory());
        }
        values.forEach(list -> {
            list.forEach(d -> System.out.print(d.toMillis() + " "));
            System.out.println(" avg " + list.stream()
                    .mapToLong(Duration::toMillis)
                    .average().getAsDouble());
        });
    }

    static Duration defaultGson() {
        Gson gson = new Gson();

        long start = System.nanoTime();
        for (int i = MAX; i > 0; i--) {
            gson.fromJson(json, Pojo.class);
        }

        return Duration.ofNanos(System.nanoTime() - start);
    }

    static Duration gsonAdapter() throws IOException {
        PojoTypeAdapter adapter = new PojoTypeAdapter();

        long start = System.nanoTime();
        for (int i = MAX; i > 0; i--) {
            adapter.fromJson(json);
        }

        return Duration.ofNanos(System.nanoTime() - start);
    }

    static Duration defaultJackson() throws IOException {
        ObjectMapper mapper = new ObjectMapper();
        mapper.setVisibility(PropertyAccessor.FIELD, JsonAutoDetect.Visibility.ANY);

        long start = System.nanoTime();
        for (int i = MAX; i > 0; i--) {
            mapper.readValue(json, Pojo.class);
        }

        return Duration.ofNanos(System.nanoTime() - start);
    }

    static Duration jacksonJsonFactory() throws IOException {
        JsonFactory jfactory = new JsonFactory();

        long start = System.nanoTime();
        for (int i = MAX; i > 0; i--) {
            readPartially(jfactory);
        }
        return Duration.ofNanos(System.nanoTime() - start);
    }

    static Pojo readPartially(JsonFactory jfactory) throws IOException {
        try (JsonParser parser = jfactory.createParser(json)) {

            Pojo pojo = new Pojo();

            parser.nextToken(); // START_OBJECT - {
            parser.nextToken(); // FIELD_NAME "A"
            parser.nextToken(); // VALUE_NUMBER_FLOAT - value of A
            pojo.A = parser.getDoubleValue();
            parser.nextToken(); // FIELD_NAME "B"
            parser.nextToken(); // VALUE_STRING - value of B
            pojo.B = parser.getValueAsString();

            return pojo;
        }
    }
}

class PojoTypeAdapter extends TypeAdapter<Pojo> {

    @Override
    public void write(JsonWriter out, Pojo value) {
        throw new IllegalStateException("Implement me!");
    }

    @Override
    public Pojo read(JsonReader in) throws IOException {
        if (in.peek() == com.google.gson.stream.JsonToken.NULL) {
            in.nextNull();
            return null;
        }

        Pojo pojo = new Pojo();

        in.beginObject();
        in.nextName();
        pojo.A = in.nextDouble();
        in.nextName();
        pojo.B = in.nextString();

        return pojo;
    }
}

class Pojo {

    double A;
    String B;

    @Override
    public String toString() {
        return "Pojo{" +
                "A=" + A +
                ", B='" + B + '\'' +
                '}';
    }
}

Note: if you need really precise data, try creating the benchmark tests with the excellent JMH package.
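Two of the things JMH takes care of are warm-up iterations and keeping the JIT from discarding the measured work as dead code. The sketch below imitates just those two aspects with the standard library, to show why a plain timed loop can mislead. It is not a replacement for JMH. `WarmupTimer` is a hypothetical helper, and the `Double.parseDouble` workload is only a placeholder for one of the parsing methods.

```java
import java.util.function.Supplier;

final class WarmupTimer {

    // Every result is written here so the JIT cannot treat the work as unused.
    static volatile Object sink;

    /**
     * Returns the average nanoseconds per call of {@code task}, measured
     * after {@code warmupCalls} untimed calls have been executed.
     */
    static double nanosPerCall(Supplier<Object> task, int warmupCalls, int measuredCalls) {
        for (int i = 0; i < warmupCalls; i++) {
            sink = task.get(); // untimed: gives the JIT a chance to compile the code
        }
        long start = System.nanoTime();
        for (int i = 0; i < measuredCalls; i++) {
            sink = task.get();
        }
        return (System.nanoTime() - start) / (double) measuredCalls;
    }

    public static void main(String[] args) {
        // Placeholder workload; swap in a call to one of the parsing methods.
        double ns = nanosPerCall(() -> Double.parseDouble("1.0"), 1_000_000, 1_000_000);
        System.out.println(ns + " ns per call");
    }
}
```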

You can try BSON. BSON is a binary format, and its Java library can run faster than most JSON libraries. Document.parse accepts a JSON string:

 import org.bson.Document;

 // Parse the JSON text into a BSON Document, then read the two known keys.
 Document root = Document.parse("{ \"A\" : 1.0, \"B\" : \"X\" }");

 System.out.println(root.getDouble("A"));
 System.out.println(root.getString("B"));
