Parsing Huge JSON with Jackson

Consider a huge JSON document with a structure like this:

{"text": "very HUGE text here.."}

I am storing this JSON as an ObjectNode object called, say, json.

Now I try to extract this text from the ObjectNode.

String text = json.get("text").asText();

This JSON can be around 4-5 MB in size. When I run this code, I don't get a result (the program keeps executing forever).

The above method works fine for small and normal-sized strings. Is there a better practice for extracting huge data from JSON?

Tested with Jackson (FasterXML): a 7 MB JSON node can be parsed in about 200 milliseconds.

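    // Requires jackson-databind on the classpath (ObjectMapper, TypeReference).
    ObjectMapper objectMapper = new ObjectMapper();
    InputStream is = getClass().getResourceAsStream("/test.json");
    long begin = System.currentTimeMillis();
    // A TypeReference keeps the generic type, avoiding the unchecked HashMap.class conversion.
    Map<String, String> obj = objectMapper.readValue(is, new TypeReference<Map<String, String>>() {});
    long end = System.currentTimeMillis();
    // Print the length of the parsed string and the elapsed time in milliseconds.
    System.out.println(obj.get("value").length() + "\t" + (end - begin));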

The output is: 7888888 168 (string length, then elapsed milliseconds).

Maybe try upgrading your Jackson version?

Perhaps your default heap size is too small: if the input is 5 MB of UTF-8 encoded text, the Java String for it will usually need about 10 MB of memory (a char is 16 bits, while most UTF-8 for English characters is a single byte). There isn't much you can do about this, regardless of JSON library, if the value has to be handled as a Java String; you need enough memory for the value and for the rest of the processing. Further, since the Java heap is divided into different generations, 64 MB may or may not work: because the 10 MB needs to be contiguous, it probably gets allocated in the old generation.
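The byte-versus-char arithmetic above can be checked with a minimal sketch using only the JDK. The class name `StringFootprint` and the 5 MB ASCII sample are assumptions made for illustration; the figures describe character data at 16 bits per char, and newer JVMs (JDK 9 and later) may store Latin-1-only strings more compactly.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class StringFootprint {
    // Bytes needed to hold the text as UTF-8 (roughly its size inside the JSON file).
    public static long utf8Bytes(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    // Bytes needed for the character data when every char takes 16 bits.
    public static long utf16Bytes(String s) {
        return (long) s.length() * Character.BYTES;
    }

    public static void main(String[] args) {
        // Hypothetical sample: 5 MB of ASCII standing in for the "text" value.
        char[] chars = new char[5 * 1024 * 1024];
        Arrays.fill(chars, 'a');
        String sample = new String(chars);

        System.out.println("UTF-8 bytes:  " + utf8Bytes(sample));
        System.out.println("16-bit chars: " + utf16Bytes(sample));
    }
}
```

For pure ASCII input the second figure is exactly twice the first, which is where the "5 MB in, about 10 MB as a String" estimate comes from.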

So: try with a bigger heap size and see how much you need.
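One way to see how much heap the JVM actually has is to ask `Runtime`. The sketch below is illustrative only: the class name `HeapHeadroom` and the 10 MB threshold are assumptions, and the estimate ignores fragmentation, so it cannot tell whether a contiguous 10 MB block is available.

```java
public class HeapHeadroom {
    // Upper bound on the heap the JVM will attempt to use (set with -Xmx).
    public static long maxHeapBytes() {
        return Runtime.getRuntime().maxMemory();
    }

    // Rough estimate of heap still available: the maximum minus what is in use now.
    public static long availableBytes() {
        Runtime rt = Runtime.getRuntime();
        long used = rt.totalMemory() - rt.freeMemory();
        return rt.maxMemory() - used;
    }

    // True if the estimate leaves room for an allocation of the given size.
    public static boolean hasHeadroom(long requiredBytes) {
        return availableBytes() >= requiredBytes;
    }

    public static void main(String[] args) {
        long needed = 10L * 1024 * 1024; // the ~10 MB figure discussed above
        System.out.println("max heap (bytes): " + maxHeapBytes());
        System.out.println("room for ~10 MB:  " + hasHeadroom(needed));
    }
}
```

Running it as `java -Xmx256m HeapHeadroom` (256m is an arbitrary example value) and then with other `-Xmx` settings shows how the reported maximum changes.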
