Fast & Efficient Way To Read Large JSON Files Line By Line in Java

I have 100 million records in a JSON file and need a fast, efficient way to read the array of arrays from it in Java.

The JSON file looks like:

[["XYZ",...,"ABC"],["XYZ",...,"ABC"],["XYZ",...,"ABC"],...,["XYZ",...,"ABC"],
 ["XYZ",...,"ABC"],["XYZ",...,"ABC"],["XYZ",...,"ABC"],...,["XYZ",...,"ABC"],
 ...
 ...
 ...
 ,["XYZ",...,"ABC"],["XYZ",...,"ABC"],["XYZ",...,"ABC"]]

I want to read this JSON file line by line as:

read first:

["XYZ",...,"ABC"]

then:

["XYZ",...,"ABC"]

and so on:

...
...
...
["XYZ",...,"ABC"]

How do I read a JSON file like this? I know it does not look exactly like a typical JSON file, but I need to read it in this format; it is saved with a .JSON extension.

You can use the JSON Processing API (JSR 353) to process your data in a streaming fashion:

import java.io.FileReader;
import java.util.ArrayList;
import java.util.List;
import javax.json.Json;
import javax.json.stream.JsonParser;

...

String dataPath = "data.json";

try(JsonParser parser = Json.createParser(new FileReader(dataPath))) {
     List<String> row = new ArrayList<>();

     while(parser.hasNext()) {
         JsonParser.Event event = parser.next();
         switch(event) {
             case START_ARRAY:
                 continue;
             case VALUE_STRING:
                 row.add(parser.getString());
                 break;
             case END_ARRAY:
                 if(!row.isEmpty()) {
                     //Do something with the current row of data 
                     System.out.println(row);

                     //Reset it (prepare for the new row) 
                     row.clear();
                 }
                 break;
             default:
                 throw new IllegalStateException("Unexpected JSON event: " + event);
         }
     }
}

Please take a look at the Jackson Streaming API.

I guess you are looking for something like this - https://www.ngdata.com/parsing-a-large-json-file-efficiently-and-easily/

and this - https://stackoverflow.com/a/24838392/814304

The main thing: if you have a big file, you need to read and process it lazily, piece by piece.
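
To make "lazily, piece by piece" concrete without depending on any particular library, here is a minimal sketch using only the JDK. It is a hand-rolled scanner, not the Jackson API, and the names LazyRowReader and readRows are made up for this illustration. It assumes the file is a well-formed array of arrays of strings, as in the question; it does no validation, and non-string values (numbers, true, null) are silently skipped. For real data a streaming parser such as Jackson or JSR 353 is the safer choice.

```java
import java.io.IOException;
import java.io.Reader;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical helper, for illustration only: not part of any library.
class LazyRowReader {

    /**
     * Reads a JSON array of string arrays one character at a time and hands
     * each inner array to the callback as soon as its closing bracket is seen.
     * Only the current row is held in memory.
     */
    static void readRows(Reader in, Consumer<List<String>> onRow) throws IOException {
        int depth = 0;                          // 1 = outer array, 2 = inside a row
        List<String> row = new ArrayList<>();
        int c;
        while ((c = in.read()) != -1) {
            if (c == '[') {
                depth++;
            } else if (c == ']') {
                if (depth == 2) {               // an inner array just ended
                    onRow.accept(row);
                    row = new ArrayList<>();    // start a fresh row
                }
                depth--;
            } else if (c == '"') {
                String value = readString(in);  // consumes up to the closing quote
                if (depth == 2) {
                    row.add(value);
                }
            }
            // Commas, whitespace and anything else are skipped (assumption:
            // the input is well-formed and contains only strings).
        }
    }

    /** Reads the rest of a JSON string; the opening quote is already consumed. */
    private static String readString(Reader in) throws IOException {
        StringBuilder sb = new StringBuilder();
        int c;
        while ((c = in.read()) != -1 && c != '"') {
            if (c != '\\') {
                sb.append((char) c);
                continue;
            }
            int e = in.read();                  // character after the backslash
            switch (e) {
                case 'n': sb.append('\n'); break;
                case 't': sb.append('\t'); break;
                case 'r': sb.append('\r'); break;
                case 'b': sb.append('\b'); break;
                case 'f': sb.append('\f'); break;
                case 'u': {                     // four hex digits follow
                    int code = 0;
                    for (int i = 0; i < 4; i++) {
                        code = code * 16 + Character.digit(in.read(), 16);
                    }
                    sb.append((char) code);
                    break;
                }
                default: sb.append((char) e);   // covers \" \\ and \/
            }
        }
        return sb.toString();
    }
}
```

To use it on a file, wrap the file in a buffered reader so the per-character reads stay cheap, for example LazyRowReader.readRows(new BufferedReader(new FileReader("data.json")), row -> System.out.println(row)); the callback is invoked once per inner array, so memory use does not grow with the size of the file.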

You can use JsonSurfer to extract each inner JSON array with the JsonPath $[*]:

    JsonSurfer surfer = JsonSurferJackson.INSTANCE;
    surfer.configBuilder().bind("$[*]", new JsonPathListener() {
        @Override
        public void onValue(Object value, ParsingContext context) {
            System.out.println(value);
        }
    }).buildAndSurf(json);

It won't load the entire JSON into memory. The inner JSON arrays are processed one by one.

