
How to get the data from a specific index in a JSON file in Java

JSONParser parses all of the JSON objects in a given file, but I want to parse only the objects from the 100th index to the end of the file.

I can do this afterwards using subList, but if my JSON file contains 1 million objects I don't want to parse all of them first, because that would hurt efficiency.

import java.io.FileReader;
import java.io.IOException;

import org.json.simple.JSONArray;
import org.json.simple.JSONObject;
import org.json.simple.parser.JSONParser;
import org.json.simple.parser.ParseException;

public static void readJsonFile() {

    JSONParser parser = new JSONParser();

    try {
        JSONArray a = (JSONArray) parser.parse(new FileReader("D:\\2018-4-21.json"));

        for (Object o : a.subList(100, a.size())) {
            JSONObject checkIn = (JSONObject) o;

            String userId = (String) checkIn.get("UserID");
            System.out.print(userId);

            String inout = (String) checkIn.get("INOUT");
            System.out.print("   " + inout);

            String swippedDateTime = (String) checkIn.get("SwippedDateTime");
            System.out.print("   " + swippedDateTime);

            System.out.println();
        }
    } catch (IOException | ParseException e) {
        e.printStackTrace();
    }
}

My JSON file:

[
    {
        "UserID": "2",
        "INOUT": null,
        "SwippedDateTime": "2018-4-23 22:25"
    },
    {
        "UserID": "2",
        "INOUT": null,
        "SwippedDateTime": "2018-4-23 22:36"
    },
    {
        "UserID": "2",
        "INOUT": null,
        "SwippedDateTime": "2018-4-23 22:36"
    },
    {
        "UserID": "2",
        "INOUT": null,
        "SwippedDateTime": "2018-4-23 22:36"
    },
    {
        "UserID": "2",
        "INOUT": null,
        "SwippedDateTime": "2018-4-23 22:38"
    },
    {
        "UserID": "2",
        "INOUT": null,
        "SwippedDateTime": "2018-4-23 22:38"
    },
    {
        "UserID": "2",
        "INOUT": null,
        "SwippedDateTime": "2018-4-23 22:38"
    },
    {
        "UserID": "2",
        "INOUT": null,
        "SwippedDateTime": "2018-4-23 22:39"
    },
    {
        "UserID": "2",
        "INOUT": null,
        "SwippedDateTime": "2018-4-23 22:39"
    },
    {
        "UserID": "2",
        "INOUT": null,
        "SwippedDateTime": "2018-4-23 22:39"
    },
    {
        "UserID": "2",
        "INOUT": null,
        "SwippedDateTime": "2018-4-23 22:42"
    },
    {
        "UserID": "2",
        "INOUT": null,
        "SwippedDateTime": "2018-4-23 22:42"
    },
    {
        "UserID": "2",
        "INOUT": null,
        "SwippedDateTime": "2018-4-23 22:42"
    },
    {
        "UserID": "2",
        "INOUT": null,
        "SwippedDateTime": "2018-4-23 22:42"
    },
    {
        "UserID": "2",
        "INOUT": null,
        "SwippedDateTime": "2018-4-23 22:42"
    },
    {
        "UserID": "2",
        "INOUT": null,
        "SwippedDateTime": "2018-4-23 22:42"
    },
    {
        "UserID": "2",
        "INOUT": null,
        "SwippedDateTime": "2018-4-23 22:49"
    }
]

The only way to locate index 100 is to parse everything up to index 100.

I think what you're really asking is how to do that without creating unnecessary objects in memory.

The answer to that can also help you handle files with millions of records without running out of memory:

Use a streaming parser.

With a streaming parser, you get the data as it is parsed, so you can quickly skip the first X records and then process the remaining records one at a time. That way you never have to keep more than one record in memory.

That means you can parse files of effectively unlimited size with a very small memory footprint.

Since you're using Gson, that means you need to use JsonReader instead of JsonParser.
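A minimal sketch of that approach with Gson's JsonReader (this assumes Gson is on the classpath; the helper method name and the small in-memory sample are illustrative, not from the question):

```java
import com.google.gson.stream.JsonReader;
import com.google.gson.stream.JsonToken;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

public class StreamFromIndex {
    // Streams a JSON array of objects, skipping the first `skip` records
    // without materializing them, and collects the UserID of each
    // remaining record.
    static List<String> userIdsFrom(Reader source, int skip) throws IOException {
        List<String> ids = new ArrayList<>();
        try (JsonReader reader = new JsonReader(source)) {
            reader.beginArray();
            int index = 0;
            while (reader.hasNext()) {
                if (index++ < skip) {
                    reader.skipValue();   // discards the record cheaply
                    continue;
                }
                reader.beginObject();
                while (reader.hasNext()) {
                    String name = reader.nextName();
                    if (reader.peek() == JsonToken.NULL) {
                        reader.nextNull();
                    } else if (name.equals("UserID")) {
                        ids.add(reader.nextString());
                    } else {
                        reader.skipValue();
                    }
                }
                reader.endObject();
            }
            reader.endArray();
        }
        return ids;
    }

    public static void main(String[] args) throws IOException {
        String json = "[{\"UserID\":\"1\",\"INOUT\":null},{\"UserID\":\"2\",\"INOUT\":null}]";
        System.out.println(userIdsFrom(new StringReader(json), 1)); // [2]
    }
}
```

For the real file, pass `new FileReader("D:\\2018-4-21.json")` and `100` instead of the StringReader; the reader pulls tokens lazily, so only one record is ever in memory.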

If you have 1,000,000 records, then memory usage is a concern.

The most efficient way to do this is to manually read past the first part of the file. In the case you have shown, all your records are the same size, so you could simply use InputStream.skip(). Of course, if String fields such as UserID can have different lengths, then this won't work.
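If every record really does occupy a fixed number of bytes, the skip becomes a single arithmetic step; a sketch under that assumption (the 4-byte "records" here are a toy stand-in, and a real record length would have to be measured from your own file):

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

public class FixedSkip {
    // Skips `count` fixed-size records, then returns the rest of the stream.
    // Only valid when every record serializes to exactly `recordBytes` bytes.
    static byte[] skipFixed(InputStream in, long recordBytes, long count) throws IOException {
        long toSkip = count * recordBytes;
        while (toSkip > 0) {
            long n = in.skip(toSkip);  // skip() may skip fewer bytes than requested
            if (n <= 0) break;
            toSkip -= n;
        }
        return in.readAllBytes();
    }

    public static void main(String[] args) throws IOException {
        byte[] data = "AAAABBBBCCCC".getBytes();  // three 4-byte "records"
        System.out.println(new String(skipFixed(new ByteArrayInputStream(data), 4, 2)));
        // CCCC
    }
}
```

Note the loop around skip(): the InputStream contract allows a single call to skip fewer bytes than asked for, so you must keep calling until the requested count is consumed.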

You could read the file character by character, counting (say) the commas, to determine when you've skipped 100 records.
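A sketch of that character-counting idea, using only the standard library (the method name is made up for illustration; counting closing braces at depth zero is slightly more robust than counting commas, but this still ignores braces inside string values, which is fine for this file and not for arbitrary JSON):

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

public class SkipByCounting {
    // Reads past the first `n` top-level objects of a JSON array by
    // tracking brace depth, and returns the remainder as a parseable array.
    static String skipObjects(Reader in, int n) throws IOException {
        int depth = 0, skipped = 0, c;
        while (skipped < n && (c = in.read()) != -1) {
            if (c == '{') depth++;
            else if (c == '}' && --depth == 0) skipped++;
        }
        // Discard the separating comma/whitespace up to the next object.
        StringBuilder rest = new StringBuilder("[");
        while ((c = in.read()) != -1) {
            if (c == '{') { rest.append('{'); break; }
            if (c == ']') return "[]";   // fewer than n + 1 records in the file
        }
        while ((c = in.read()) != -1) rest.append((char) c);
        return rest.toString();
    }

    public static void main(String[] args) throws IOException {
        String json = "[{\"UserID\":\"1\"},{\"UserID\":\"2\"},{\"UserID\":\"3\"}]";
        // Skip the first 2 records; what remains parses as a one-element array.
        System.out.println(skipObjects(new StringReader(json), 2));
        // [{"UserID":"3"}]
    }
}
```

The returned string (or a Reader over the remaining stream) can then be handed to any parser, since it is again a well-formed JSON array.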

After you've skipped the first part of the file, you should use a streaming parser to read the rest. Gson will do that: https://sites.google.com/site/gson/streaming

You can also use a streaming parser to efficiently skip the first part of your file.
