
Reading big files in Java using readFully

I have a big 10 GB file. If I read its whole contents using readFully() in Java, I get an OutOfMemoryError, so I decided to read the file in parts using the same readFully(). For this I need to pass the offset and length parameters to readFully(). The offset must be of type long or double so that it can point to different parts of the file, but readFully() accepts only an int offset. How can I read such a large file?

try {
    IOUtils.readFully(in, contents, minOffset, maxOffset);
    value.set(contents, 0, contents.length);
} finally {
    IOUtils.closeStream(in);
}

Can I use seek() to get to a specific position and then use readFully() from that position?

Use the class java.util.Scanner to run through the contents of the file and retrieve lines serially, one by one:

// Requires: java.io.FileInputStream, java.io.IOException, java.util.Scanner
// try-with-resources closes the Scanner first, then the stream, even on error
try (FileInputStream inputStream = new FileInputStream(path);
     Scanner sc = new Scanner(inputStream, "UTF-8")) {
    while (sc.hasNextLine()) {
        String line = sc.nextLine();
        // process the line here, e.g. System.out.println(line);
    }
    // note that Scanner suppresses exceptions, so rethrow the last one if any
    if (sc.ioException() != null) {
        throw sc.ioException();
    }
}

This solution iterates through all the lines in the file, allowing each line to be processed without keeping a reference to it, and therefore without holding the whole file in memory. For more details see this.
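On the seek() part of the question: the Scanner approach suits line-oriented text. For binary data, a chunked read along the lines the question describes might look like the sketch below. It assumes a plain local file opened with java.io.RandomAccessFile rather than the stream type used in the question. The class name ChunkedReader, the ChunkHandler callback, and the chunk size are hypothetical names chosen for illustration. In RandomAccessFile.readFully(byte[] b, int off, int len), the int offset indexes into the buffer, not the file. The file position is set separately with seek(long), so it can go past 2 GB.

```java
import java.io.IOException;
import java.io.RandomAccessFile;

public class ChunkedReader {

    /** Hypothetical callback for handling one chunk; not part of any library. */
    public interface ChunkHandler {
        void handle(byte[] buffer, int length) throws IOException;
    }

    /**
     * Reads the file at 'path' in chunks of at most 'chunkSize' bytes and
     * passes each chunk to 'handler'. Returns the total number of bytes read.
     */
    public static long readInChunks(String path, int chunkSize, ChunkHandler handler)
            throws IOException {
        byte[] buffer = new byte[chunkSize]; // reused for every chunk
        long position = 0;                   // file offset, a long
        try (RandomAccessFile file = new RandomAccessFile(path, "r")) {
            long fileLength = file.length();
            while (position < fileLength) {
                // seek() takes a long. It is redundant for a purely sequential
                // read, but shows how to jump to any position in the file.
                file.seek(position);
                int toRead = (int) Math.min(chunkSize, fileLength - position);
                // The 0 and toRead arguments index into 'buffer', not the file.
                file.readFully(buffer, 0, toRead);
                handler.handle(buffer, toRead);
                position += toRead;
            }
        }
        return position;
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 1) {
            System.err.println("usage: ChunkedReader <file>");
            return;
        }
        // 64 MB chunks: an arbitrary size chosen for this sketch
        long total = readInChunks(args[0], 64 * 1024 * 1024, (buf, len) -> {
            // process buf[0..len) here
        });
        System.out.println("Read " + total + " bytes");
    }
}
```

Only one chunk-sized buffer is allocated, so memory use stays bounded regardless of file size. The handler must finish with each chunk before returning, or copy it, because the buffer is overwritten on the next iteration.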


 