
Reading text file to string without huge memory consumption

I've tried to measure the performance of several approaches to reading a file into a String: NIO (slowest for reading a single file), a BufferedInputStream reading the file line by line (600 ms average per pass), and streaming through a FileReader into a fixed-size char array acting as a buffer (fastest).

The file was 95 MB of plain text in Windows .txt format. Converting chars to a String really is the bottleneck, but what I noticed is the HUGE memory consumption of this method: for 95 MB of lorem ipsum, it consumes up to 1 GB of RAM. I haven't found out why.

What I have tried, with no effect:

- Triggering the garbage collector by calling System.gc()
- Setting all the reference variables to null before the method ends (though they go out of scope anyway, since they are defined only within the method)

private void testCharStream() {
    File f = new File("c:/Downloads/test.txt");
    long oldTime = System.currentTimeMillis();
    char[] cbuf = new char[8192];
    StringBuilder builder = new StringBuilder();
    try {
        FileReader reader = new FileReader(f);
        int n;
        while ((n = reader.read(cbuf)) != -1) {
            builder.append(cbuf, 0, n); // append only the chars actually read
        }
        reader.close();
    } catch (IOException e) {
        e.printStackTrace();
    }
    long currentTime = System.currentTimeMillis();

    System.out.println(currentTime - oldTime);
}
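One plausible explanation for the 1 GB peak (a sketch of the mechanics, not stated in the post): Java stores text as UTF-16, so 95 MB of single-byte text becomes roughly 190 MB of chars; the StringBuilder grows by allocating a roughly doubled backing array and copying, so the old and new arrays coexist during each resize, and `toString()` copies the chars once more. A stdlib alternative that skips the intermediate builder entirely is to read all bytes and decode once (the file path and charset here are assumptions for illustration):

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class ReadWholeFile {
    // Reads the whole file into one String with a single decode step,
    // avoiding StringBuilder's repeated array doubling and final copy.
    static String readAll(Path path) throws IOException {
        byte[] bytes = Files.readAllBytes(path);          // one byte[] of exactly file size
        return new String(bytes, StandardCharsets.UTF_8); // one decode into one char[]
    }

    public static void main(String[] args) throws IOException {
        // Small self-contained demo with a temp file instead of a real 95 MB file.
        Path tmp = Files.createTempFile("demo", ".txt");
        Files.write(tmp, "lorem ipsum".getBytes(StandardCharsets.UTF_8));
        System.out.println(readAll(tmp));
        Files.delete(tmp);
    }
}
```

The transient peak is then bounded by one byte[] plus one char[] of the decoded text, rather than a chain of doubled builder arrays.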

Try Apache Commons IO: http://commons.apache.org/proper/commons-io/ . I haven't benchmarked it, but I assume the code is already well optimized.

I came up with a decent solution. Using the Apache Commons IO package, the memory peak was 777.1 MB (lowest 220 MB), and 710 ms on average were needed for the 95 MB text file to be read.

What I did was set the variable holding the StringBuilder reference to null at the end of the method and suggest that the garbage collector actually do its work (System.gc()). The memory peak is now 540 MB, barely more than half of the value previously achieved! Also, changing the buffer size to 1024 brings a 40 ms improvement per pass, from 490 ms down to 450 ms or even less. So my function needs only 63.4% of the Apache version's time to read the file, almost 40% less. Any ideas how to improve the performance even more?

Here is the function.

private void testCharStream() {
    File f = new File("c:/Downloads/test.txt");
    long oldTime = System.currentTimeMillis();
    char[] cbuf = new char[1024];
    StringBuilder builder = new StringBuilder();

    try {
        FileReader reader = new FileReader(f);
        int n;
        while ((n = reader.read(cbuf)) != -1) {
            builder.append(cbuf, 0, n); // append only the chars actually read
        }
        reader.close();
    } catch (IOException e) {
        e.printStackTrace();
    }
    long currentTime = System.currentTimeMillis();
    builder = null; // drop the reference so the builder becomes collectable
    System.gc();    // suggest a collection; the JVM may ignore this
    System.out.println(currentTime - oldTime);
}
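Much of the remaining memory headroom likely comes from StringBuilder resizing: each time its internal char[] fills, it allocates a roughly doubled array and copies, so old and new arrays coexist. Pre-sizing the builder removes every intermediate copy. A sketch, assuming the file is ASCII/UTF-8 text so the byte length is a safe upper bound on the char count:

```java
import java.io.File;
import java.io.FileReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;

public class PresizedRead {
    static String read(File f) throws IOException {
        // File length in bytes is an upper bound on the char count for ASCII text,
        // so the builder never has to grow and copy its backing array.
        StringBuilder builder = new StringBuilder((int) f.length());
        char[] cbuf = new char[1024];
        try (FileReader reader = new FileReader(f)) { // auto-closes, even on error
            int n;
            while ((n = reader.read(cbuf)) != -1) {
                builder.append(cbuf, 0, n); // append only the chars actually read
            }
        }
        return builder.toString();
    }

    public static void main(String[] args) throws IOException {
        File tmp = Files.createTempFile("demo", ".txt").toFile();
        Files.write(tmp.toPath(), "lorem ipsum".getBytes(StandardCharsets.UTF_8));
        System.out.println(read(tmp));
        tmp.delete();
    }
}
```

Note that a 95 MB file fits comfortably under the `int` capacity limit here; for larger files a different strategy is needed anyway.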

To get better performance you can use BufferedReader. This class lets you read the file line by line; rather than wasting time on many small unbuffered reads, it fills a larger internal buffer and performs the task much faster. It can read a 1 MB plain-text file in about half a second. Just use the following code.

File f = new File("File path");
StringBuilder builder = new StringBuilder();

try (BufferedReader br = new BufferedReader(new FileReader(f))) {
    String line;
    while ((line = br.readLine()) != null) {
        builder.append(line).append("\n"); // readLine() strips the line terminator
    }
} catch (IOException e) {
    e.printStackTrace();
}

You can check the time it takes to read the file with System.currentTimeMillis(), as you've already done.
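The same line-by-line idea can be written more compactly with the Stream API; this is an equivalent sketch (not from the answer), where try-with-resources guarantees the underlying reader is closed and `Collectors.joining` re-inserts the newlines that `readLine()`/`lines()` strip:

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class LinesToString {
    // Joins all lines of the file with "\n" into a single String.
    static String read(Path path) throws IOException {
        try (Stream<String> lines = Files.lines(path, StandardCharsets.UTF_8)) {
            return lines.collect(Collectors.joining("\n"));
        }
    }
}
```

One caveat of any line-based approach: the original line terminators (\r\n on Windows) are replaced by whatever separator you join with, so the result is not byte-identical to the file.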

Take a look at the link below about reading really big files (150 GB) with Java:

http://www.answerques.com/s1imeegPeQqU/reading-really-big-files-with-java
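For files that large, building one String is impossible (a Java String tops out at about 2 billion chars), so the file has to be processed in bounded chunks. A hypothetical sketch using memory-mapped windows over a FileChannel, assuming a single-byte charset (e.g. ASCII) so a chunk boundary can never split a character; multi-byte encodings would need a stateful CharsetDecoder instead:

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class MappedChunks {
    // Walks the file one mapped window at a time; memory use is bounded by
    // chunkSize regardless of the total file size.
    static long processChunks(Path path, int chunkSize) throws IOException {
        long totalChars = 0;
        try (FileChannel ch = FileChannel.open(path, StandardOpenOption.READ)) {
            long size = ch.size();
            for (long pos = 0; pos < size; pos += chunkSize) {
                long len = Math.min(chunkSize, size - pos);
                MappedByteBuffer buf = ch.map(FileChannel.MapMode.READ_ONLY, pos, len);
                String chunk = StandardCharsets.US_ASCII.decode(buf).toString();
                totalChars += chunk.length(); // replace with real per-chunk processing
            }
        }
        return totalChars;
    }
}
```

`processChunks` here just counts decoded chars to keep the sketch testable; in practice the per-chunk String would be fed to whatever processing the application needs.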
