
Java — reading from a file. Input stream vs. reader

In every Java implementation I see of reading from a file, I almost always see a file reader used to read line by line. My thought would be that this would be terribly inefficient because it requires a system call per line.

What I'd been doing instead is to use an input stream and grab the bytes directly. In my experiments, this is significantly faster. My test was a 1MB file.

    //Stream method
    try {
        Long startTime = new Date().getTime();

        InputStream is = new FileInputStream("test");
        byte[] b = new byte[is.available()];
        is.read(b);
        String text = new String(b);
        //System.out.println(text);

        Long endTime = new Date().getTime();
        System.out.println("Text length: " + text.length() + ", Total time: " + (endTime - startTime));

    }
    catch (Exception e) {
        e.printStackTrace();
    }

    //Reader method
    try {
        Long startTime = new Date().getTime();

        BufferedReader br = new BufferedReader(new FileReader("test"));
        String line = null;
        StringBuilder sb = new StringBuilder();
        while ((line = br.readLine()) != null) {
            sb.append(line);
            sb.append("\n");
        }
        String text = sb.toString();

        Long endTime = new Date().getTime();
        System.out.println("Text length: " + text.length() + ", Total time: " + (endTime - startTime));

    }
    catch (Exception e) {
        e.printStackTrace();
    }

This gives a result of:

Text length: 1054631, Total time: 9
Text length: 1034099, Total time: 22

So, why do people use readers instead of streams?

If I have a method that takes a text file and returns a String that contains all of the text, is it necessarily better to do it using a stream?

You are comparing apples to bananas. Reading one line at a time will be less efficient, even with a BufferedReader, than grabbing all of the data at once. Note that using available() is discouraged, because it is not accurate in all situations. I found this out myself when I started using cipher streams.

FileReader is generally used together with a BufferedReader because it frequently makes sense to read a file line by line, especially if the file has a well-defined record structure where each record corresponds to a line.

Also, FileReader can simplify some of the work of dealing with character encodings and conversions, as stated in the Javadocs:
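As a rough sketch of that record-per-line pattern, assuming a hypothetical comma-separated file (the class name, method name, and record format below are made up for illustration):

```java
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical example class; the comma-separated record format is an assumption.
class LineRecords {

    // Reads one record per line and splits each record into fields on commas.
    static List<String[]> readRecords(String fileName) throws IOException {
        List<String[]> records = new ArrayList<>();
        // try-with-resources closes the reader even if an exception is thrown
        try (BufferedReader br = new BufferedReader(new FileReader(fileName))) {
            String line;
            while ((line = br.readLine()) != null) {
                if (line.isEmpty()) {
                    continue; // skip blank lines
                }
                records.add(line.split(","));
            }
        }
        return records;
    }
}
```

Here each call to readLine() is served from the BufferedReader's in-memory buffer, so the underlying file is read in larger blocks rather than once per line.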

Convenience class for reading character files. The constructors of this class assume that the default character encoding and the default byte-buffer size are appropriate ... FileReader is meant for reading streams of characters.
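Because FileReader assumes the default encoding, a file written in a different encoding may be decoded incorrectly. One way around that, sketched here with a hypothetical helper class, is to wrap a FileInputStream in an InputStreamReader and name the charset explicitly:

```java
import java.io.BufferedReader;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.Charset;

// Hypothetical example class showing a Reader with an explicit charset.
class ExplicitCharset {

    // Returns the first line of the file, decoded with the given charset
    // instead of the platform default that FileReader would assume.
    static String firstLine(String fileName, Charset charset) throws IOException {
        try (BufferedReader br = new BufferedReader(
                new InputStreamReader(new FileInputStream(fileName), charset))) {
            return br.readLine();
        }
    }
}
```

For example, firstLine("test", StandardCharsets.UTF_8) decodes the file as UTF-8 regardless of the platform default.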

Try increasing the BufferedReader buffer size. For example:

BufferedReader br = new BufferedReader(new FileReader("test"),2000000);

With a suitable buffer size, the Reader version should get faster.

Also, in your Reader sample you spend time filling the StringBuilder. You only have to read the file line by line if you need to process individual lines. If you just need the whole text in a String, read larger chunks with public int read(char[] cbuf) and write them to a StringWriter initialized with a suitable size.

The choice between InputStream and Reader does not depend on performance. Generally you use a Reader when you read text data, because a Reader makes it easier to handle the charset.
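A minimal sketch of that chunked approach could look like the following. The class name, the 8192-char chunk size, and the cap on the initial StringWriter size are assumptions chosen for illustration. The file length in bytes is used only as a capacity hint, not as the exact character count:

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.StringWriter;
import java.nio.charset.Charset;

// Hypothetical example class; chunk size and size cap are arbitrary choices.
class ChunkedRead {

    // Reads the whole file into a String using read(char[]) in a loop.
    static String readAll(String fileName, Charset charset) throws IOException {
        File file = new File(fileName);
        // Capacity hint only: the writer grows automatically if this is too small.
        int initialSize = (int) Math.min(file.length(), 1 << 24);
        StringWriter out = new StringWriter(initialSize);
        char[] chunk = new char[8192];
        try (Reader reader = new InputStreamReader(new FileInputStream(file), charset)) {
            int n;
            // read() returns the number of chars actually read, or -1 at end of stream
            while ((n = reader.read(chunk)) != -1) {
                out.write(chunk, 0, n);
            }
        }
        return out.toString();
    }
}
```

Unlike the readLine() loop, this keeps the original line terminators, so a file with "\r\n" line endings yields the same number of characters as are in the file.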

Another point, about this code:

byte[] b = new byte[is.available()];
is.read(b);
String text = new String(b);

It is not correct. The documentation for available() says:

Note that while some implementations of InputStream will return the total number of bytes in the stream, many will not. It is never correct to use the return value of this method to allocate a buffer intended to hold all data in this stream.

So be careful: this needs to be fixed.
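One possible fix, sketched below with a hypothetical helper class, is to ignore available() entirely and keep calling read() until it returns -1. This also covers the case where a single read() call returns fewer bytes than the buffer can hold:

```java
import java.io.ByteArrayOutputStream;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.Charset;

// Hypothetical example class; the 8192-byte buffer size is an arbitrary choice.
class ReadFully {

    // Drains any InputStream without relying on available().
    static byte[] readAllBytes(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[8192];
        int n;
        // read() may return fewer bytes than requested; -1 signals end of stream
        while ((n = in.read(buffer)) != -1) {
            out.write(buffer, 0, n);
        }
        return out.toByteArray();
    }

    // Reads a whole file into a String, decoding with an explicit charset.
    static String readFile(String fileName, Charset charset) throws IOException {
        try (InputStream in = new FileInputStream(fileName)) {
            return new String(readAllBytes(in), charset);
        }
    }
}
```

For plain files on Java 7 or later, java.nio.file.Files.readAllBytes(path) does the same job in one call. The loop above is mainly useful for streams that are not files, such as the cipher streams mentioned earlier.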


 