
Java performance of byte[] vs. char[] for file stream

Original post 2011-08-15 03:59:47 9 3 java/io

I'm writing a program that reads a file through a custom 8 KB buffer and then searches that buffer for a keyword. Since Java provides two types of streams, character and byte, I've implemented this with both a byte[] and a char[] buffer.

I'm wondering which is faster. A char is 2 bytes, and when a Reader fills a char[] it has to convert the bytes into chars, which I think could make it slower than working with a byte[] alone.

3 answers

Using a byte array will be faster:

You skip the byte-to-character decoding step, which is at least a copy loop and possibly more, depending on the Charset used for decoding.
The byte array takes less space, which saves CPU cycles in GC and initialization.
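To make the decoding step concrete, here is a minimal sketch of the two read paths with an 8 KB buffer. The class and method names (`ReadPaths`, `countBytes`, `countChars`) are made up for illustration, and the input is an in-memory stream rather than a real file. It only counts units read; it is not a benchmark.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class ReadPaths {
    static final int BUFFER_SIZE = 8 * 1024; // 8 KB, as in the question

    // Byte path: bytes are copied into the buffer with no decoding.
    public static long countBytes(InputStream in) throws IOException {
        byte[] buf = new byte[BUFFER_SIZE];
        long total = 0;
        int n;
        while ((n = in.read(buf)) != -1) {
            total += n;
        }
        return total;
    }

    // Char path: InputStreamReader decodes the bytes into chars
    // using the given charset before they reach the buffer.
    public static long countChars(InputStream in, Charset cs) throws IOException {
        char[] buf = new char[BUFFER_SIZE];
        long total = 0;
        int n;
        try (Reader reader = new InputStreamReader(in, cs)) {
            while ((n = reader.read(buf)) != -1) {
                total += n;
            }
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        // 11 characters, two of which take 2 bytes each in UTF-8.
        byte[] data = "h\u00e9llo w\u00f6rld".getBytes(StandardCharsets.UTF_8);
        System.out.println(countBytes(new ByteArrayInputStream(data)));                         // prints 13
        System.out.println(countChars(new ByteArrayInputStream(data), StandardCharsets.UTF_8)); // prints 11
    }
}
```

The same input yields different counts on the two paths because the Reader has already turned multi-byte sequences into single chars, which is the extra work the byte path avoids.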

However:

Unless you are searching huge files, the difference is unlikely to be significant.
The byte array approach could fail if the input file is not encoded in an 8-bit character set. Even where it works (as it does for UTF-8 and UTF-16, provided the keyword is encoded the same way as the file), there are potential issues with matching characters that span buffer boundaries.

(The reason that byte-wise treatment works for UTF-8 and UTF-16 is that the encoding makes it easy to distinguish between the first unit (byte or short) and subsequent units of an encoded character.)
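One way to deal with matches that span buffer boundaries is to carry the last `keyword.length - 1` bytes of each chunk over to the front of the next one. The sketch below assumes the keyword has already been encoded with the same charset as the file; `ByteKeywordSearch`, `indexOf` and `find` are hypothetical names, and the inner search is a naive scan rather than anything optimized.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

public class ByteKeywordSearch {

    // Returns the byte offset of the first occurrence of keyword in the stream, or -1.
    public static long indexOf(InputStream in, byte[] keyword, int bufferSize) throws IOException {
        if (bufferSize <= 0) {
            throw new IllegalArgumentException("bufferSize must be positive");
        }
        if (keyword.length == 0) {
            return 0;
        }
        int overlap = keyword.length - 1;          // longest partial match a chunk can end with
        byte[] buf = new byte[bufferSize + overlap];
        int carried = 0;                           // bytes kept from the previous chunk
        long bufStart = 0;                         // stream offset of buf[0]
        int n;
        while ((n = in.read(buf, carried, bufferSize)) != -1) {
            int valid = carried + n;
            int hit = find(buf, valid, keyword);
            if (hit >= 0) {
                return bufStart + hit;
            }
            // Move the tail to the front so a match straddling two chunks is still seen.
            int keep = Math.min(overlap, valid);
            System.arraycopy(buf, valid - keep, buf, 0, keep);
            bufStart += valid - keep;
            carried = keep;
        }
        return -1;
    }

    // Naive scan of buf[0..valid) for keyword; returns the index or -1.
    static int find(byte[] buf, int valid, byte[] keyword) {
        outer:
        for (int i = 0; i + keyword.length <= valid; i++) {
            for (int j = 0; j < keyword.length; j++) {
                if (buf[i + j] != keyword[j]) {
                    continue outer;
                }
            }
            return i;
        }
        return -1;
    }

    public static void main(String[] args) throws IOException {
        byte[] data = "abcdefghij".getBytes(StandardCharsets.UTF_8);
        byte[] keyword = "efg".getBytes(StandardCharsets.UTF_8);
        // A 4-byte buffer forces "efg" to straddle the first two chunks.
        System.out.println(indexOf(new ByteArrayInputStream(data), keyword, 4)); // prints 4
    }
}
```

The returned value is a byte offset, not a character index; for multi-byte encodings the two differ.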

If it's a binary file you're reading, use a byte array.

If it's a text file and you're going to use the contents as strings later, use a char array.
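For the text case, a rough sketch of a char[]-based search might look like the following. It assumes the caller wraps the file in a Reader with an explicit charset; `CharKeywordSearch` and `contains` are illustrative names only. Because the data is already decoded, the keyword can be an ordinary String.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

public class CharKeywordSearch {

    // Returns true if keyword occurs anywhere in the text supplied by reader.
    public static boolean contains(Reader reader, String keyword, int bufferSize) throws IOException {
        if (bufferSize <= 0) {
            throw new IllegalArgumentException("bufferSize must be positive");
        }
        if (keyword.isEmpty()) {
            return true;
        }
        char[] buf = new char[bufferSize];
        String tail = "";   // last keyword.length() - 1 chars of the previous window
        int n;
        while ((n = reader.read(buf, 0, bufferSize)) != -1) {
            // Prepend the tail so a keyword split across two reads is still found.
            String window = tail + new String(buf, 0, n);
            if (window.contains(keyword)) {
                return true;
            }
            int keep = Math.min(keyword.length() - 1, window.length());
            tail = window.substring(window.length() - keep);
        }
        return false;
    }

    public static void main(String[] args) throws IOException {
        // A 4-char buffer forces "brown" to span several reads.
        System.out.println(contains(new StringReader("the quick brown fox"), "brown", 4));  // prints true
        System.out.println(contains(new StringReader("the quick brown fox"), "purple", 4)); // prints false
    }
}
```

For a real file, the Reader could be created with something like `new InputStreamReader(new FileInputStream(path), StandardCharsets.UTF_8)`, where the charset is whatever the file is actually encoded in.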

This Stack Overflow question, file-streaming-in-java, talks about streaming files efficiently in Java.

I particularly like this reference article.

On large files, working with bytes alone quickly pays off in speed, so if you can match the pattern directly on the bytes you could definitely gain a few precious cycles.

If your files are small, or you don't have many of them, it may not be worth the trouble.

Related questions

What's the difference Between Character Stream and Byte Stream in Java and Char vs Byte in C?

Java 8 performance VS. Java 7

java reader vs. stream

Performance: Java vs. Database

Java — reading from a file. Input stream vs. reader

Java - stream of bytes vs. stream of characters?

Byte Stream vs Character Stream in Java

Java RandomAccessFile vs. DataInputStream for byte operations

Java Syntax: byte f()[] vs. byte[] f()

Java : Char vs String byte size


The technical posts on this site are published under the CC BY-SA 4.0 license. If you reprint them, please cite the site URL or the original address. For any questions, please contact: yoyou2525@163.com.



3 answers

solution1
6 ACCEPTED 2011-08-15 04:11:24

solution2
1 2011-08-15 04:01:57

solution3
0 2011-08-15 04:27:42
