Why does GZIPInputStream take so long?

Question

System.out.println("Input String length : " + str.length());
System.out.println("SWB==="+sw.getTime());
byte[] bytes = Base64.decodeBase64(str);
System.out.println("SWB==="+sw.getTime());
GZIPInputStream gis = new GZIPInputStream(new ByteArrayInputStream(bytes));
BufferedReader bf = new BufferedReader(new InputStreamReader(gis));
String outStr = "";
String line;
while ((line=bf.readLine())!=null) {
     outStr += line;
}
System.out.println("SWB==="+sw.getTime());
System.out.println("Output String length : " + outStr.length());

The above code prints:

SWB===1
SWB===4
SWB===27052
Output String length : 1750825

But compressing the same string takes very little time (less than 100 ms). What am I doing wrong here? (Other than my bad debug output.)

Answer 1

The problem is this:

String line;
while ((line=bf.readLine())!=null) {
     outStr += line;
}

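Each String concatenation implicitly creates a new StringBuilder, copies both strings into it, and then calls toString() on it. Because outStr keeps growing, every iteration copies everything read so far again, so the total work grows quadratically with the size of the output.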
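To make the cost concrete, here is a rough model that only counts characters copied and ignores everything else. The class and method names (ConcatCost, copiesWithConcat, copiesWithBuilder) are made up for illustration, and the count for concatenation is a lower bound, since it counts only one copy of the combined string per iteration.

```java
class ConcatCost {
    /** Lower bound on characters copied by `outStr += line` over `lines` lines of `lineLen` chars. */
    static long copiesWithConcat(int lines, int lineLen) {
        long copied = 0;
        long current = 0; // length of the accumulated string so far
        for (int i = 0; i < lines; i++) {
            current += lineLen;
            copied += current; // every += builds a whole new string of the combined length
        }
        return copied;
    }

    /** Characters copied when appending to one StringBuilder (ignoring occasional buffer growth). */
    static long copiesWithBuilder(int lines, int lineLen) {
        return (long) lines * lineLen;
    }

    public static void main(String[] args) {
        // Hypothetical input: 10,000 lines of 100 characters each.
        System.out.println("concat : " + copiesWithConcat(10_000, 100));
        System.out.println("builder: " + copiesWithBuilder(10_000, 100));
    }
}
```

With these hypothetical numbers the concatenation loop copies about five billion characters, while the single builder copies one million.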

Use a single StringBuilder to drastically speed this up:

StringBuilder sb = new StringBuilder(65536); // Consider a large initial size
String line;
while ((line=bf.readLine())!=null) {
     sb.append(line);
}

// The result is now in the StringBuilder sb
String outStr = sb.toString();

A large initial StringBuilder capacity also reduces internal reallocations. In the example I used 65536 characters (64K), but if you know your result String will be much bigger, you can safely start with a capacity of several million characters.
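Put together, a self-contained version might look like the sketch below. It is not the asker's exact code: it assumes Java 8+ and uses java.util.Base64 instead of the Base64.decodeBase64 call from the question, it assumes the compressed text is UTF-8, and the names GzipDecode, decode, encode and expectedChars are invented for the example. The encode helper exists only so the round trip can be tried out.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

class GzipDecode {
    /** Decodes Base64, gunzips, and joins all lines using one pre-sized StringBuilder. */
    static String decode(String base64Gzip, int expectedChars) {
        byte[] bytes = Base64.getDecoder().decode(base64Gzip);
        StringBuilder sb = new StringBuilder(expectedChars);
        try (BufferedReader br = new BufferedReader(new InputStreamReader(
                new GZIPInputStream(new ByteArrayInputStream(bytes)),
                StandardCharsets.UTF_8))) { // assumption: the text is UTF-8
            String line;
            while ((line = br.readLine()) != null) {
                sb.append(line); // as in the original loop, line terminators are dropped
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sb.toString();
    }

    /** Helper for the demo only: gzips the text and Base64-encodes the result. */
    static String encode(String text) {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(bos)) {
            gz.write(text.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return Base64.getEncoder().encodeToString(bos.toByteArray());
    }

    public static void main(String[] args) {
        String encoded = encode("first line\nsecond line\n");
        System.out.println(decode(encoded, 65536));
    }
}
```

For an output of about 1.75 million characters, as in the question, passing something like 2 * 1024 * 1024 as expectedChars would avoid any regrowth of the buffer.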

Also consider not calling toString() on the result if you don't need a String. StringBuilder implements CharSequence, and many methods accept a CharSequence as well as a String.
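As a small illustration of that last point, the sketch below passes a StringBuilder directly to a few standard methods that take a CharSequence (Pattern.matcher, String.contains, String.join). The class name CharSequenceDemo and the helper containsDigits are hypothetical.

```java
import java.util.regex.Pattern;

class CharSequenceDemo {
    /** Accepts any CharSequence, so a StringBuilder works without calling toString(). */
    static boolean containsDigits(CharSequence text) {
        return Pattern.compile("\\d+").matcher(text).find();
    }

    public static void main(String[] args) {
        StringBuilder sb = new StringBuilder("line1").append("line2");

        // Pattern.matcher takes a CharSequence
        System.out.println(containsDigits(sb));

        // String.contains takes a CharSequence
        System.out.println("xx-line1line2-yy".contains(sb));

        // String.join takes CharSequence arguments
        System.out.println(String.join(", ", sb, "tail"));
    }
}
```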

solution1
0 ACCEPTED 2014-10-01 10:46:29
