Why is writing data directly with println() faster than accumulating it in a String and then writing it to a file?

Let's consider this scenario: I am reading a file, tweaking each line a bit, and then storing the data in a new file. I tried two ways to do it:

  1. Accumulating the data in a String and writing it to the target file at the end, like this:

      InputStream ips = new FileInputStream(file);
      InputStreamReader ipsr = new InputStreamReader(ips);
      BufferedReader br = new BufferedReader(ipsr);
      PrintWriter desFile = new PrintWriter(targetFilePath);
      String data = "";
      while ((line = br.readLine()) != null) {
          if (line.contains("_Stop_"))
              continue;
          String[] s = line.split(";");
          String newLine = s[2];
          for (int i = 3; i < s.length; i++) {
              newLine += "," + s[i];
          }
          data += newLine + "\\n";
      }
      desFile.write(data);
      desFile.close();
      br.close();

  2. Calling println() on the PrintWriter directly inside the while loop, as below:

      while ((line = br.readLine()) != null) {
          if (line.contains("_Stop_"))
              continue;
          String[] s = line.split(";");
          String newLine = s[2];
          for (int i = 3; i < s.length; i++) {
              newLine += "," + s[i];
          }
          desFile.println(newLine);
      }
      desFile.close();
      br.close();

The second approach is far faster than the first. What is happening differently in these two approaches that makes their execution times differ so much?

Appending to your string will:

  1. Allocate memory for a new string.
  2. Copy all the data accumulated so far into it.
  3. Copy the data from the string you are appending.

You repeat this process for every single line, meaning that for N lines of output, you copy O(N^2) bytes around.

Meanwhile, writing to your PrintWriter will:

  1. Copy data to the buffer.
  2. Occasionally flush the buffer.

Meaning that for N lines of output, you copy only O(N) bytes around.
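The difference in growth can be made concrete with a small counting sketch. This is only a simplified cost model, not a benchmark: it counts characters copied, assumes every line has the same length, and ignores the extra copy made when the buffer is flushed. The class and method names (CopyCost, concatCost, bufferedCost) are made up for illustration.

```java
/** Rough cost model for the two strategies; counts characters copied, not time. */
class CopyCost {

    /** Characters copied when every append builds a new String from the old contents plus the new line. */
    static long concatCost(int lines, int lineLength) {
        long copied = 0;
        long accumulated = 0;
        for (int i = 0; i < lines; i++) {
            accumulated += lineLength; // length of the String created by this append
            copied += accumulated;     // old contents and the new line are both copied into it
        }
        return copied;
    }

    /** Characters copied when every line is copied into the writer's buffer exactly once. */
    static long bufferedCost(int lines, int lineLength) {
        return (long) lines * lineLength;
    }

    public static void main(String[] args) {
        int lineLength = 50; // assumed line length, purely illustrative
        for (int lines : new int[] {1_000, 10_000, 100_000}) {
            System.out.printf("%,d lines: concat=%,d buffered=%,d%n",
                    lines, concatCost(lines, lineLength), bufferedCost(lines, lineLength));
        }
    }
}
```

Under this model the concatenation cost is lineLength * N * (N + 1) / 2, so multiplying the number of lines by 10 multiplies the copying by roughly 100, while the buffered cost grows only by 10.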

For one, you're creating an awful lot of new String objects by appending with +=, and that will slow things down.

Try appending using a StringBuilder sb declared outside of the loop and then calling desFile.write(sb.toString()); and see how that performs.
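A minimal sketch of that suggestion might look like the following. The class name StringBuilderRewrite and the method rewrite are hypothetical, and two details are assumptions rather than part of the question's code: lines with fewer than three fields are skipped instead of throwing, and each output line ends with the platform line separator.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;

class StringBuilderRewrite {

    /** Reads all lines, keeps fields from index 2 onward joined by commas, and returns the whole result. */
    static String rewrite(BufferedReader in) throws IOException {
        StringBuilder sb = new StringBuilder(); // created once, outside the loop
        String line;
        while ((line = in.readLine()) != null) {
            if (line.contains("_Stop_")) continue;
            String[] s = line.split(";");
            if (s.length < 3) continue; // assumption: skip lines that have no third field
            sb.append(s[2]);
            for (int i = 3; i < s.length; i++) {
                sb.append(',').append(s[i]); // appends into the same buffer, no new String per step
            }
            sb.append(System.lineSeparator()); // assumption: one platform line separator per output line
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        // Small in-memory input standing in for the real file.
        String sample = "a;b;c;d\nx;_Stop_;y;z\n1;2;3\n";
        System.out.print(rewrite(new BufferedReader(new StringReader(sample))));
    }
}
```

The returned string can then be passed to desFile.write(...) once, as in the first approach. The whole output still sits in memory, but the repeated copying of earlier lines is gone.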

First of all, the two processes aren't producing the same data, since the one that calls println will have line separator characters between the lines whereas the one that builds all the data up in a buffer and writes it all at once will not.

But the reason for the performance difference is probably the enormous number of String and StringBuilder objects you are generating and throwing away, the memory that needs to be allocated to hold the complete file contents in memory, and the time taken by the garbage collector.

If you're going to be doing a significant amount of string concatenation, especially in a loop, it is better to create a StringBuilder before the loop and use it to accumulate the results in the loop.

However, if you're going to be processing large files, it is probably better to write the output as you go. The memory requirements of your application will be lower, whereas if you build up the entire result in memory, the memory required will be at least the size of the output file.
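As a rough sketch of writing the output as you go, the loop below holds only one line in memory at a time. The names StreamingRewrite and stream are hypothetical, the input and output paths are taken from the command line as placeholders, and skipping lines with fewer than three fields is an assumption that the question's code does not make.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.PrintWriter;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Arrays;

class StreamingRewrite {

    /** Transforms one line at a time and hands it straight to the writer. */
    static void stream(BufferedReader in, PrintWriter out) throws IOException {
        String line;
        while ((line = in.readLine()) != null) {
            if (line.contains("_Stop_")) continue;
            String[] s = line.split(";");
            if (s.length < 3) continue; // assumption: skip lines that have no third field
            // Join fields 2..end with commas; println adds the platform line separator.
            out.println(String.join(",", Arrays.copyOfRange(s, 2, s.length)));
        }
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical usage: java StreamingRewrite <input file> <output file>
        try (BufferedReader in = Files.newBufferedReader(Paths.get(args[0]));
             PrintWriter out = new PrintWriter(Files.newBufferedWriter(Paths.get(args[1])))) {
            stream(in, out);
            // PrintWriter does not throw on write errors, so check explicitly.
            if (out.checkError()) throw new IOException("writing the output failed");
        }
    }
}
```

The try-with-resources block closes both files even if an exception is thrown, which the close() calls at the end of the original snippets would not do.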

