
Java: Writing a huge file using ByteArrayOutputStream

I am trying to write a file whose size ranges from 1 KB to 10 GB using ByteArrayOutputStream, but the exception below is thrown. I am using JDK 6. Please suggest a better, high-performance API. I am reading and writing on the same network box.

Exception in thread "main" java.lang.OutOfMemoryError:   Requested array size exceeds VM limit
        at java.util.Arrays.copyOf(Unknown Source)
        at java.io.ByteArrayOutputStream.grow(Unknown Source)
        at java.io.ByteArrayOutputStream.ensureCapacity(Unknown Source)
        at java.io.ByteArrayOutputStream.write(Unknown Source)
        at java.io.OutputStream.write(Unknown Source)
        at

Code:

import java.io.BufferedOutputStream;
import java.io.ByteArrayOutputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;


public class PrepareFile {

    /**
     * @param args
     * @throws Exception 
     */
    public static void main(String[] args) throws Exception {
        // TODO Auto-generated method stub

        new PrepareFile().constructFile("f:\\hello","f:\\output",10000000);

    }

    //Writes a large file of 10 GB using input file data of small size by duplicating
    public void constructFile(String fileName, String outPath, int multiplier) throws Exception {
        BufferedOutputStream fos = null;
        FileInputStream fis = null;

        final File inputFile = new File(fileName);
        String path = inputFile.getParent();
        if (outPath != null && !outPath.isEmpty()) {
            path = outPath;
        }

        fis = new FileInputStream(fileName);

        try {



            // read the transactions in the input file.
            byte[] txnData = new byte[(int) inputFile.length()];
            fis.read(txnData);

            final File outFile = new File(path, "Myfile");
            fos = new BufferedOutputStream(new FileOutputStream(outFile));
            final ByteArrayOutputStream baos = new ByteArrayOutputStream();
            final ByteArrayOutputStream baos1 = new ByteArrayOutputStream();

            //multiplier if input file size is 1 KB and output file is 10 GB, then multiplier value is (1024*1024)

            for (long i = 1; i <= multiplier; i++) {

                if(i >=40000 && i % 40000==0){
                    System.out.println("i value now: "+i);
                    baos.writeTo(fos);
                    baos.reset();
                    //baos.write(txnData);
                }

                // write transactions
                baos.write(txnData);
                baos1.write(txnData); //Exception is coming at this line
            }

            int Padding = myCustomMethod(baos1.toByteArray());

            // write all out data to the output stream
            baos.writeTo(fos);

            baos.flush();
            baos1.flush();
        } catch(Exception e){
            e.printStackTrace();
        }finally {
            fos.close();
            fis.close();
        }

    }

    public int myCustomMethod(byte[] b){

        //Need complete bytes to prepare the file trailer
        return 0;
    }


}

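You can't have a buffer of 2 GB or more in a ByteArrayOutputStream: it is backed by a single byte[], and Java array lengths are signed 32-bit ints, so the buffer is capped at Integer.MAX_VALUE bytes (slightly less on some JVMs).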
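To make the limit concrete, here is a small sketch of the arithmetic. It assumes a 1 KB input block and the multiplier of 10000000 passed in the question's main method; the class and method names are made up for illustration.

```java
class ArrayLimitDemo {

    /** Total bytes produced when a block of blockSize bytes is written multiplier times. */
    static long totalBytes(long blockSize, long multiplier) {
        return blockSize * multiplier; // long arithmetic, so no int overflow
    }

    /** True only if the total could fit in a single byte[] (lengths are signed 32-bit ints). */
    static boolean fitsInOneArray(long totalBytes) {
        return totalBytes <= Integer.MAX_VALUE;
    }

    public static void main(String[] args) {
        // Assumption: 1 KB input block, multiplier taken from the question's main().
        long total = totalBytes(1024, 10_000_000);
        System.out.println(total + " bytes, fits: " + fitsInOneArray(total));
        // prints "10240000000 bytes, fits: false"
    }
}
```

The total is roughly five times larger than the biggest array the JVM can allocate, which is why baos1 fails long before the loop ends.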

If you want performance I would process the file progressively and avoid such large memory copies as they are really expensive.

BTW, I have a library, Chronicle Bytes, which supports buffers larger than 2 GB. It can use native memory and can be mapped to files, which avoids the heap and allows buffers larger than main memory.
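The idea of mapping a buffer to a file can also be sketched with the standard JDK alone. The example below is not the Chronicle Bytes API; it uses FileChannel.map from java.nio, and because a single MappedByteBuffer is itself limited to Integer.MAX_VALUE bytes, it maps the file one window at a time. The class name, method name, and window parameter are assumptions made for this sketch.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;

class MappedRepeatWriter {

    /**
     * Writes block to the file repeats times through memory-mapped windows.
     * window is the size of each mapping in bytes and must be at most Integer.MAX_VALUE.
     */
    static void writeRepeated(String outFileName, byte[] block, long repeats, long window)
            throws IOException {
        long total = block.length * repeats; // promoted to long, so 10 GB is fine
        try (RandomAccessFile raf = new RandomAccessFile(outFileName, "rw");
             FileChannel channel = raf.getChannel()) {
            long pos = 0;
            int offsetInBlock = 0; // where in block the next copy resumes
            while (pos < total) {
                long size = Math.min(window, total - pos);
                // Mapping READ_WRITE beyond the current end extends the file.
                MappedByteBuffer map = channel.map(FileChannel.MapMode.READ_WRITE, pos, size);
                while (map.hasRemaining()) {
                    int n = Math.min(map.remaining(), block.length - offsetInBlock);
                    map.put(block, offsetInBlock, n);
                    offsetInBlock = (offsetInBlock + n) % block.length;
                }
                pos += size;
            }
        }
    }
}
```

A call such as writeRepeated("f:\\output\\Myfile", bytes, 10000000L, 1L << 30) would use 1 GiB windows; the data lives in the page cache and the file rather than on the Java heap.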

However, if you process the data progressively you won't need such a large buffer.
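As a rough illustration of progressive processing, the sketch below writes the block repeatedly and keeps a running CRC-32 of everything written, so a trailer value is available at the end without ever holding the full output in memory. The question does not say what the trailer contains, so the checksum is only a stand-in for whatever myCustomMethod computes; the class and method names are hypothetical.

```java
import java.io.BufferedOutputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.util.zip.CRC32;
import java.util.zip.CheckedOutputStream;

class ProgressiveTrailer {

    /** Writes block multiplier times and returns the CRC-32 of all bytes written. */
    static long writeRepeated(String outFileName, byte[] block, int multiplier)
            throws IOException {
        CRC32 crc = new CRC32(); // assumption: stand-in for the real trailer computation
        try (OutputStream out = new CheckedOutputStream(
                new BufferedOutputStream(new FileOutputStream(outFileName)), crc)) {
            for (int i = 0; i < multiplier; i++) {
                out.write(block); // the checksum is updated as each block passes through
            }
        }
        return crc.getValue();
    }
}
```

Memory use stays at the size of one input block plus the stream buffer, regardless of how large the output file becomes.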

I also suggest you use Java 8, as it performs 64-bit operations better than Java 6 (which was released ten years ago).


EDIT Based on your code, there is no need to use ByteArrayOutputStream and you can prepare the file progressively.

// Needs java.io.DataInputStream, File, FileInputStream, FileOutputStream and IOException.
// Writes a large file of 10 GB using input file data of small size by duplicating
public void constructFile(String fileName, String outFileName, int multiplier) throws IOException {
    // Size the array from the file length; available() is only an estimate.
    byte[] bytes = new byte[(int) new File(fileName).length()];
    try (DataInputStream in = new DataInputStream(new FileInputStream(fileName))) {
        in.readFully(bytes); // a plain read() may return before the array is full
    }

    try (FileOutputStream fos = new FileOutputStream(outFileName)) {
        for (int i = 0; i < multiplier; i++) {
            fos.write(bytes);
        }
    }

    // now process the file "outFileName"
    // how depends on what you are trying to do.
    // NOTE: It is entirely possible the file should be processed as it is written.
}
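If the file does have to be processed after it is written, it can be read back in fixed-size chunks instead of being loaded whole. The sketch below computes a SHA-256 digest as an example of such processing; the digest, the 64 KB buffer size, and the names are assumptions chosen for illustration, not something the question asks for.

```java
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

class ChunkedFileDigest {

    /** Reads the file in 64 KB chunks and returns its SHA-256 digest. */
    static byte[] sha256(String fileName) throws IOException, NoSuchAlgorithmException {
        MessageDigest digest = MessageDigest.getInstance("SHA-256");
        byte[] buffer = new byte[64 * 1024]; // the only buffer needed, whatever the file size
        try (InputStream in = new FileInputStream(fileName)) {
            int n;
            while ((n = in.read(buffer)) != -1) {
                digest.update(buffer, 0, n); // process just the bytes actually read
            }
        }
        return digest.digest();
    }
}
```

The same loop shape works for any computation that can be updated incrementally, such as counters, checksums, or record parsing.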

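Although extreme, you can build a "super" ByteArrayOutputStream that hides several ByteArrayOutputStreams inside it (the example below uses three, for a maximum capacity of about 6 GB). Note that each inner buffer is allocated at full size up front, so the heap must be large enough to hold all three: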

import java.io.IOException;
import java.io.OutputStream;

public class LargeByteArrayOutputOutputStream extends OutputStream {

    private DirectByteArrayOutputStream b1 = new DirectByteArrayOutputStream(Integer.MAX_VALUE -8);
    private DirectByteArrayOutputStream b2 = new DirectByteArrayOutputStream(Integer.MAX_VALUE -8);
    private DirectByteArrayOutputStream b3 = new DirectByteArrayOutputStream(Integer.MAX_VALUE -8);

    private long posWrite = 0;
    private long posRead = 0;
    @Override
    public void write(int b) throws IOException {
        if (posWrite < b1.getArray().length) {
            b1.write(b);
        } else if (posWrite < ((long)b1.getArray().length + (long)b2.getArray().length)) {
            b2.write(b);
        } else {
            b3.write(b);
        }
        
        posWrite++;

    }
    
    public long length() {
        return posWrite;
    }

    /** You will probably want to read the data back afterwards. */
    public int read() throws IOException
    {
        if (posRead >= posWrite) {
            return -1; // everything written has already been read
        } else {
            byte b = 0;
            if (posRead < b1.getArray().length) {
                b = b1.getArray()[(int)posRead];
            } else if (posRead < ((long)b1.getArray().length + (long)b2.getArray().length)) {
                b = b2.getArray()[(int)(posRead - b1.getArray().length)];
            } else {
                b = b3.getArray()[(int)(posRead - ((long)b1.getArray().length + (long)b2.getArray().length))];
            }

            posRead++;
            return b & 0xFF; // unsigned value, so a 0xFF byte is not mistaken for end of stream
        }
    }
}


public class DirectByteArrayOutputStream extends java.io.ByteArrayOutputStream {

    public DirectByteArrayOutputStream(int size) {
        super(size);
    }

    /**
     * Reference to the byte array that backs this buffer.
     */
    public byte[] getArray() {
        return buf;
    }
}

The technical posts on this site are licensed under CC BY-SA 4.0. If you reprint them, please cite the site URL or the original address. For any questions, contact yoyou2525@163.com.

 