Read a file as 1KB chunks using Java

I am trying to read a file into memory and split it into 1KB chunks.

The program reads a file (a video file) and splits it into 1KB chunks. It hashes the last chunk using SHA-256 and appends that hash to the second-to-last chunk. It then hashes the second-to-last chunk together with the appended hash and appends the result to the chunk before it. This continues back to the first chunk, which ends up with the hash of the second chunk appended to it.

I only need the hash of the first chunk together with its appended hash. I have tried to implement this in two ways, but I think both are wrong. Could someone please tell me where I am going wrong? I have been stuck on this for 6 days without a solution. Both implementations are pasted below. Any help would be appreciated.

In the first attempt, I read the entire file and tried to split the byte array into 1KB chunks manually.

package com.test;

import java.io.BufferedInputStream;
import java.io.FileInputStream;
import java.io.InputStream;
import java.security.MessageDigest;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ReadFileByteByByte {

    public static void main(String[] args) throws Exception {

        InputStream inStream = null;
        BufferedInputStream bis = null;

        try{
            inStream = new FileInputStream("C:\\a.mp4");

            bis = new BufferedInputStream(inStream);

            int numByte = bis.available();


            byte[] buf = new byte[numByte];
            bis.read(buf, 0, numByte);
            System.out.println(numByte/1024);
            ArrayList<byte[]> a = new ArrayList<>();
            ArrayList<byte[]> b = new ArrayList<>();
            for(int i=0,j=0;i<buf.length;i++,j++){
                byte[] buf2 = new byte[1057];
                buf2[j] = buf[i];
                if(i%1024==1023){
                    a.add(buf2);
                    j=0;
                }
            }

            for(int i=a.size()-1,j=-1;i>=0;i--,j++){
                MessageDigest digest = MessageDigest.getInstance("SHA-256");
                if(i==a.size()-1){
                    byte[] hash = digest.digest(a.get(i));
                    byte[] dest = new byte[a.get(i).length+hash.length];
                    System.arraycopy(a.get(i-1), 0, dest, 0, a.get(i-1).length);
                    System.arraycopy(hash, 0, dest, a.get(i-1).length, hash.length);
                    b.add(dest);
                }
                else{
                    byte[] hash = digest.digest(b.get(0));
                    if(i!=0){
                        byte[] dest = new byte[a.get(i-1).length+hash.length];
                        System.arraycopy(a.get(i-1), 0, dest, 0, a.get(i-1).length);
                        System.arraycopy(hash, 0, dest, a.get(i-1).length, hash.length);
                        b.clear();
                        b.add(dest);
                    }else{
                        System.out.println(bytesToHex(hash));}
                }

            }

        }catch(Exception e){
            e.printStackTrace();
        }finally{
            if(inStream!=null)
                inStream.close();
            if(bis!=null)
                bis.close();
        }   
    }
    final protected static char[] hexArray = "0123456789ABCDEF".toCharArray();
    public static String bytesToHex(byte[] bytes) {
        char[] hexChars = new char[bytes.length * 2];
        for ( int j = 0; j < bytes.length; j++ ) {
            int v = bytes[j] & 0xFF;
            hexChars[j * 2] = hexArray[v >>> 4];
            hexChars[j * 2 + 1] = hexArray[v & 0x0F];
        }
        return new String(hexChars);
    }
}

In the second attempt, I read the file as 1KB chunks directly. For some reason, the hashing takes a very long time in this attempt.

package com.test;

import java.io.BufferedInputStream;
import java.io.FileInputStream;
import java.io.InputStream;
import java.security.MessageDigest;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ReadFileByteByByte2 {

   public static void main(String[] args) throws Exception {

      InputStream inStream = null;
      BufferedInputStream bis = null;

      try{
         inStream = new FileInputStream("C:\\aa.mp4");

         bis = new BufferedInputStream(inStream);

         int numByte = bis.available();

         System.out.println(numByte/1024);
         ArrayList<byte[]> a = new ArrayList<>();
         ArrayList<byte[]> b = new ArrayList<>();
         byte[] buf = new byte[numByte];
         int ii=0;
         while(bis.read(buf, ii, 1024)!=-1){
                 a.add(buf);
         }
         System.out.println(a.size());
         for(int i=a.size()-1,j=-1;i>=0;i--,j++){
             MessageDigest digest = MessageDigest.getInstance("SHA-256");
             if(i==a.size()-1){
                 System.out.println(a.get(i).toString());
                 byte[] hash = digest.digest(a.get(i));
                 byte[] dest = new byte[a.get(i).length+hash.length];
                 System.arraycopy(a.get(i-1), 0, dest, 0, a.get(i-1).length);
                 System.arraycopy(hash, 0, dest, a.get(i-1).length, hash.length);
                 b.add(dest);
             }
             else{
                 System.out.println(i);
                 byte[] hash = digest.digest(b.get(0));
                 if(i!=0){
                     byte[] dest = new byte[a.get(i-1).length+hash.length];
                     System.arraycopy(a.get(i-1), 0, dest, 0, a.get(i-1).length);
                     System.arraycopy(hash, 0, dest, a.get(i-1).length, hash.length);
                     b.clear();
                     b.add(dest);
                 }else{
                 System.out.println(bytesToHex(hash));}
             }

         }

         }catch(Exception e){
            e.printStackTrace();
         }finally{
            if(inStream!=null)
               inStream.close();
            if(bis!=null)
               bis.close();
      } 
   }
   final protected static char[] hexArray = "0123456789ABCDEF".toCharArray();
   public static String bytesToHex(byte[] bytes) {
        char[] hexChars = new char[bytes.length * 2];
        for ( int j = 0; j < bytes.length; j++ ) {
            int v = bytes[j] & 0xFF;
            hexChars[j * 2] = hexArray[v >>> 4];
            hexChars[j * 2 + 1] = hexArray[v & 0x0F];
        }
        return new String(hexChars);
    }
}

Any help is really appreciated. Thanks in advance.

Firstly, you must use DataInputStream.readFully() to ensure you really do get 1k chunks, and make sure you don't use it on the last chunk if it is shorter than the others. read() isn't guaranteed to fill the buffer, or return any count greater than one. See the Javadoc.
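As a rough illustration of this first point, the sketch below reads a file into 1KB chunks with DataInputStream.readFully(), sizing the last chunk from the remaining byte count so readFully() is never asked for more bytes than exist. The class name ChunkReader and the method readChunks are made up for this example; they are not from the question.

```java
import java.io.BufferedInputStream;
import java.io.DataInputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, named for illustration only.
class ChunkReader {
    static final int CHUNK_SIZE = 1024;

    // Returns the file contents as CHUNK_SIZE-byte blocks; only the last block may be shorter.
    static List<byte[]> readChunks(File file) throws IOException {
        List<byte[]> chunks = new ArrayList<>();
        long remaining = file.length();
        try (DataInputStream in = new DataInputStream(
                new BufferedInputStream(new FileInputStream(file)))) {
            while (remaining > 0) {
                int size = (int) Math.min(CHUNK_SIZE, remaining);
                byte[] chunk = new byte[size]; // a fresh array for every chunk
                in.readFully(chunk);           // blocks until exactly 'size' bytes are read
                chunks.add(chunk);
                remaining -= size;
            }
        }
        return chunks;
    }

    public static void main(String[] args) throws IOException {
        if (args.length == 0) {
            System.err.println("usage: ChunkReader <file>");
            return;
        }
        System.out.println(readChunks(new File(args[0])).size() + " chunks");
    }
}
```

Allocating a new array per chunk matters here: adding the same buffer to the list repeatedly, as the second attempt does, leaves every list entry pointing at one shared array.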

Secondly, you are misusing available(). It doesn't do what you're using it for: it tells you how many bytes can be read without blocking. It isn't valid as an end-of-stream test, nor as a means of getting the length of the stream. See the Javadoc. In this case you don't need it at all: just use File.length().
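A minimal sketch of the File.length() approach, assuming the goal is to know how many 1KB chunks the file will produce. The class ChunkCount and its method are hypothetical names. Note that a plain division such as numByte/1024 drops a trailing partial chunk, so the sketch rounds up instead.

```java
import java.io.File;

// Hypothetical helper, named for illustration only.
class ChunkCount {
    // Number of chunks needed to hold fileLength bytes, counting a shorter final chunk.
    static long chunkCount(long fileLength, int chunkSize) {
        return (fileLength + chunkSize - 1) / chunkSize; // ceiling division
    }

    public static void main(String[] args) {
        if (args.length == 0) {
            System.err.println("usage: ChunkCount <file>");
            return;
        }
        File file = new File(args[0]);
        // File.length() reports the size on disk; available() does not.
        System.out.println(chunkCount(file.length(), 1024) + " chunks");
    }
}
```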

Thirdly, you don't literally need to append the hash of a block to the block so you can compute the next hash. Just call digest.update() on the block data and then digest.digest(previousHash), which feeds the previous hash in as a final update and completes the computation. You will get exactly the same value as hashing the concatenation.
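The backward chain from the question could then look roughly like the sketch below. It assumes the chunks are already in a list in file order; ChainedHash and firstBlockHash are hypothetical names. MessageDigest resets itself after each digest() call, so one instance can be reused for every step.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, named for illustration only.
class ChainedHash {
    // Walks the chunks from last to first:
    //   h_last = SHA-256(lastChunk)
    //   h_i    = SHA-256(chunk_i || h_{i+1})
    // and returns h_0, or null if the list is empty.
    static byte[] firstBlockHash(List<byte[]> chunks) throws NoSuchAlgorithmException {
        MessageDigest digest = MessageDigest.getInstance("SHA-256");
        byte[] hash = null;
        for (int i = chunks.size() - 1; i >= 0; i--) {
            digest.update(chunks.get(i));
            // digest(byte[]) does a final update with its argument, then finishes and resets.
            hash = (hash == null) ? digest.digest() : digest.digest(hash);
        }
        return hash;
    }

    public static void main(String[] args) throws NoSuchAlgorithmException {
        // Tiny in-memory chunks, just to show the call shape.
        List<byte[]> chunks = Arrays.asList(new byte[] {1, 2, 3}, new byte[] {4, 5});
        StringBuilder hex = new StringBuilder();
        for (byte b : firstBlockHash(chunks)) {
            hex.append(String.format("%02X", b));
        }
        System.out.println(hex);
    }
}
```

No concatenated arrays are built, so there is no need for the second list or the System.arraycopy() calls from the question.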

Fourth, I'm wondering whether you have understood your requirement correctly. It would make more sense to compute the hashes in the forward direction. Then you wouldn't need to read the entire file into memory at all. The added integrity is the same either way.
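For comparison, a forward-direction chain might be sketched as follows. This is one possible reading of "forward" (each chunk is hashed together with the hash of the chunk before it), not something the question's requirement states. ForwardChain, forwardChainHash and fill are hypothetical names. Only one 1KB buffer is held in memory at a time.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper, named for illustration only.
class ForwardChain {
    // Reads until buf is full or the stream ends; returns the number of bytes read.
    private static int fill(InputStream in, byte[] buf) throws IOException {
        int total = 0;
        while (total < buf.length) {
            int n = in.read(buf, total, buf.length - total);
            if (n < 0) {
                break; // end of stream
            }
            total += n;
        }
        return total;
    }

    // Assumed scheme: h_0 = SHA-256(chunk_0), h_i = SHA-256(chunk_i || h_{i-1}).
    // Returns the hash for the final chunk, or null for an empty stream.
    static byte[] forwardChainHash(InputStream in, int chunkSize)
            throws IOException, NoSuchAlgorithmException {
        MessageDigest digest = MessageDigest.getInstance("SHA-256");
        byte[] chunk = new byte[chunkSize];
        byte[] hash = null;
        int n;
        while ((n = fill(in, chunk)) > 0) {
            digest.update(chunk, 0, n); // the last chunk may be shorter than chunkSize
            hash = (hash == null) ? digest.digest() : digest.digest(hash);
        }
        return hash;
    }

    public static void main(String[] args) throws Exception {
        byte[] data = new byte[2500]; // stand-in for file contents
        byte[] hash = forwardChainHash(new ByteArrayInputStream(data), 1024);
        System.out.println(hash.length + " byte hash");
    }
}
```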
