Inline input stream processing in Java

I need some help with the following problem. I am working on a project that deals with files. The user hands me an input stream, and before writing its contents to disk I need to perform several steps:

  • calculate the file digest
  • check that at most one zip entry is present, and unzip the data if it is zipped
  • DOS-to-Unix line-ending conversion
  • record length validation
  • and encrypt and save the file to disk

I also need to abort the flow if any exception occurs during processing. I tried piped output and input streams, but the constraint is that Java recommends running them in two separate threads. Once I have read from the input stream, I cannot reuse it for the other processing steps. The files can be very large, so I cannot cache all the data in a buffer. Please share your suggestions, or let me know if there is a third-party library I can use for this.

The biggest issue is that you'll need to peek ahead in the provided InputStream to decide whether you received a zip file or not.

private boolean isZipped(InputStream is) throws IOException {
    try {
        return new ZipInputStream(is).getNextEntry() != null;
    } catch (final ZipException ze) {
        return false;
    }
}
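
If the stream handed to you is not backed by a file, repositioning through a FileChannel is not available. A possible alternative, sketched here under assumptions, is to wrap the stream in a BufferedInputStream and peek at the first four bytes with mark/reset. The class and method names (ZipPeek, looksLikeZip) are made up for illustration, and the check only recognizes the ordinary local file header signature "PK\003\004", so an empty archive would not be detected.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.Arrays;

// Hypothetical helper, not part of the original answer.
class ZipPeek {
    // Local file header signature that starts a non-empty zip archive.
    private static final byte[] ZIP_MAGIC = {0x50, 0x4B, 0x03, 0x04};

    /** Peeks at the next four bytes and leaves the stream position unchanged. */
    static boolean looksLikeZip(InputStream in) throws IOException {
        if (!in.markSupported()) {
            throw new IllegalArgumentException("stream must support mark/reset");
        }
        in.mark(ZIP_MAGIC.length);
        byte[] head = new byte[ZIP_MAGIC.length];
        int total = 0;
        try {
            // read() may return fewer bytes than requested, so loop until full or EOF
            while (total < head.length) {
                int n = in.read(head, total, head.length - total);
                if (n == -1) {
                    break;
                }
                total += n;
            }
        } finally {
            in.reset(); // rewind to the marked position
        }
        return total == head.length && Arrays.equals(head, ZIP_MAGIC);
    }

    public static void main(String[] args) throws IOException {
        byte[] zipLike = {0x50, 0x4B, 0x03, 0x04, 0x14, 0x00};
        InputStream in = new BufferedInputStream(new ByteArrayInputStream(zipLike));
        System.out.println(looksLikeZip(in));  // true
        System.out.println(in.read() == 0x50); // true: the first byte is still unread
    }
}
```

With this approach the BufferedInputStream would have to be the innermost wrapper, and the DigestInputStream would be created around it only after the peek, so the peeked bytes are digested exactly once.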

After this you need to reset the input stream to its initial position before setting up a DigestInputStream. Then read either through a ZipInputStream wrapped around the DigestInputStream, or from the DigestInputStream directly. After you've done your processing, read the DigestInputStream to the end so you can obtain the digest. The code below has been validated with a wrapping "CountingInputStream" that keeps track of the total number of bytes read from the provided FileInputStream.

    final FileInputStream fis = new FileInputStream(filename);
    final CountingInputStream countIs = new CountingInputStream(fis);

    final boolean isZipped = isZipped(countIs);

    // make sure we reset the inputstream before calculating the digest
    fis.getChannel().position(0);
    final DigestInputStream dis = new DigestInputStream(countIs, MessageDigest.getInstance("SHA-256"));

    // decide which inputStream to use
    InputStream is = null;
    ZipInputStream zis = null;
    if (isZipped) {
        zis = new ZipInputStream(dis);
        zis.getNextEntry();
        is = zis;
    } else {
        is = dis;
    }

    final File tmpFile = File.createTempFile("Encrypted_", ".tmp");
    final OutputStream os = new CipherOutputStream(new FileOutputStream(tmpFile), obtainCipher());
    try {
        readValidateAndWriteRecords(is, os);
        failIf2ndZipEntryExists(zis);
        // closing the CipherOutputStream writes the final cipher block to the file
        os.close();
    } catch (final Exception e) {
        os.close();
        tmpFile.delete();
        throw e;
    }

    System.out.println("Digest: " + obtainDigest(dis));
    dis.close();

    System.out.println("\nValidating bytes read and calculated digest");
    final DigestInputStream dis2 = new DigestInputStream(new CountingInputStream(new FileInputStream(filename)), MessageDigest.getInstance("SHA-256"));
    System.out.println("Digest: " + obtainDigest(dis2));
    dis2.close();

Not really relevant, but these are the helper methods:

private String obtainDigest(DigestInputStream dis) throws IOException {
    final byte[] buff = new byte[1024];
    // drain the stream; every byte read updates the digest
    while (dis.read(buff) != -1) {
        // nothing to do, the bytes themselves are not needed here
    }
    return DatatypeConverter.printBase64Binary(dis.getMessageDigest().digest());
}

private void readValidateAndWriteRecords(InputStream is, final OutputStream os) throws IOException {
    final BufferedReader br = new BufferedReader(new InputStreamReader(is));
    // dos2unix conversion is implicit: readLine() strips \r\n and only \n is written back
    for (String line = br.readLine(); line != null; line = br.readLine()) {
        // record length validation
        if (line.length() < 1) {
            throw new RuntimeException("RecordLengthValidationFailed");
        }
        os.write((line + "\n").getBytes());
    }
}
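
To illustrate why readLine() is enough for the line-ending conversion, here is a small sketch. The class name LineEndingDemo and the choice of UTF-8 are assumptions; the helper above uses the platform default charset. Note that readLine() also treats a lone \r as a line terminator, and that a final line without a terminator gets a \n appended.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of the original answer.
class LineEndingDemo {
    /** Copies text line by line and terminates every line with a single '\n'. */
    static void normalize(InputStream in, OutputStream out) throws IOException {
        // UTF-8 is an assumption; use whatever encoding the records really have
        BufferedReader reader = new BufferedReader(new InputStreamReader(in, StandardCharsets.UTF_8));
        for (String line = reader.readLine(); line != null; line = reader.readLine()) {
            out.write(line.getBytes(StandardCharsets.UTF_8));
            out.write('\n');
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] dos = "rec1\r\nrec2\r\n".getBytes(StandardCharsets.UTF_8);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        normalize(new ByteArrayInputStream(dos), out);
        System.out.println(out.toString("UTF-8").equals("rec1\nrec2\n")); // true
    }
}
```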


private void failIf2ndZipEntryExists(ZipInputStream zis) throws IOException {
    if (zis != null && zis.getNextEntry() != null) {
        throw new RuntimeException("Zip File contains multiple entries");
    }
}
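
The obtainCipher() helper is not shown above. The following is only a guess at what such a helper could look like; the algorithm (AES in CBC mode with PKCS5 padding), the freshly generated key, and all names in CipherSketch are assumptions rather than anything stated in the answer. In a real application the key would come from your key management, and the IV would have to be stored alongside the file to allow decryption.

```java
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.IvParameterSpec;

// Hypothetical stand-in for the obtainCipher() helper that the answer does not show.
class CipherSketch {
    /** Generates a random 128-bit AES key (assumption: no existing key is available). */
    static SecretKey newKey() throws GeneralSecurityException {
        KeyGenerator generator = KeyGenerator.getInstance("AES");
        generator.init(128);
        return generator.generateKey();
    }

    /** Generates a random 16-byte IV, which matches the AES block size. */
    static byte[] newIv() {
        byte[] iv = new byte[16];
        new SecureRandom().nextBytes(iv);
        return iv;
    }

    /** Builds a cipher in encrypt mode, suitable for wrapping in a CipherOutputStream. */
    static Cipher encryptCipher(SecretKey key, byte[] iv) throws GeneralSecurityException {
        Cipher cipher = Cipher.getInstance("AES/CBC/PKCS5Padding");
        cipher.init(Cipher.ENCRYPT_MODE, key, new IvParameterSpec(iv));
        return cipher;
    }

    public static void main(String[] args) throws GeneralSecurityException {
        Cipher cipher = encryptCipher(newKey(), newIv());
        System.out.println(cipher.getAlgorithm());
    }
}
```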

==> output:

Digest: jIisvDleAttKiPkyU/hDvbzzottAMn6n7inh4RKxPOc=
CountingInputStream closed. Total number of bytes read: 1100

Validating bytes read and calculated digest
Digest: jIisvDleAttKiPkyU/hDvbzzottAMn6n7inh4RKxPOc=
CountingInputStream closed. Total number of bytes read: 1072
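
The point about reading the DigestInputStream to the end can be checked in isolation. The sketch below builds a one-entry zip in memory, unzips it through a ZipInputStream layered on a DigestInputStream, drains the rest, and returns the digest so it can be compared with a digest computed directly over the raw bytes. The class and method names are made up for this illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.security.DigestInputStream;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

// Hypothetical demo class, not part of the original answer.
class DigestDrainDemo {
    /** Builds a zip archive with a single entry, entirely in memory. */
    static byte[] oneEntryZip(String name, byte[] content) throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (ZipOutputStream zos = new ZipOutputStream(bos)) {
            zos.putNextEntry(new ZipEntry(name));
            zos.write(content);
            zos.closeEntry();
        }
        return bos.toByteArray();
    }

    /** Unzips the first entry through the digest stream, then drains what is left. */
    static byte[] digestWhileUnzipping(byte[] zipBytes) throws IOException, NoSuchAlgorithmException {
        DigestInputStream dis = new DigestInputStream(
                new ByteArrayInputStream(zipBytes), MessageDigest.getInstance("SHA-256"));
        ZipInputStream zis = new ZipInputStream(dis);
        zis.getNextEntry();
        byte[] buf = new byte[1024];
        while (zis.read(buf) != -1) {
            // the unzipped record data would be processed here
        }
        // the zip stream may stop before the end of the archive, so drain the remainder
        while (dis.read(buf) != -1) {
            // nothing to do, reading updates the digest
        }
        return dis.getMessageDigest().digest();
    }

    public static void main(String[] args) throws Exception {
        byte[] zip = oneEntryZip("data.txt", "rec1\nrec2\n".getBytes(StandardCharsets.UTF_8));
        byte[] streamed = digestWhileUnzipping(zip);
        byte[] direct = MessageDigest.getInstance("SHA-256").digest(zip);
        System.out.println(Arrays.equals(streamed, direct)); // true
    }
}
```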

Fun question, I may have gone overboard with my answer :)
