
Attach unencrypted tag data to encrypted file

I hope this is the right place for my question, since there's definitely more than one way to do it.

I have a file format (XML) that I compress and encrypt. I now want to attach some basic unencrypted metadata to the file, so that certain parameters are easy to access.

Is there a right way to do this? If not, what are some best practices to keep in mind?

The approach I'm considering is to use Bouncy Castle in C# to encrypt the actual data and to prepend the tag data to the file.

For example:

<metadata>
    //tag information about the file
</metadata>
<secretdata>
    //Grandma's secret recipe
</secretdata>

After encrypting the secret data only:

<metadata>
    //tag information about the file
</metadata>
^&RF&^Tb87tyfg76rfvhjb8
hnjikhuhik*&GHd65rh87yn
NNCV&^FVU^R75rft78b875t

Combining non-encrypted and encrypted data using XML like you do is indeed one way to go. There are a few drawbacks which may or may not be relevant in your situation:

  • The result is not very compact. If the encrypted data is large, you should consider storing it directly in binary format. CDATA may be a compromise, although the range of characters you can put in a CDATA section is limited as well.

  • Parsing the XML may be slow if the encrypted data is large. It also often requires keeping the whole document in memory, which is probably not what you want. Again, storing the encrypted data directly in binary format is a solution. CDATA won't help here.

  • The benefit of XML is that it is readable by a human. That is relevant for the metadata, but it seems odd when most of the data is encrypted anyway.
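To put a number on the size cost in the first point: raw encrypted bytes cannot sit inside an XML element as they are, so they usually have to be text-encoded first. The sketch below assumes Base64 (the answer does not name an encoding) and uses random bytes as a stand-in for ciphertext. The function name is made up for illustration.

```python
import base64
import os


def base64_sizes(n_bytes: int) -> tuple[int, int]:
    """Return (raw size, Base64 size) for n_bytes of random data.

    Random bytes stand in for ciphertext here; this is an assumption
    made for illustration, not the asker's real data.
    """
    ciphertext = os.urandom(n_bytes)
    encoded = base64.b64encode(ciphertext)
    return len(ciphertext), len(encoded)


raw, encoded = base64_sizes(3000)
print(raw, encoded)  # 3000 4000: every 3 input bytes become 4 output characters
```

The roughly one-third growth applies to the whole encrypted payload, which is why a binary container becomes more attractive as the payload gets larger.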

Other alternatives you may consider:

  1. Two files side by side. One contains the binary data, and the other (named identically but with a different extension) holds the metadata, for example in XML format. The difficulty is that you have to handle cases such as a binary data file being present without its metadata file, or the opposite, as well as copying and moving the two files together (NTFS has transactions, but you have to use Interop, unless the latest version of the .NET Framework adds support for Transactional NTFS).

  2. Metadata and encrypted data stored in a single file in binary format. The answer by scottfavre shows one way to do it. I agree with his explanation, but would rather compress the metadata as well, for two reasons: (1) to save space and (2) to prevent end users from modifying the metadata by hand, which would make the header invalid.

    I wouldn't recommend the single binary file approach, since it makes the format difficult to use; the valid case for it would be if you found (after enough benchmarking and profiling) that it brings a significant performance benefit.

  3. Metadata stored in Alternate Data Streams (which are available on NTFS only, so beware of FAT-formatted flash drives). Here, the benefit is that you don't have to deal with offsets stored in a header: NTFS does that for you. But this is not an approach I would recommend either, unless you absolutely need to keep the metadata together with the file and you know that the file will always be stored on NTFS disks (and transferred with ADS-aware applications).
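For the first alternative, the missing-partner cases can be detected with a simple directory scan. This is only a rough sketch: the `.bin` and `.xml` extensions and the function name are assumptions chosen for illustration, and it does not address copying or moving the two files atomically.

```python
from pathlib import Path

DATA_EXT = ".bin"  # hypothetical extension for the encrypted payload
META_EXT = ".xml"  # hypothetical extension for the metadata sidecar


def find_orphans(directory):
    """Return (data files without metadata, metadata files without data).

    Files are matched by base name, e.g. recipe.bin pairs with recipe.xml.
    """
    root = Path(directory)
    data = {p.stem for p in root.glob(f"*{DATA_EXT}")}
    meta = {p.stem for p in root.glob(f"*{META_EXT}")}
    return sorted(data - meta), sorted(meta - data)
```

What to do with an orphan (ignore it, regenerate the metadata, report an error) is a policy decision the format has to make explicitly.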

One challenge here is getting the plain-text XML out of the front of the file while leaving the input stream at exactly the start of the encrypted and compressed data. Since the XML reading libraries in C# were not built with this usage in mind, they may not behave well (for example, the reader may read more bytes than it needs, leaving the underlying stream past the start of the encrypted data).

One possible way to handle it is to prepend a header in a well-known format that provides the length of the XML metadata. So the file would look something like:

Header (5 bytes):
    Version* (1 byte, unsigned int)         = 1
    Metadata Length** (4 bytes, unsigned int) = N

Metadata (N bytes):
    well formed XML

Encrypted Data (rest of file)

(* - including a version field when defining a file format is always a good idea)

(** - if you're going to be exceeding the range of a 32-bit uint for the length of the metadata, you should consider another solution.)

Then you can read the 5-byte header directly, parse out the length of the XML, read exactly that many bytes, and the input stream will be in the right place to start decrypting and decompressing the rest of the file.
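Here is a minimal sketch of that layout, written in Python for brevity (in C#, `BinaryReader` and `BinaryWriter` would play the same role). The byte order is not specified above, so little-endian is an assumption here, and the function names are made up for illustration.

```python
import io
import struct

# Version (1 byte, unsigned) followed by metadata length (4 bytes, unsigned).
# "<" selects little-endian with no padding; the byte order is an assumption.
HEADER = struct.Struct("<BI")
FORMAT_VERSION = 1


def write_container(out, metadata_xml: bytes, encrypted: bytes) -> None:
    """Write header, then the XML metadata, then the encrypted payload."""
    out.write(HEADER.pack(FORMAT_VERSION, len(metadata_xml)))
    out.write(metadata_xml)
    out.write(encrypted)


def read_exact(stream, n: int) -> bytes:
    """Read exactly n bytes or fail, so a truncated file is detected."""
    buf = stream.read(n)
    if len(buf) != n:
        raise ValueError(f"expected {n} bytes, got {len(buf)}")
    return buf


def read_metadata(stream) -> bytes:
    """Return the XML metadata, leaving the stream at the encrypted data."""
    version, length = HEADER.unpack(read_exact(stream, HEADER.size))
    if version != FORMAT_VERSION:
        raise ValueError(f"unsupported format version {version}")
    return read_exact(stream, length)


# Usage: round-trip through an in-memory stream.
buf = io.BytesIO()
write_container(buf, b"<metadata/>", b"\x00\x01\x02")
buf.seek(0)
xml = read_metadata(buf)  # b"<metadata/>"
rest = buf.read()         # the encrypted bytes, ready for the decryptor
```

Because the metadata is read as a fixed-size byte slice and only then handed to an XML parser, the parser never touches the underlying stream and cannot over-read into the encrypted data.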

Of course, now that you've got a binary header, you could consider just having the metadata in the header itself, instead of putting it in XML.

