Understanding how Rfc2898DeriveBytes works

Question

I'm writing an encryption sequence for sensitive data in our database.

Currently I'm taking a GUID based on the UserId and putting it through a hash. Then I run the hash through Rfc2898DeriveBytes to get a key and IV, which I use to encrypt the data with the Rijndael algorithm.

My code looks like this:

        var salt = new byte[] { 1, 2, 23, 234, 37, 48, 134, 63, 248, 4 };
        const int iterations = 1000;
        using (var rfc2898DeriveBytes = new Rfc2898DeriveBytes(GenerateHash("2525"), salt, iterations)) {
            _key = rfc2898DeriveBytes.GetBytes(32);
            _iv = rfc2898DeriveBytes.GetBytes(16);
        }

I then pass _key and _iv along to encrypt or decrypt the data. My goal is for each user to always have access to their unique key in every session. That being said, what can be randomized while still maintaining this? Do I always have to use the same salt and the same IV to get back the data I want?

Answer 1

Rfc2898DeriveBytes is an implementation of PBKDF2. The name refers to RFC 2898, the standard in which this Password-Based Key Derivation Function is defined. Note that the standard is broader than just the KDF; its full title is "PKCS #5: Password-Based Cryptography Specification, Version 2.0".

PBKDF2 succeeds the function defined in PKCS #5 v1, known as PBKDF or PBKDF1; the 1 was only added after PBKDF2 came into being. The class PasswordDeriveBytes is an implementation of PBKDF1. It should not be used anymore, both because the KDF is outdated and because Microsoft screwed up the implementation severely: it may repeat output keying material if more than the output of the underlying hash (SHA-1, so 20 bytes) is requested.

Besides being used as a KDF, PBKDF2 can also be used as a password hashing function, where the hash is stored in the database instead of the password. That way passwords can be verified, while the password cannot easily be recovered even if an adversary obtains the hash data. This is described in the follow-up RFC 8018, which contains version 2.1 of the specification.
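To illustrate the password hashing use, here is a minimal sketch. The class name, the salt and hash sizes, the iteration count and the choice of SHA-256 are all my own assumptions, not something from the question, and it assumes a runtime where the HashAlgorithmName constructor overload and CryptographicOperations.FixedTimeEquals are available (.NET Core 2.1 or later).

```csharp
using System;
using System.Security.Cryptography;

public static class PasswordHashSketch
{
    // Illustrative parameters (assumptions, tune them for your own system).
    private const int SaltSize = 16;
    private const int HashSize = 32;
    private const int Iterations = 100000;

    // Returns salt || hash, the value that would be stored instead of the password.
    public static byte[] Hash(string password)
    {
        byte[] salt = new byte[SaltSize];
        using (var rng = RandomNumberGenerator.Create())
        {
            rng.GetBytes(salt); // fresh random salt per password
        }
        byte[] hash = Derive(password, salt);
        byte[] stored = new byte[SaltSize + HashSize];
        Buffer.BlockCopy(salt, 0, stored, 0, SaltSize);
        Buffer.BlockCopy(hash, 0, stored, SaltSize, HashSize);
        return stored;
    }

    // Recomputes the hash with the stored salt and compares in constant time.
    public static bool Verify(string password, byte[] stored)
    {
        byte[] salt = new byte[SaltSize];
        byte[] expected = new byte[HashSize];
        Buffer.BlockCopy(stored, 0, salt, 0, SaltSize);
        Buffer.BlockCopy(stored, SaltSize, expected, 0, HashSize);
        byte[] actual = Derive(password, salt);
        return CryptographicOperations.FixedTimeEquals(actual, expected);
    }

    private static byte[] Derive(string password, byte[] salt)
    {
        using (var kdf = new Rfc2898DeriveBytes(password, salt, Iterations, HashAlgorithmName.SHA256))
        {
            return kdf.GetBytes(HashSize);
        }
    }
}
```

The salt is not secret, so it can simply be stored next to the hash. Note that only 32 bytes are requested, which is exactly one SHA-256 output.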

Internally, PBKDF2 is just a repeated application of a keyed hash function (HMAC in practice) over the password and salt. The iteration count is the work factor; it specifies how much work you (and adversaries) have to do before one hash is calculated. The salt makes precomputed (rainbow table) attacks impractical and makes sure that identical passwords (of different users) don't lead to the same hash.
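To make the repetition concrete, here is a rough sketch of how the first output block is computed according to RFC 2898 section 5.2, using HMAC-SHA1. The class and method names are made up for illustration; this is meant for understanding only, not as a replacement for Rfc2898DeriveBytes.

```csharp
using System;
using System.Security.Cryptography;

public static class Pbkdf2Sketch
{
    // Computes T_1 = U_1 xor U_2 xor ... xor U_c (20 bytes with HMAC-SHA1).
    public static byte[] FirstBlock(byte[] password, byte[] salt, int iterations)
    {
        using (var hmac = new HMACSHA1(password))
        {
            // U_1 = HMAC(password, salt || INT(1)); the block index is a 4-byte big-endian integer.
            byte[] input = new byte[salt.Length + 4];
            Buffer.BlockCopy(salt, 0, input, 0, salt.Length);
            input[salt.Length + 3] = 1;

            byte[] u = hmac.ComputeHash(input);
            byte[] t = (byte[])u.Clone();

            // U_j = HMAC(password, U_{j-1}); every U_j is XORed into the result.
            for (int i = 1; i < iterations; i++)
            {
                u = hmac.ComputeHash(u);
                for (int j = 0; j < t.Length; j++)
                {
                    t[j] ^= u[j];
                }
            }
            return t;
        }
    }
}
```

With the same password bytes, salt and iteration count, the result should equal the first 20 bytes returned by a SHA-1 based Rfc2898DeriveBytes. Each further block of 20 bytes repeats the whole loop with the next block index.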

Due to a design error, the full amount of work has to be repeated for every additional block of hash output, so it is not recommended to request more data from PBKDF2 than the output size of the hash function. The code in the question requests 48 bytes from a SHA-1 based PBKDF2, which means three full runs. If more output is needed, it is better to use another method to expand the output keying material (bytes), e.g. HKDF-Expand.
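As a sketch of that expansion step: request a single hash output from PBKDF2 and derive the separate values from it with HKDF-Expand. This assumes .NET 5 or later (for the HKDF class); the class name, the labels "encryption" and "authentication", and the choice of SHA-256 are my own illustrative choices.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

public static class KeyExpansionSketch
{
    // Runs PBKDF2 once for 32 bytes (one SHA-256 output), then expands cheaply with HKDF.
    public static (byte[] EncryptionKey, byte[] MacKey) Derive(string password, byte[] salt, int iterations)
    {
        byte[] master;
        using (var kdf = new Rfc2898DeriveBytes(password, salt, iterations, HashAlgorithmName.SHA256))
        {
            master = kdf.GetBytes(32);
        }

        // Different info labels give independent keys from the same master secret.
        byte[] encryptionKey = HKDF.Expand(HashAlgorithmName.SHA256, master, 32, Encoding.UTF8.GetBytes("encryption"));
        byte[] macKey = HKDF.Expand(HashAlgorithmName.SHA256, master, 32, Encoding.UTF8.GetBytes("authentication"));
        return (encryptionKey, macKey);
    }
}
```

The expensive iterations are paid only once, no matter how many keys are expanded from the master secret.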

Observations on the code in the question:

The GenerateHash method is superfluous; Rfc2898DeriveBytes will do this for you;
You should use something less predictable than a UID to create a key; the input should not be directly available to an attacker, as this would completely defeat the purpose of PBKDF2;
If you want to use the same set of UID + salt + iterations for multiple encryption operations, then you should generate a random IV and prepend it to the ciphertext; having a non-random IV completely defeats the purpose of the IV;
You can change the salt to get multiple keys, but you would have to go through the PBKDF2 function for each and every encryption.
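The random IV suggestion could look roughly like the sketch below. It uses the Aes class (Rijndael with a 128-bit block) in its default CBC mode with PKCS7 padding; the class and method names are hypothetical, and it deliberately leaves out authentication of the ciphertext to stay short.

```csharp
using System;
using System.Security.Cryptography;

public static class IvPrependSketch
{
    // Returns IV || ciphertext; a fresh random IV is generated for every call.
    public static byte[] Encrypt(byte[] key, byte[] plaintext)
    {
        using (var aes = Aes.Create())
        {
            aes.Key = key;
            aes.GenerateIV();
            byte[] iv = aes.IV;
            using (var encryptor = aes.CreateEncryptor())
            {
                byte[] ciphertext = encryptor.TransformFinalBlock(plaintext, 0, plaintext.Length);
                byte[] result = new byte[iv.Length + ciphertext.Length];
                Buffer.BlockCopy(iv, 0, result, 0, iv.Length);
                Buffer.BlockCopy(ciphertext, 0, result, iv.Length, ciphertext.Length);
                return result;
            }
        }
    }

    // Reads the IV back from the front of the message before decrypting the rest.
    public static byte[] Decrypt(byte[] key, byte[] ivAndCiphertext)
    {
        using (var aes = Aes.Create())
        {
            aes.Key = key;
            byte[] iv = new byte[aes.BlockSize / 8];
            Buffer.BlockCopy(ivAndCiphertext, 0, iv, 0, iv.Length);
            aes.IV = iv;
            using (var decryptor = aes.CreateDecryptor())
            {
                return decryptor.TransformFinalBlock(ivAndCiphertext, iv.Length, ivAndCiphertext.Length - iv.Length);
            }
        }
    }
}
```

The IV does not need to be secret, only unpredictable, so storing it in front of the ciphertext is fine. The key stays the same for the user; only the IV changes per encryption.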

Just a general hint: only use the resulting key to encrypt data-specific keys generated by a secure random number generator. Then you don't even need to bother with an IV, and you may be able to "re-encrypt" by decrypting the data-specific key and encrypting it with a new key.
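One possible reading of this hint, sketched below under my own assumptions: a random 32-byte data key is wrapped with the PBKDF2-derived key using plain AES block encryption (ECB, no padding, no IV). That is only tolerable here because the wrapped value is itself a uniformly random key; a real system would more likely use a dedicated, authenticated key-wrap construction. All names are hypothetical.

```csharp
using System;
using System.Security.Cryptography;

public static class KeyWrapSketch
{
    // Generates a random data-specific key; this is the key that encrypts the actual data.
    public static byte[] NewDataKey()
    {
        byte[] key = new byte[32];
        using (var rng = RandomNumberGenerator.Create())
        {
            rng.GetBytes(key);
        }
        return key;
    }

    public static byte[] Wrap(byte[] wrappingKey, byte[] dataKey)
    {
        return Transform(wrappingKey, dataKey, true);
    }

    public static byte[] Unwrap(byte[] wrappingKey, byte[] wrappedKey)
    {
        return Transform(wrappingKey, wrappedKey, false);
    }

    // "Re-encrypting" only touches the small wrapped key, not the data itself.
    public static byte[] Rewrap(byte[] oldWrappingKey, byte[] newWrappingKey, byte[] wrappedKey)
    {
        return Wrap(newWrappingKey, Unwrap(oldWrappingKey, wrappedKey));
    }

    private static byte[] Transform(byte[] wrappingKey, byte[] input, bool encrypt)
    {
        using (var aes = Aes.Create())
        {
            aes.Key = wrappingKey;
            aes.Mode = CipherMode.ECB;       // no IV involved
            aes.Padding = PaddingMode.None;  // input is exactly two AES blocks
            using (var transform = encrypt ? aes.CreateEncryptor() : aes.CreateDecryptor())
            {
                return transform.TransformFinalBlock(input, 0, input.Length);
            }
        }
    }
}
```

When the user's derived key changes, only the wrapped data key has to be unwrapped and wrapped again; the bulk data encrypted under the data key stays untouched.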

solution1
6 ACCEPTED 2012-08-22 21:07:47
