
AES decryption using PyCrypto in Python raises "builtins.UnicodeDecodeError: 'utf-8' codec can't decode byte 0x80 in position 0: invalid start byte"

I am using the following implementation of an AES cipher:

import base64
import hashlib

from Crypto import Random
from Crypto.Cipher import AES

class AESCipher:
    def __init__(self, key):
        self.BS = 128  # block size used for padding, in bytes
        try:
            # str key: encode to bytes before hashing
            self.key = hashlib.sha256(key.encode()).digest()[:self.BS]
        except AttributeError:
            # bytes key: hash it directly
            self.key = hashlib.sha256(key).digest()[:self.BS]
        self.iv = Random.new().read(AES.block_size)
    def encrypt(self, raw):
        raw = self._pad(raw)
        cipher = AES.new(self.key, AES.MODE_CBC, self.iv)
        # the IV is prepended so that the receiver can recover it
        return base64.b64encode(self.iv + cipher.encrypt(raw))
    def decrypt(self, enc):
        enc = base64.b64decode(enc)
        self.iv = enc[:AES.block_size]
        cipher = AES.new(self.key, AES.MODE_CBC, self.iv)
        # .decode() assumes the plaintext is UTF-8 text; the traceback points at this line
        return self._unpad(cipher.decrypt(enc[AES.block_size:])).decode()
    def _pad(self, s):
        return s + (self.BS - len(s) % self.BS) * chr(self.BS - len(s) % self.BS).encode()
    @staticmethod
    def _unpad(s):
        # the last byte holds the number of padding bytes to strip
        return s[:-ord(s[len(s)-1:])]
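
A side note on `_pad`: `chr(n).encode()` produces a single byte only while `n` is below 128. With `self.BS = 128`, an input whose length is already a multiple of 128 gets a pad count of 128, and `chr(128).encode()` is two bytes in UTF-8, so `_unpad` would no longer strip exactly what `_pad` added. The following is a minimal sketch of the same padding scheme done purely on bytes. The names `BLOCK_SIZE`, `pad` and `unpad` are hypothetical, and the block size of 16 is assumed to match `AES.block_size`.

```python
BLOCK_SIZE = 16  # assumed to equal AES.block_size (bytes)

def pad(data: bytes, block_size: int = BLOCK_SIZE) -> bytes:
    """Append n copies of the byte value n, where 1 <= n <= block_size."""
    n = block_size - len(data) % block_size
    return data + bytes([n]) * n

def unpad(padded: bytes) -> bytes:
    """Strip as many bytes as the value of the last byte says."""
    return padded[:-padded[-1]]

# chr(...).encode() stops being one byte per character at 128:
utf8_128 = chr(128).encode()   # two bytes in UTF-8
raw_128 = bytes([128])         # exactly one byte

print(len(utf8_128), len(raw_128))
print(unpad(pad(b"hello")))
```

Because `bytes([n])` never goes through a text codec, no encoding name is needed anywhere in the padding code.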

Encrypting a binary-encoded (pickled) dictionary object raises no errors, but when I try to decrypt the same encrypted object, the following exception is raised:

return self._unpad(cipher.decrypt(enc[AES.block_size:])).decode()
builtins.UnicodeDecodeError: 'utf-8' codec can't decode byte 0x80 in position 0: invalid start byte
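
One way to see where the 0x80 byte comes from is to inspect pickled data directly. The sketch below uses only the standard library. The `sample` dictionary is a made-up stand-in for the real payload, and protocol 3 is pinned on the assumption that it matches the dump shown later in the question, which starts with b'\x80\x03'.

```python
import pickle

# Hypothetical sample dictionary standing in for the real payload.
sample = {"id": "96e09f6c", "request_code": 1}

# Protocol 3 is an assumption based on the b'\x80\x03' prefix of the dump in the question.
blob = pickle.dumps(sample, protocol=3)

first_byte = blob[0]  # the pickle PROTO opcode

try:
    blob.decode()  # default codec is UTF-8
    decode_error = None
except UnicodeDecodeError as exc:
    decode_error = exc

print(hex(first_byte))
print(decode_error)
```

Pickle output is arbitrary binary data rather than text, so decoding it with the default UTF-8 codec can fail on the very first byte even when the decryption itself succeeded.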

I tried using the 'ISO' and 'latin' encodings in the encode and decode calls. After that, however, the socket on the other side of the LAN treats the data as a string and not as a dictionary object.

My question: what am I doing wrong here?

Additional information:

key = 'SHSJDS-DSJBSJDS-DSKNDS'  # something following this pattern

bin_json_object = pickle.dumps(dict_object)
enc_json_object = AESenc(bin_json_object, key)

def AESenc(self, data, key):
    return AESCipher(key).encrypt(data)

def AESdec(self, data, key):
    return AESCipher(key).decrypt(data)

For example, if I use "ISO-8859-1" encoding in the above code:

Binary-encoded representation of the dictionary object:

b'\x80\x03}q\x00(X\x02\x00\x00\x00idq\x01X$\x00\x00\x0096e09f6c-1e80-4cd1-9225-159e35bcacb4q\x02X\x0c\x00\x00\x00request_codeq\x03K\x01X\x0e\x00\x00\x00payload_lengthq\x04K!X\x0b\x00\x00\x00session_keyq\x05Nu.'

Encrypted representation of the binary-encoded dictionary object:

 b'cZi+L4Wi51B5oDGQKlFb9bioxKH3TFRO1piECklafwTe6GYm/VeVjJaCDKiI+o6f6CcUnMvx+2EfEwcHCH/KDDeHTivIUou7WGVrd1P++HxfYNutY/aOn30Y/yiICvwWRHBn/3zU3xXvr/4XrtoVddM2cQEgXupIcC99TIxurrr8CCZd74ZnWj6QB8quCtHD'

But when I decrypt the same data on another node on the same LAN, after receiving it over a socket, I get the following decrypted representation:

}q(XidqX$96e09f6c-1e80-4cd1-9225-159e35bcacb4qX
                                                              request_codeqKXpayload_lengthqK!X
                  session_keyqNu.

This looks completely different from the original binary representation of the same dictionary object, and it produces the following exception:

data = pickle.loads(data)
builtins.TypeError: 'str' does not support the buffer interface
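
This TypeError appears to come from handing pickle.loads() a str instead of bytes. Below is a minimal standard-library sketch, assuming the decrypted value went through a .decode('ISO-8859-1') call as described above. The dictionary is a hypothetical sample, and newer Python versions word the error message differently from the traceback shown here.

```python
import pickle

# Hypothetical sample payload.
blob = pickle.dumps({"request_code": 1, "session_key": None})

# What a decrypt() ending in .decode('ISO-8859-1') would hand back: a str, not bytes.
as_text = blob.decode("ISO-8859-1")

try:
    pickle.loads(as_text)
    load_error = None
except TypeError as exc:
    load_error = exc

# The untouched bytes load without trouble.
restored = pickle.loads(blob)

print(type(as_text).__name__, type(load_error).__name__)
print(restored)
```

Printing the str also renders control characters literally instead of as \x escapes, which may be why the decrypted output looks so different from the bytes repr even though it carries the same values.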

Finally, after hours of debugging, I came up with working code, but I do not understand why it works. I would appreciate it if someone could explain this in the comments. Modified version of the AES cipher code:

class AESCipher:
    def __init__(self, key):
        self.BS = AES.block_size
        try:
            self.key = hashlib.sha256(key.encode('ISO-8859-1')).digest()[:self.BS]
        except:
            self.key = hashlib.sha256(key).digest()[:self.BS]
        self.iv = Random.new().read(AES.block_size)
    def encrypt(self, raw):
        raw = self._pad(raw)
        cipher = AES.new(self.key, AES.MODE_CBC, self.iv)
        return base64.b64encode(self.iv + cipher.encrypt(raw))
    def decrypt(self, enc):
        enc = base64.b64decode(enc)
        self.iv = enc[:AES.block_size]
        cipher = AES.new(self.key, AES.MODE_CBC, self.iv)
        return self._unpad(cipher.decrypt(enc[AES.block_size:])).decode('ISO-8859-1')
    def _pad(self, s):
        return s + (self.BS - len(s) % self.BS) * chr(self.BS - len(s) % self.BS).encode('ISO-8859-1')
    @staticmethod
    def _unpad(s):
        print('returning : ', s[:-ord(s[len(s)-1:])])
        return s[:-ord(s[len(s)-1:])]
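
Besides the encoding arguments, changing `self.BS` from 128 to `AES.block_size` also changes the length of the derived key, because the SHA-256 digest is sliced with `[:self.BS]`. A small standard-library sketch of that effect follows. The key string is the placeholder pattern from the question, and the variable names are hypothetical.

```python
import hashlib

# Placeholder key following the pattern given in the question.
digest = hashlib.sha256("SHSJDS-DSJBSJDS-DSKNDS".encode()).digest()

# First version: self.BS = 128, so the slice is longer than the 32-byte digest.
key_first_version = digest[:128]

# Modified version: self.BS = AES.block_size, which is 16.
key_modified_version = digest[:16]

print(len(key_first_version), len(key_modified_version))
```

Since the AES variant is selected by key length, the first class would run AES-256 and the modified class AES-128, so two nodes would presumably need to run the same version of the class to decrypt each other's messages.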

Next, without modifying the AES encryption and decryption functions, I introduced the following variation in the code. Whenever another node receives a binary stream, it first decrypts it with the AES decrypt function. After decryption, however, the encoded dictionary object has to be encoded again with 'ISO-8859-1', as shown below:

dict_object = self.AESdecryption(binary_stream, self.session_key)
dict = pickle.loads(dict_object.encode('ISO-8859-1'))
print(dict)

The above produces the correct dictionary object. What I don't understand is this: the dictionary object was encrypted with 'ISO-8859-1' encoding and then decrypted on the other node with 'ISO-8859-1' encoding, so why do I have to encode it again before passing it to pickle.loads() to get the original dictionary object? Could someone please explain why this happens?
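
A likely explanation, sketched with the standard library only: ISO-8859-1 maps each of the 256 possible byte values to the code point with the same number, so decoding bytes to str and encoding back is lossless. The modified decrypt() ends in .decode('ISO-8859-1') and therefore returns a str, while pickle.loads() needs bytes, so the extra .encode('ISO-8859-1') simply undoes that decode. The sample dictionary below is hypothetical.

```python
import pickle

# Hypothetical sample payload.
original = pickle.dumps({"session_key": None, "payload_length": 33})

# Every byte value survives a decode/encode round trip through ISO-8859-1.
all_bytes = bytes(range(256))
round_trip_all = all_bytes.decode("ISO-8859-1").encode("ISO-8859-1")

# What the modified decrypt() returns, and what the receiver does to it afterwards.
text = original.decode("ISO-8859-1")     # str
recovered = text.encode("ISO-8859-1")    # bytes again

print(round_trip_all == all_bytes)
print(recovered == original)
print(pickle.loads(recovered))
```

Under that reading, the pair of calls cancels out, and an alternative would be to let decrypt() return the unpadded bytes without any .decode() so that the result can go straight into pickle.loads().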
