Decrypting a GPG-encrypted file in Python with raw key

I'm trying to use Python to decrypt a GPG-encrypted file using the raw key. Not the passphrase, not a nicely formatted file from a keyring, just the literal raw bytes of the key that the file was encrypted with.

I first created a test file:

~$ echo "It would be really cool if this worked" >> PGPDecryptorTest1.txt

I then encrypted the file using AES256 with the passphrase "a" and SHA256 key derivation:

~$ gpg --symmetric --s2k-mode 0 --s2k-digest-algo SHA256 --cipher-algo AES256 PGPDecryptorTest1.txt
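
With --s2k-mode 0 (simple S2K) and SHA256, the key for a 256-bit cipher is a single SHA-256 digest of the passphrase, so it can be reproduced without gpg. A minimal sketch using only hashlib; the helper name simple_s2k_key is made up for illustration, and the shortcut only holds when the digest is at least as long as the cipher key:

```python
import hashlib

def simple_s2k_key(passphrase: bytes) -> bytes:
    # Simple S2K: hash the passphrase once, with no salt and no iteration.
    return hashlib.sha256(passphrase).digest()

key = simple_s2k_key(b"a")
print(key.hex())
```

The printed digest can be compared byte for byte with a hard-coded key.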

I wrote the following short script to decode the file:

import sys 
from Crypto.Cipher import AES 

# With s2k-mode 0 specified, key is just SHA256 hash of passphrase
hash_a = b"\xca\x97\x81\x12\xca\x1b\xbd\xca\xfa\xc2\x31\xb3\x9a\x23\xdc\x4d\xa7\x86\xef\xf8\x14\x7c\x4e\x72\xb9\x80\x77\x85\xaf\xee\x48\xbb"
key = hash_a

def main(filename):
    
    with open(filename, "rb") as f:

        # First 9 bytes are header, ignore them and read the rest
        contents = f.read()[9:]

        # IV is size (block size + 2)
        # AES uses 16-byte (128-bit) blocks
        # Last two bytes are for checksum
        iv = contents[0:16 + 2]

        # Rest of contents should be ciphertext
        ciphertext = contents[16 + 2:]

        # Use openPGP special cipher mode
        cipher = AES.new(key, AES.MODE_OPENPGP, iv=iv)
        plaintext = cipher.decrypt(ciphertext)

        print("Output: " + str(plaintext))

if __name__ == "__main__":
    if (len(sys.argv) > 1): 
        main(sys.argv[1])
    else:
        main(input("Please specify an input file: "))

However, the output for this program is unintelligible garbage.

~$ python3 PGPDecryptor.py PGPDecryptorTest1.txt.gpg
Output: b'\x11\xd6\xf4\x8d\xf7/o.\x13k#D\xd1!\xce\xf5\xf9\xd9\x0b,\xdb\xe4\xd6,\xb8\x80\xcb2N\xd1^\x96\x8chP\xfb\xb0?Z\xb2\xed?\xce==\xfb9\xcf5o{\xb6\x12\xf3\xf7\xc9QC\xc3\xb5\xe4\x95ab?\x17\x9d\xd3\xd3\xc6\xa8j#K\x8cMf\xc6\x00V\x89Y\xe2\xe7~\xc4B\xd5\x1b\x8f\xe9&t'

I have verified the key by other methods, so I'm confident that it's correct. I must be very close to a proper solution, because changing either the IV or the key even slightly causes the following error to appear:

ValueError: Failed integrity check for OPENPGP IV

This suggests that I'm getting the key and IV correct. I've tried a nested for loop to try every valid combination of start and end indices for the ciphertext, just in case there was some additional garbage/header data somewhere, but with equally useless output for every combination.

If anyone can tell me what I'm doing wrong/how to correct it, I'd be very grateful. I suspect the error is very simple, but the nature of the problem makes it difficult to troubleshoot.

I currently have a janky alternative solution that involves modifying the pgpy library, but my problem with this is that importing large files to process (~500MB) takes a long time (~20-30 minutes). I looked at gnupg as well, but it's just a wrapper--it can decrypt with passphrases, but not with raw keys.

Using AES.MODE_OPENPGP would probably work for a Symmetrically Encrypted Data packet (tag 9), as it simply contains the encrypted data (RFC 4880, section 5.7).

However, that's not what you've produced with your gpg invocation. To get some insight into what we're actually dealing with, you can use the --list-packets command:

$ gpg --list-packets --verbose PGPDecryptorTest1.txt.gpg
gpg: AES256.CFB encrypted data
gpg: pinentry launched (90945 curses 1.1.0 /dev/pts/6 screen -)
gpg: encrypted with 1 passphrase
# off=0 ctb=8c tag=3 hlen=2 plen=4
:symkey enc packet: version 4, cipher 9, aead 0,s2k 0, hash 8
# off=6 ctb=d2 tag=18 hlen=2 plen=112 new-ctb
:encrypted data packet:
        length: 112
        mdc_method: 2
# off=27 ctb=a3 tag=8 hlen=1 plen=0 indeterminate
:compressed packet: algo=1
# off=29 ctb=ac tag=11 hlen=2 plen=66
:literal data packet:
        mode b (62), created 1652044966, name="PGPDecryptorTest1.txt",
        raw data: 39 bytes

Two things of note:

  • The encrypted data packet is tag 18, which is a Symmetrically Encrypted Integrity Protected Data packet. We're no longer dealing with only the output of the cipher, but with data preceded by a version number and suffixed with a Modification Detection Code packet (RFC 4880, section 5.13).
  • The encrypted data packet's contents are compressed.
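
On the second point: algo=1 in the listing is OpenPGP's ZIP algorithm, which is a raw DEFLATE stream (RFC 1951) with no zlib header, so Python's zlib module has to be told to expect a headerless stream through a negative wbits value. A minimal round-trip sketch, assuming the compressed packet body has already been decrypted and isolated; inflate_openpgp_zip is an illustrative name, not part of any library:

```python
import zlib

def inflate_openpgp_zip(body: bytes):
    """Inflate a raw DEFLATE stream; return (data, bytes left after the stream)."""
    d = zlib.decompressobj(-15)  # -15: raw DEFLATE, no zlib header or checksum
    data = d.decompress(body) + d.flush()
    return data, d.unused_data

# Build a raw DEFLATE stream locally so the sketch is self-contained.
sample = b"It would be really cool if this worked\n"
c = zlib.compressobj(6, zlib.DEFLATED, -15)
raw = c.compress(sample) + c.flush()

# Stand-in for whatever follows the compressed packet (e.g. the MDC packet).
trailer = b"\xd3\x14" + b"\x00" * 20
inflated, rest = inflate_openpgp_zip(raw + trailer)
print(inflated, len(rest))
```

Using a decompress object rather than zlib.decompress matters here because the compressed packet has indeterminate length: anything after the end of the DEFLATE stream is handed back in unused_data instead of being treated as part of the stream.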

WARNING: The code below is just a rough demonstration of poking around the OpenPGP message format. It is brittle and shouldn't be reused. The main takeaway is that reliably parsing OpenPGP messages isn't trivial and you should use a well-tested library.

The main references I used are:


To more easily demonstrate digging into this, I've produced an encrypted message with compression turned off:

$ cat PGPDecryptorTest1.txt
It would be really cool if this worked

$ gpg --symmetric -o PGPDecryptorTest1.txt.uncompressed.gpg --compress-level 0 --s2k-mode 0 --s2k-digest-algo SHA256 --cipher-algo AES256 PGPDecryptorTest1.txt
gpg: Note: simple S2K mode (0) is strongly discouraged

$ python3 solution.py PGPDecryptorTest1.txt.uncompressed.gpg
contents(len: 117): b'8c0404090008d26d017795712a4686d1a176a0f150a33b9c972d876948df739b1058a513f916ef8094c80ae65ed022c30e1108d20dbeaeee70285e8736e8184520ceb0c435feafdd856051eb166e96e32e82ba51a3af4d230174e97a8f3a3529606b6558fce716bf3b0e9b856d442f5104f3647af0'
decrypted_iv(len: 16): b'49355c68e8e3eba7cc5ccb529d158a2c'
first_block(len: 16): b'8a2cac42621550475044656372797074'
decrypted_data (first block)(len: 14): b'ac42621550475044656372797074'
decrypted_data(len: 68): b'ac426215504750446563727970746f7254657374312e74787462783732497420776f756c64206265207265616c6c7920636f6f6c206966207468697320776f726b65640a'
plaintext(len: 39): b'497420776f756c64206265207265616c6c7920636f6f6c206966207468697320776f726b65640a'
It would be really cool if this worked
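
The first nine bytes of the contents dump above can be decoded by hand to see where the encrypted data starts. The sketch below is an assumption-laden illustration, not a general parser: read_packet_header is a hypothetical helper that only understands the one-octet length encodings that occur in this sample.

```python
def read_packet_header(buf: bytes, pos: int):
    """Return (tag, header_len, body_len) for the packet starting at pos."""
    ctb = buf[pos]
    if not (ctb & 0x80):
        raise ValueError("not an OpenPGP packet header")
    if ctb & 0x40:
        # New format: tag in the low 6 bits, length octet(s) follow.
        first = buf[pos + 1]
        if first >= 192:
            raise NotImplementedError("multi-octet and partial lengths not handled")
        return ctb & 0x3F, 2, first
    # Old format: tag in bits 5-2, length type in bits 1-0.
    if (ctb & 0x03) != 0:
        raise NotImplementedError("only one-octet old-format lengths handled")
    return (ctb & 0x3C) >> 2, 2, buf[pos + 1]

# Leading bytes copied from the contents dump of the uncompressed sample.
head = bytes.fromhex("8c0404090008d26d01")

pos = 0
tag, hlen, blen = read_packet_header(head, pos)
version, cipher_algo, s2k_type, hash_algo = head[pos + hlen:pos + hlen + blen]
print("tag", tag, "version", version, "cipher", cipher_algo,
      "s2k", s2k_type, "hash", hash_algo)

pos += hlen + blen
tag2, hlen2, blen2 = read_packet_header(head, pos)
seipd_version = head[pos + hlen2]
data_offset = pos + hlen2 + 1  # skip the one-octet version number
print("tag", tag2, "length", blen2, "encrypted data starts at", data_offset)
```

The values line up with the --list-packets output (cipher 9, s2k 0, hash 8), and the computed offset is where the encrypted prefix begins in this particular file.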

Here's the solution implementation:

import binascii
import hashlib
import sys
from Cryptodome.Cipher import AES

def print_bytes(name, data):
    print("%s(len: %d): %s" % (name, len(data), str(binascii.hexlify(data))))

def main(filename):
    # Generate key material from the passphrase.
    passphrase = b"a"
    m = hashlib.sha256()
    m.update(passphrase)
    key = m.digest()

    # Get file contents.
    with open(filename, "rb") as f:
        contents = f.read()
    print_bytes("contents", contents)

    # Constants
    header_len = 9  # 6-octet tag-3 packet + 2-octet tag-18 header + 1-octet version identifier
    block_size = 16 # algorithm details should normally be extracted from the header
    segment_size = block_size * 8
    iv_len = block_size
    iv_tag_len = 2
    mdc_len = 22

    # "Manually" decrypting to adhere to
    # https://datatracker.ietf.org/doc/html/rfc4880#section-5.13
    # Doing it this way helps with the integrity check, which I ended up
    # skipping.
    cipher = AES.new(key, AES.MODE_CFB, iv=b"\x00" * block_size, segment_size=segment_size)

    offset = header_len

    decrypted_iv = cipher.decrypt(contents[offset:offset+block_size])
    print_bytes("decrypted_iv", decrypted_iv)

    decrypted_data = bytearray()

    offset += block_size

    first_block = cipher.decrypt(contents[offset:offset+block_size])
    print_bytes("first_block", first_block)

    offset += block_size

    if first_block[:2] != decrypted_iv[-2:]:
        print("IV check failed")
        sys.exit(1)

    decrypted_data.extend(first_block[2:])
    print_bytes("decrypted_data (first block)", decrypted_data)

    padding = block_size - (len(contents)-offset) % block_size
    contents += b"\x00" * padding

    decrypted_data.extend(cipher.decrypt(contents[offset:]))

    # Here is where you should parse the MDC packet and use it for integrity
    # checking. Instead, skipping the check and discarding the packet for 
    # brevity.
    decrypted_data = decrypted_data[:-(mdc_len+padding)]
    print_bytes("decrypted_data", decrypted_data)

    # Extract filename length so we can find the plaintext offset.
    # see https://datatracker.ietf.org/doc/html/rfc4880#section-5.9
    filename_len = decrypted_data[3]
    plaintext_offset = (
        2  # header
        + 1  # file type
        + 1  # filename length
        + filename_len  # filename contents
        + 4  # timestamp
    )
    plaintext = decrypted_data[plaintext_offset:]
    print_bytes("plaintext", plaintext)
    print(plaintext.decode())


if __name__ == "__main__":
    if (len(sys.argv) > 1):
        main(sys.argv[1])
    else:
        print("first argument must be file to decrypt")
        sys.exit(1)
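
The integrity check skipped in the script can be sketched separately. Per RFC 4880 section 5.13, the MDC body is a SHA-1 hash over the decrypted prefix (16 random octets plus the 2 repeated ones), the plaintext, and the MDC packet's own two header octets 0xD3 0x14. The function name and the synthetic test data below are assumptions for illustration only:

```python
import hashlib
import hmac

MDC_HEADER = b"\xd3\x14"  # new-format header: tag 19, body length 20
MDC_LEN = len(MDC_HEADER) + 20

def split_and_verify_mdc(prefix: bytes, decrypted: bytes) -> bytes:
    """Return the payload without its trailing MDC packet.

    prefix:    the 18 decrypted prefix octets (16 random + 2 repeated)
    decrypted: everything decrypted after the prefix, MDC packet included
    Raises ValueError if the MDC is missing or does not match.
    """
    if len(decrypted) < MDC_LEN or decrypted[-MDC_LEN:-20] != MDC_HEADER:
        raise ValueError("MDC packet not found")
    payload, digest = decrypted[:-MDC_LEN], decrypted[-20:]
    expected = hashlib.sha1(prefix + payload + MDC_HEADER).digest()
    if not hmac.compare_digest(expected, digest):
        raise ValueError("MDC mismatch")
    return payload

# Synthetic data standing in for real decrypted output.
prefix = bytes(range(16)) + bytes([14, 15])
payload = b"example payload"
body = payload + MDC_HEADER + hashlib.sha1(prefix + payload + MDC_HEADER).digest()
print(split_and_verify_mdc(prefix, body))
```

In the script above, the corresponding inputs would be decrypted_iv + first_block[:2] for the prefix and the decrypted bytes with the zero padding removed but before the MDC is trimmed off.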
