Strange behaviour of python3 in hex

I'm trying to exploit a buffer overflow, and my command with Python 3 looks like this:

python3 -c "print ('A' * 44 + '\xcb\x85\x04\x08')" | ./vuln

Or the equivalent with Python 2.7:

python2.7 -c "print 'A' * 44 + '\xcb\x85\x04\x08'" | ./vuln

But only the 2.7 version works, so I compared the output of both with hexdump:

python2.7 -c "print 'A' * 44 + '\\xcb\\x85\\x04\\x08'" | hexdump
0000020 4141 4141 4141 4141 4141 4141 85cb 0804
0000030 000a

python3 -c "print ('A' * 44 + '\\xcb\\x85\\x04\\x08')" | hexdump
0000020 4141 4141 4141 4141 4141 4141 8bc3 85c2
0000030 0804 000a

It doesn't depend on the system (I've tried Ubuntu and Arch) or on the terminal (I've tried several).
It looks like Python 3 adds extra bytes and changes the output. Why does it do that, and is this normal?

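In Python 2, a str is a sequence of bytes, so '\xcb' is the single byte 0xcb. This caused problems with non-ASCII text, so it was changed in Python 3: a str is now a sequence of Unicode code points, and print encodes it on output. With a UTF-8 locale, '\xcb' (U+00CB) is written as c3 8b and '\x85' (U+0085) as c2 85, which is exactly the extra data in your hexdump. Python 3 has a separate bytes type, which is the one you want. The easiest way to construct a bytes object is to prefix the literal with b: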

b'A' * 44 + b'\xcb\x85\x04\x08'
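To make the difference concrete, here is a minimal sketch that compares the two literals. It assumes stdout is encoded as UTF-8, which is the usual default on Linux; the variable names are only for illustration.

```python
# A str literal: each \x escape is a Unicode code point, not a raw byte.
text = 'A' * 44 + '\xcb\x85\x04\x08'
# A bytes literal: each \x escape is exactly one byte.
raw = b'A' * 44 + b'\xcb\x85\x04\x08'

# This is what print() effectively does to the str under a UTF-8 locale (assumption).
encoded = text.encode('utf-8')

print(encoded[-6:].hex())      # c38bc2850408
print(raw[-4:].hex())          # cb850408
print(len(encoded), len(raw))  # 50 48
```

The two code points above 0x7f each become two bytes in UTF-8, so the encoded string is two bytes longer and the address is no longer intact.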

However, you can't print a bytes object directly as you could in Python 2. Python 3 prints its representation instead, like this:

b'AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA\xcb\x85\x04\x08'

This output is the ASCII text of the representation, not the raw bytes you want. To write bytes directly to stdout, use sys.stdout.buffer.write:

python3 -c "import sys;sys.stdout.buffer.write(b'A' * 44 + b'\xcb\x85\x04\x08')"

Note that, unlike print, this doesn't write a newline at the end:

0000020 41 41 41 41 41 41 41 41 41 41 41 41 cb 85 04 08
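If the one-liner grows unwieldy, the same idea can be put in a small script. This is only a sketch: build_payload and write_payload are hypothetical helper names, the offset 44 and address 0x080485cb are taken from the question, and a 32-bit little-endian target is assumed. It writes to an in-memory stream so the result can be inspected.

```python
import io
import struct

def build_payload(padding_len, addr):
    """Return padding followed by a 32-bit little-endian address (assumed target layout)."""
    return b'A' * padding_len + struct.pack('<I', addr)

def write_payload(stream, payload, newline=True):
    """Write raw bytes to a binary stream, optionally adding the newline print would add."""
    stream.write(payload + (b'\n' if newline else b''))
    stream.flush()

payload = build_payload(44, 0x080485cb)

# Demonstrate against an in-memory binary stream so the bytes can be checked.
demo = io.BytesIO()
write_payload(demo, payload)
print(demo.getvalue()[-5:].hex())  # cb8504080a
```

To feed the program, pass sys.stdout.buffer as the stream instead of the BytesIO object and pipe the script into ./vuln. Whether the trailing newline is needed depends on how the target reads its input.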

This may all seem like a hassle just to output some bytes, especially compared to Python 2. The reason is that Python 3 greatly improved how human-readable text is output. One consequence is that printing bytes without any encoding is harder, since doing so is normally a mistake when outputting text.

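The trailing 000a and 0804 000a are not extra characters introduced by Python. I looked the values up in the ASCII table:

ASCII descriptions of the following hex values:
0a = LF -> Newline.
00 = NULL
08 = backspace
04 = end of transmission

The 0a is the newline that print appends. The 00 is not in the output at all: hexdump groups bytes into 16-bit words and pads an odd trailing byte with zero. The 08 and 04 are not backspace and EOT characters added by Python. They are the last two bytes of the address ('\x04\x08'), which hexdump's default format shows byte-swapped as 0804 on a little-endian machine. Use hexdump -C to see the bytes in their actual order.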
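One way to check where those trailing values come from is to reproduce hexdump's default grouping in Python. The sketch below is an approximation under stated assumptions: hexdump_words is a hypothetical helper, a little-endian machine is assumed, and the Python 3 output is assumed to be UTF-8 encoded.

```python
def hexdump_words(data):
    """Format bytes as 16-bit little-endian words, like hexdump's default view on x86."""
    if len(data) % 2:
        data += b'\x00'  # an odd trailing byte is padded with zero for display
    # Each pair is printed high byte first, so the two bytes appear swapped.
    return ' '.join(f'{data[i + 1]:02x}{data[i]:02x}' for i in range(0, len(data), 2))

# Tail of the Python 2 output: the raw address plus the newline from print.
py2_tail = b'\xcb\x85\x04\x08' + b'\n'
# Tail of the Python 3 output: the str is UTF-8 encoded first (assumed locale).
py3_tail = '\xcb\x85\x04\x08'.encode('utf-8') + b'\n'

print(hexdump_words(py2_tail))  # 85cb 0804 000a
print(hexdump_words(py3_tail))  # 8bc3 85c2 0804 000a
```

Both lines match the tails of the hexdump output in the question, so the 04 and 08 come from the address itself and the 00 is only display padding.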
