Convert string with escaped characters and ASCII values into HEX

I have the following string, which contains escaped hex values and ASCII characters:

"\01B\2E\00   k\00"

A backslash means that the next two characters are hex digits; everything else in the string is ASCII.

The goal is to convert the entire string so that every character is written as an escaped hex value.

End result:

"\01\42\2E\00\20\20\20\6B\00"

::edit::

I had tried the following:

s = "\01B\2E\00   k\00"
r = [ ]

for x in s:
    r.append(x.encode("hex"))

print r

The problem was that the values (e.g. \1E) were being interpreted as \x01 followed by E, because in a non-raw string literal Python reads a backslash followed by digits as an octal escape.
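
A minimal Python 3 sketch of why this happens, assuming the string was typed as an ordinary (non-raw) literal: Python resolves the octal escapes when it parses the literal, before the loop ever sees the characters. The names `plain` and `raw` are only illustrative.

```python
# In a normal literal, "\01", "\2" and "\00" are parsed as octal escapes.
plain = "\01B\2E\00   k\00"
# A raw literal keeps every backslash as a real character.
raw = r"\01B\2E\00   k\00"

print(repr(plain))           # '\x01B\x02E\x00   k\x00'
print(len(plain), len(raw))  # 10 17
```

Writing the literal with the `r` prefix (or reading the text from a file or socket) keeps the backslashes intact so they can be processed explicitly.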

Then I ran into a related question online, came back to post an update, and realized my question had already been answered.

Thanks

Here is a Python string that contains escaped hex values and ASCII characters.

>>> s = r"\01B\2E\00   k\00"
>>> s
'\\01B\\2E\\00   k\\00'
>>> print(s)
\01B\2E\00   k\00

First we decode with the string-escape codec (Python 2) to convert the escaped hex values into their character representation. To use string-escape with hexadecimal values, we need \x as the escape indicator rather than just \ (which implies octal values, IIRC).

>>> escaped = s.replace('\\', '\\x').decode('string-escape')
>>> escaped
'\x01B.\x00   k\x00'
>>> print escaped
B.   k

Some of these characters are not printable, but every escape sequence has now been turned into the character it stands for.
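
The string-escape codec does not exist in Python 3, so the step above cannot be copied over directly. One possible Python 3 equivalent, offered as a sketch rather than the answer's own method, substitutes each backslash-plus-two-hex-digits sequence with `re.sub`. The helper name `unescape_hex` is hypothetical.

```python
import re

s = r"\01B\2E\00   k\00"

def unescape_hex(text):
    """Replace each backslash followed by two hex digits with the character it encodes."""
    return re.sub(
        r"\\([0-9A-Fa-f]{2})",
        lambda m: chr(int(m.group(1), 16)),
        text,
    )

escaped = unescape_hex(s)
print(repr(escaped))  # '\x01B.\x00   k\x00'
```

This assumes the question's rule that a backslash is always followed by exactly two hex digits; a backslash followed by anything else is left untouched.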

If you want to convert all the characters into the escaped hex representation, however, you need to convert each one explicitly:

>>> h = ''.join('\\' + char.encode('hex') for char in escaped)
>>> h
'\\01\\42\\2e\\00\\20\\20\\20\\6b\\00'
>>> print h
\01\42\2e\00\20\20\20\6b\00

Note that str.encode('hex') and the string-escape codec are Python 2 only; in Python 3, str.encode() and bytes.decode() no longer accept bytes-to-bytes codecs. You would instead use the binascii.hexlify and binascii.unhexlify functions.
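
As a rough Python 3 sketch of that last step using `binascii.hexlify`: the text is first encoded to bytes, which here assumes every character has a code point below 256 (hence `latin-1`). The variable names are illustrative only.

```python
import binascii

escaped = "\x01B.\x00   k\x00"

# Assumption: all code points are < 256, so latin-1 maps each character to one byte.
data = escaped.encode("latin-1")

# hexlify returns bytes such as b'01422e...'; decode to str for slicing.
hex_digits = binascii.hexlify(data).decode("ascii")

# Prefix every pair of hex digits with a backslash.
h = "".join("\\" + hex_digits[i:i + 2] for i in range(0, len(hex_digits), 2))
print(h)  # \01\42\2e\00\20\20\20\6b\00
```

`binascii.hexlify` emits lowercase digits; call `.upper()` on `hex_digits` if uppercase output such as `\6B` is wanted.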

You could use re.split() to tokenize the string in Python:

>>> import re
>>> data = r"\01B\2E\00   k\00"
>>> L = re.split(r'((?:\\{hex}{hex})+)'.format(hex='[0-9a-fA-F]'), data)
>>> L
['', '\\01', 'B', '\\2E\\00', '   k', '\\00', '']
>>> L[::2] = [''.join('\\' + c.encode('hex') for c in s) for s in L[::2]]
>>> print ''.join(L)
\01\42\2E\00\20\20\20\6b\00
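
The same tokenizing idea can be sketched for Python 3, where `c.encode('hex')` is unavailable; this version formats `ord(c)` instead. Using `%02X` is a choice made here to match the uppercase digits in the question's expected output, and the names `parts` and `result` are illustrative.

```python
import re

data = r"\01B\2E\00   k\00"

# The capturing group makes re.split keep the escaped runs at the odd indices;
# the even indices hold the plain ASCII text between them.
parts = re.split(r"((?:\\[0-9a-fA-F]{2})+)", data)

# Rewrite only the plain-text chunks, one "\XX" escape per character.
parts[::2] = ["".join("\\%02X" % ord(c) for c in chunk) for chunk in parts[::2]]

result = "".join(parts)
print(result)  # \01\42\2E\00\20\20\20\6B\00
```

Escapes already present in the input are passed through unchanged, so their letter case is whatever the input used.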
