
Decode Hex String in Python 3

In Python 2, converting the hexadecimal form of a string into the corresponding unicode was straightforward:

comments.decode("hex")

where the variable 'comments' is a part of a line in a file (the rest of the line does not need to be converted, as it is represented only in ASCII).

In Python 3, however, this doesn't work (I assume because of the bytes/string vs. string/unicode switch). I feel like there should be a one-liner in Python 3 to do the same thing, rather than reading the entire line as a series of bytes (which I don't want to do) and then converting each part of the line separately. If possible, I'd like to read the entire line as a unicode string (because the rest of the line is in unicode) and convert only this one part from its hexadecimal representation.

Something like:

>>> bytes.fromhex('4a4b4c').decode('utf-8')
'JKL'

Just substitute the actual encoding you are using.
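For the situation in the question, where the line is read as text and only one field is hex-encoded, this could look roughly like the sketch below. The tab-separated line layout, the sample data, and the helper name `decode_hex_field` are assumptions made for illustration; they do not come from the question.

```python
def decode_hex_field(hex_text, encoding="utf-8"):
    """Turn a hex string such as '4a4b4c' into text using the given encoding."""
    return bytes.fromhex(hex_text).decode(encoding)

# Hypothetical line layout: two ASCII fields, then a hex-encoded comment.
line = "42\talice\t636166c3a9"
record_id, user, hex_comment = line.split("\t")

comment = decode_hex_field(hex_comment)
print(comment)  # café
```

The rest of the line stays an ordinary `str`; only the hex field makes the round trip through `bytes`.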

import codecs

decode_hex = codecs.getdecoder("hex_codec")

# for a list of hex strings (each result is a bytes object)
msgs = [decode_hex(msg)[0] for msg in msgs]

# for a single hex string
string = decode_hex(string)[0]
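Note that in Python 3 this decoder produces `bytes`, so one more `.decode()` call is needed to get a `str`. A minimal sketch, with the sample input and the UTF-8 encoding assumed:

```python
import codecs

decode_hex = codecs.getdecoder("hex_codec")

# The decoder returns a (decoded_bytes, length_consumed) tuple.
raw, consumed = decode_hex(b"4a4b4c")
print(raw)  # b'JKL'

# Assumed encoding: replace "utf-8" with whatever the data really uses.
text = raw.decode("utf-8")
print(text)  # JKL
```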

The answers from @unbeli and @Niklas are good, but @unbeli's answer does not work for all hex strings, and it is desirable to do the decoding without importing an extra module (codecs). The following should work (but will not be very efficient for large strings):

>>> result = bytes.fromhex((lambda s: ("%s%s00" * (len(s)//2)) % tuple(s))('4a82fdfeff00')).decode('utf-16-le')
>>> result == '\x4a\x82\xfd\xfe\xff\x00'
True

Basically, it works around invalid UTF-8 byte sequences by padding each byte with a zero byte and decoding as UTF-16 (little-endian).
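Because the zero padding maps each byte to the code point with the same value, a more direct route that should give the same result is the `latin-1` codec, which performs exactly that mapping. A sketch using the same sample string (the variable names are illustrative):

```python
hex_text = "4a82fdfeff00"

# latin-1 maps every byte 0x00-0xff to the code point of the same value,
# so no byte sequence can fail to decode.
result = bytes.fromhex(hex_text).decode("latin-1")
print(result == "\x4a\x82\xfd\xfe\xff\x00")  # True

# The zero-padded utf-16-le variant, written out for comparison.
padded = "".join(hex_text[i:i + 2] + "00" for i in range(0, len(hex_text), 2))
print(bytes.fromhex(padded).decode("utf-16-le") == result)  # True
```

Whether this is the right choice depends on what the bytes actually represent: it preserves every byte value as a character, but it does not interpret multi-byte encodings such as UTF-8.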
