Decoding an ASCII string with Python 2.7.10

Question

I'm fairly new to Python, so I'm probably still making a lot of rookie mistakes.

I was comparing two seemingly matching strings in Python, but the comparison always returned False. When I checked the repr of each object, I found that one of the strings was encoded in ASCII.

The representation of the first string returns:

'\x00"\x00i\x00t\x00i\x00n\x00e\x00r\x00a\x00r\x00y\x00_\x00o\x00p\x00t\x00i\x00o\x00n\x00s\x00_\x00s\x00e\x00a\x00r\x00c\x00h\x00_\x00b\x00u\x00t\x00t\x00o\x00n\x00"\x00 \x00=\x00 \x00"\x00L\x00a\x00u\x00n\x00c\x00h\x00 \x00t\x00h\x00e\x00 \x00s\x00e\x00a\x00r\x00c\x00h\x00"\x00;\x00'

While the representation of the second string returns:

"itinerary_options_search_button" = "Launch the search";

I'm trying to figure out how to decode the first string to get the second string, so that the comparison of the two succeeds. When I decode the first string with

string.decode('ascii')

I get a unicode object. I'm not sure what to do to get the decoded string.

Answer 1

Your first string seems to have some issues. I'm not entirely sure why there are so many null characters (\x00), but either way, we could write a function to clean those up:

s_1 = '\x00"\x00i\x00t\x00i\x00n\x00e\x00r\x00a\x00r\x00y\x00_\x00o\x00p\x00t\x00i\x00o\x00n\x00s\x00_\x00s\x00e\x00a\x00r\x00c\x00h\x00_\x00b\x00u\x00t\x00t\x00o\x00n\x00"\x00 \x00=\x00 \x00"\x00L\x00a\x00u\x00n\x00c\x00h\x00 \x00t\x00h\x00e\x00 \x00s\x00e\x00a\x00r\x00c\x00h\x00"\x00;\x00'
s_2 = '"itinerary_options_search_button" = "Launch the search";'

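def null_cleaner(string):
    # Rebuild the string one character at a time, skipping every NUL (\x00).
    new_string = ""
    for char in string:
        if char != "\x00":
            new_string += char
    return new_string

# Both sides are cleaned, so the comparison works whichever string carries the NULs.
print(null_cleaner(s_1) == null_cleaner(s_2))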
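The same clean-up can also be done with the built-in str.replace method, which avoids the character-by-character loop. This is only a sketch: strip_nuls is a name made up for illustration, and the sample string is a shortened stand-in with the same shape as the data in the question.

```python
def strip_nuls(text):
    """Return text with every NUL character removed."""
    return text.replace("\x00", "")

# Shortened, hypothetical sample shaped like the question's first string.
s_1 = '\x00"\x00i\x00t\x00"\x00'
s_2 = '"it"'

print(strip_nuls(s_1) == s_2)
```

Like null_cleaner, this treats the symptom only: it discards the NULs without explaining where they came from.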

A slightly less robust way of doing this is to slice the string, keeping every second character and dropping the ones in between (which all happen to be \x00):

s_1 = '\x00"\x00i\x00t\x00i\x00n\x00e\x00r\x00a\x00r\x00y\x00_\x00o\x00p\x00t\x00i\x00o\x00n\x00s\x00_\x00s\x00e\x00a\x00r\x00c\x00h\x00_\x00b\x00u\x00t\x00t\x00o\x00n\x00"\x00 \x00=\x00 \x00"\x00L\x00a\x00u\x00n\x00c\x00h\x00 \x00t\x00h\x00e\x00 \x00s\x00e\x00a\x00r\x00c\x00h\x00"\x00;\x00'
s_2 = '"itinerary_options_search_button" = "Launch the search";'

print(s_1[1::2] == s_2)
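To illustrate why slicing is the less robust option, here is a sketch with a guard that refuses to slice unless every even-indexed character really is a NUL. The function name drop_interleaved_nuls and the shortened sample strings are assumptions made for this example, not part of the original answer.

```python
def drop_interleaved_nuls(text):
    """Keep the odd-indexed characters, but only if every even-indexed one is NUL."""
    padding = text[0::2]
    if padding.strip("\x00"):
        # Something other than NUL sits at an even index, so slicing would lose data.
        raise ValueError("expected a NUL at every even index")
    return text[1::2]

# Shortened, hypothetical sample shaped like the question's first string.
sample = '\x00"\x00i\x00t\x00"\x00'
print(drop_interleaved_nuls(sample) == '"it"')

# The same data without the leading NUL: plain slicing now keeps only the NULs.
shifted = sample[1:]
print(shifted[1::2] == '\x00\x00\x00\x00')
```

If the NULs ever land on the odd positions instead, the unguarded s_1[1::2] silently returns the padding rather than the text, while the guarded version raises an error.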

Answer 2

"... encoded in ASCII."
[lots of NULs]

Nope. That pattern of NULs is what UTF-16 looks like, not ASCII:

>>> '\x00"\x00i\x00t\x00i\x00n\x00e\x00r\x00a\x00r\x00y'.decode('utf-16be')
u'"itinerary'

Of course, your data has an extra NUL at the end that will break the decoding. Once you clean that up, you should be able to decode it with no problem.
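As a rough sketch of that clean-up, assuming the only defect is a single stray NUL byte at the end: drop it when the length is odd, then decode as big-endian UTF-16. The helper name decode_utf16_with_stray_nul and the shortened sample are made up for illustration. Byte literals (b'...') are used so the snippet behaves the same on Python 2.7 and Python 3.

```python
def decode_utf16_with_stray_nul(data):
    """Decode big-endian UTF-16 bytes, dropping one trailing NUL if the length is odd."""
    if len(data) % 2 == 1 and data.endswith(b'\x00'):
        data = data[:-1]
    return data.decode('utf-16-be')

# Shortened, hypothetical sample shaped like the question's first string.
raw = b'\x00"\x00i\x00t\x00i\x00n\x00e\x00r\x00a\x00r\x00y\x00"\x00;\x00'

text = decode_utf16_with_stray_nul(raw)
print(text == u'"itinerary";')

# If a byte string is needed rather than a unicode object, encode the result again.
print(text.encode('ascii') == b'"itinerary";')
```

Data shaped like this often comes from reading a UTF-16 file as raw bytes and splitting it on newlines. If that is the case here (an assumption, since the question does not say how the string was obtained), opening the file with io.open(path, encoding='utf-16') would avoid the problem at the source.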


Answer 1
0, accepted, 2018-07-10 17:49:52

Answer 2
0, 2018-07-10 17:54:44
