
Decoding URL-encoded byte stream data in Python

I'm receiving STX ETX packet data. Here's a sample: [image: POST request received]

The data has been URL encoded. Before it is encoded and sent, it looks like this: [image: data being sent to me]

The relationship between the URL-encoded data and the byte data before it is encoded and sent is this:

0x41 -> A 
0xd9 -> %D9 
0x33 -> 3 
0x48 -> H 
0x58 -> X 
0x01 -> %01 
0x00 -> %00

After some research I have found that this is Unicode code points being converted into hexadecimal numbers and Unicode character names, with the exception of the first byte, which is an ASCII character.

After the first character, A, the following four bytes make up a 4-byte integer, which is a UTC timestamp.

Question

How do I convert the URL back into hexadecimal and Unicode code points using Python? I've looked at the unicodedata module but can't seem to find a conversion from Unicode character names to Unicode code points.

Any help or suggestions would be much appreciated.

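You can use urllib.parse.parse_qsl (the urlparse module in Python 2) to decode that string. Passing encoding="latin-1" maps each percent-escaped byte to the character with the same code point, so bytes such as 0xd9 survive instead of being replaced as invalid UTF-8.

from urllib.parse import parse_qsl  # Python 2: import urlparse

data = "/type=stxetx&packet=A%d93HX%01%00&serial=1234&foo=bar"

# latin-1 decodes %XX to the character whose code point is XX
new_data = dict(parse_qsl(data, encoding="latin-1"))

assert len(new_data['packet']) == 7
assert new_data['packet'][0] == 'A'
assert ord(new_data['packet'][1]) == 0xd9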
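
If the packet is needed as raw bytes rather than as a string, a sketch along these lines might help. It percent-decodes the packet field with urllib.parse.unquote_to_bytes and then reads the four bytes after the leading A as the timestamp. The byte order is an assumption: the question does not state it, and little-endian is chosen here only because it yields a plausible date for the sample. The names query, fields, marker and when are illustrative.

```python
import struct
from datetime import datetime, timezone
from urllib.parse import unquote_to_bytes

# Sample query string from the question (leading "/" dropped for clarity).
query = "type=stxetx&packet=A%d93HX%01%00&serial=1234&foo=bar"

# Split on '&' and '=' by hand so the packet value is never decoded as text.
fields = dict(pair.split("=", 1) for pair in query.split("&"))

# Percent-decode straight to bytes: %d9 -> 0xd9, %01 -> 0x01, %00 -> 0x00.
packet = unquote_to_bytes(fields["packet"])

# '<' = little-endian (ASSUMED, not stated in the question),
# 'c' = the 1-byte marker, 'I' = unsigned 32-bit integer.
marker, timestamp = struct.unpack_from("<cI", packet, 0)

# Interpret the integer as seconds since the Unix epoch, in UTC.
when = datetime.fromtimestamp(timestamp, tz=timezone.utc)

print(packet.hex(" "))
print(marker, timestamp, when.isoformat())
```

If the sender turns out to use big-endian order, swapping "<" for ">" in the format string is the only change needed. Note that unquote_to_bytes leaves a literal + untouched, so form-encoded data that uses + for a space would need that replaced first.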

