Send HTTP POST request with Unicode-escaped data using Python requests
I want to send a POST request to a REST API, but with all my characters Unicode-escaped. For example, I want to send the string test as \u0074\u0065\u0073\u0074.
Whatever I try, the string ends up as \\u0074\\u0065\\u0073\\u0074.
I can easily modify the request in, for example, Burp and remove the double backslashes to make it work.
So the raw bytes sent to the web server are \x5c\x5c\x75\x30\x30\x37\x34\x5c\x5c\x75\x30\x30\x36\x35\x5c\x5c\x75\x30\x30\x37\x33\x5c\x5c\x75\x30\x30\x37\x34
while what I want is \x5c\x75\x30\x30\x37\x34\x5c\x75\x30\x30\x36\x35\x5c\x75\x30\x30\x37\x33\x5c\x75\x30\x30\x37\x34
One of the things I've tried is this:
import requests
s = 'test'
data = ''
for c in s:
    data += "\\u00"+hex(ord(c))[2:].lower()
print(data)
json = {"user":data}
res = requests.post('http://127.0.0.1/api/getusers', json=json)
print(res.text)
Even if I set data = '\x5c\x75\x30\x30\x37\x34\x5c\x75\x30\x30\x36\x35\x5c\x75\x30\x30\x37\x33\x5c\x75\x30\x30\x37\x34', it still sends double backslashes (\x5c\x5c).
It works fine for me. Tested with https://httpbin.davecheney.com/post, Python 3.7 and Requests 2.23.0:
import requests, json
url = r"https://httpbin.davecheney.com/post"
data_raw_str = r"\u0074\u0065\u0073\u0074"
s = 'test'
data = ''
for c in s:
    data += '\\u00' + hex(ord(c))[2:].lower()
    #data += fr"\u{ord(c):04x}" # this works, too
json_dict = {'user': data}
r = requests.post(url, json=json_dict)
print(r)
data_returned = json.loads(r.json()['data'])['user']
print(data_raw_str)
print(data)
print(data_returned)
print(data_raw_str == data == data_returned)
print(requests.__version__)
Output:
<Response [200]>
\u0074\u0065\u0073\u0074
\u0074\u0065\u0073\u0074
\u0074\u0065\u0073\u0074
True
2.23.0
Edit:
According to RFC 8259 - The JavaScript Object Notation (JSON) Data Interchange Format - 7. Strings:
All Unicode characters may be placed within the quotation marks, except for the characters that MUST be escaped: quotation mark, reverse solidus, and the control characters (U+0000 through U+001F).
So a literal backslash in a string value will always be escaped with another backslash in JSON.
I believe manually removing the extra backslashes will cause the server's JSON decoder to unescape the Unicode escape sequences, so your string will become plain old test.
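The escaping described above can be reproduced with only the standard library json module. This is a minimal sketch that assumes the encoder behind the json= argument of requests behaves like json.dumps with default settings; it says nothing about what a particular server does with the decoded value.

```python
import json

# The literal text \u0074\u0065\u0073\u0074 (the backslashes are real characters).
escaped = r"\u0074\u0065\u0073\u0074"

# Serializing escapes every backslash, so the body contains \\u0074...
body = json.dumps({"user": escaped})
print(body)

# A compliant decoder reverses the escaping and returns the original text.
assert json.loads(body)["user"] == escaped

# With the extra backslashes removed, \u0074 is read as an escape for "t".
single = body.replace("\\\\", "\\")
print(single)
print(json.loads(single)["user"])
```

In other words, the doubled backslashes are what lets the server receive the six-character sequences intact; the single-backslash body is just another spelling of test.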
Why does the request have to be JSON?
If you make this request, no additional backslashes are added:
requests.post(url, data=data) # data is a str
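If the endpoint still expects a JSON document, one possible approach is to build the body text by hand and pass it through data=, since a str body is sent unchanged. The sketch below is only an illustration: build_escaped_body is a hypothetical helper, the "user" key is taken from the question, and it assumes every character is in the Basic Multilingual Plane (code points above U+FFFF would need surrogate pairs).

```python
import json


def build_escaped_body(user: str) -> str:
    # Hypothetical helper: write each character as a \uXXXX escape.
    # Assumes code points <= U+FFFF; anything higher needs surrogate pairs.
    escaped = "".join(f"\\u{ord(c):04x}" for c in user)
    return '{"user": "' + escaped + '"}'


body = build_escaped_body("test")
print(body)

# The hand-built body is still valid JSON and decodes to the plain string.
assert json.loads(body) == {"user": "test"}

# Sending it unchanged (not executed here; url is whatever endpoint you target):
# requests.post(url, data=body, headers={"Content-Type": "application/json"})
```

The Content-Type header is set explicitly because requests does not add one for a plain str body.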
And if you make this request, the keys and values are UTF-8 encoded and then URL-encoded (the single backslash is replaced with %5C):
requests.post(url, data=json_dict)
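The form encoding can be previewed without sending anything. This sketch uses urllib.parse.urlencode from the standard library as a stand-in, on the assumption that requests form-encodes a dict passed to data= in an equivalent way.

```python
from urllib.parse import parse_qs, urlencode

json_dict = {"user": r"\u0074\u0065\u0073\u0074"}

# Each backslash is percent-encoded as %5C; no second backslash is added.
form_body = urlencode(json_dict)
print(form_body)  # user=%5Cu0074%5Cu0065%5Cu0073%5Cu0074

# Decoding the form body gives back the single-backslash text.
assert parse_qs(form_body) == {"user": [r"\u0074\u0065\u0073\u0074"]}
```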