
How do I decode raw data from Flask's request.get_data() into Chinese characters?

I'm using Flask to build a web server that handles Chinese-language requests sent via POST. Originally I planned to read the content with request.form['body'], but the client encodes its data in BIG5, while the values returned by request.form are always decoded as UTF-8. So I have to use request.get_data() to retrieve the raw request data and decode it myself.

The odd thing is that with enctype = multipart/form-data everything works: request.get_data().decode('big5') gives the correct characters. But when I don't specify an enctype, so that the default application/x-www-form-urlencoded is used, the returned value looks like this:

Result 1.

%B6W%C3%D9%A4u%B5%7B%A6%B3%AD%AD%A4%BD%A5q

which is not plain BIG5-encoded text. The original text should look like this:

Result 2.

超贊工程有限公司

The BIG5-encoded bytes should look like this:

Result 3.

\xb6W\xc3\xd9\xa4u\xb5{\xa6\xb3\xad\xad\xa4\xbd\xa5q

My question is: how can I properly decode the form data from Result 1 to Result 2 when using application/x-www-form-urlencoded?

Code and result when the content type is application/x-www-form-urlencoded: (image not included)

Code and result when the content type is multipart/form-data: (image not included)

You're getting a URL-encoded (percent-encoded) string. Use urllib.parse to decode it:

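import urllib.parse  # a bare "import urllib" does not reliably load the parse submodule

data = '%B6W%C3%D9%A4u%B5%7B%A6%B3%AD%AD%A4%BD%A5q'
# Turn each %XX escape back into a byte, then decode the bytes as Big5
print(urllib.parse.unquote(data, encoding='big5'))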

This prints 超贊工程有限公司, which matches your expected output.
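The same idea can be applied to the whole request body rather than a single value. The following is a minimal sketch, assuming the body is a standard application/x-www-form-urlencoded payload whose percent-escapes represent Big5 bytes. The helper name decode_form_body is hypothetical, and the field name body is taken from the question. It relies on urllib.parse.parse_qs, which accepts an encoding argument.

```python
from urllib.parse import parse_qs


def decode_form_body(raw_body: bytes, encoding: str = "big5") -> dict:
    """Parse a urlencoded request body whose escapes encode `encoding` bytes.

    Hypothetical helper for illustration; not part of Flask or Werkzeug.
    """
    # A percent-encoded body contains only ASCII characters, so this step
    # is safe regardless of the client's character set.
    text = raw_body.decode("ascii")
    # parse_qs splits the fields and decodes each %XX run using `encoding`.
    # errors="strict" raises instead of silently substituting characters.
    return parse_qs(text, keep_blank_values=True,
                    encoding=encoding, errors="strict")


# Sample body built from Result 1 in the question.
sample = b"body=%B6W%C3%D9%A4u%B5%7B%A6%B3%AD%AD%A4%BD%A5q"
fields = decode_form_body(sample)
print(fields["body"][0])
```

In a Flask view, the bytes returned by request.get_data() could be passed to this helper in place of the sample. Each field maps to a list because a form may repeat the same name.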


 