简体   繁体   中英

Is there a function in python to convert html entities to percent encoding?

I am retrieving Japanese and Chinese text from a website in the form of JSON using urllib2 and converting them to HTML entities using encode(xmlcharrefreplace).

I then use curl to post the same content(after making minor changes) back on the website using percent encoding. My code works fine for English text with special characters, but I need to convert all Japanese/Chinese characters from html encoding to percent encoding.

Is there a function in Python which could do this magic?

PS: For English text, I have my own function to convert special chars to percent encoding. I cannot use this method for the Japanese/Chinese characters as there are too many of them.

You want to combine two things:

  1. HTML decoding
  2. URL encoding

Here is an example (Python3):

>>> import html
>>> html.unescape('{')
'{'
>>> import urllib.parse
>>> urllib.parse.quote('{')
'%7B'

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM