JavaScript unescape() vs. Python urllib.unquote()

Question

From reading various posts, it seems that JavaScript's unescape() is equivalent to Python's urllib.unquote(). However, when I test both I get different results:

In the browser console:

unescape('%u003c%u0062%u0072%u003e');

Output: <br>

In the Python interpreter:

import urllib
urllib.unquote('%u003c%u0062%u0072%u003e')

Output: %u003c%u0062%u0072%u003e

I would expect Python to also return <br>. Any ideas as to what I'm missing here?

Thanks!

Answer 1

%uxxxx is a non-standard URL encoding scheme that is not supported by urllib.parse.unquote() (Python 3) or urllib.unquote() (Python 2).

It was only ever part of ECMAScript (ECMA-262, 3rd edition); the format was rejected by the W3C and was never part of an RFC.
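To make the difference concrete, here is a small Python 3 check of what urllib.parse.unquote() does and does not decode. The variable names are only illustrative.

```python
from urllib.parse import unquote

# Standard percent-encoding (%XX, interpreted as UTF-8 bytes) is decoded.
standard = unquote('%3Cbr%3E')

# The non-standard %uXXXX form is not recognised and is passed through as-is.
nonstandard = unquote('%u003c%u0062%u0072%u003e')

print(standard)
print(nonstandard)
```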

You could use a regular expression to convert such code points:

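import re

try:
    unichr  # only defined in Python 2
except NameError:
    unichr = chr  # Python 3: chr() already returns a Unicode character

# Replace each %uxxxx (or %uxx) escape with the character for that hex code point.
re.sub(r'%u([a-fA-F0-9]{4}|[a-fA-F0-9]{2})', lambda m: unichr(int(m.group(1), 16)), quoted)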

The pattern matches both the four-digit %uxxxx form and a shorter two-digit %uxx form.

Demo:

>>> import re
>>> quoted = '%u003c%u0062%u0072%u003e'
>>> re.sub(r'%u([a-fA-F0-9]{4}|[a-fA-F0-9]{2})', lambda m: chr(int(m.group(1), 16)), quoted)
'<br>'
>>> altquoted = '%u3c%u0062%u0072%u3e'
>>> re.sub(r'%u([a-fA-F0-9]{4}|[a-fA-F0-9]{2})', lambda m: chr(int(m.group(1), 16)), altquoted)
'<br>'
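If the input may mix %uXXXX escapes with ordinary %XX escapes, a single-pass helper along the following lines might be closer to what unescape() does. This is a sketch, not a definitive port: the name js_unescape is hypothetical, it assumes Python 3, it treats %XX as a single code point (Latin-1 style) rather than as UTF-8 bytes, and it does not recombine UTF-16 surrogate pairs.

```python
import re

# One alternation, so each escape is decoded exactly once (no double decoding).
_ESCAPE = re.compile(r'%(?:u([0-9a-fA-F]{4})|([0-9a-fA-F]{2}))')


def js_unescape(text):
    """Decode %uXXXX and %XX escapes; leave anything malformed untouched."""
    def replace(match):
        # Exactly one of the two groups is set for any match.
        hex_digits = match.group(1) or match.group(2)
        return chr(int(hex_digits, 16))

    return _ESCAPE.sub(replace, text)


print(js_unescape('%u003c%u0062%u0072%u003e'))
print(js_unescape('%3Cbr%3E'))
```

Doing both forms in one pass matters for inputs such as '%u002541': the decoded '%' is not re-examined, so the result stays '%41' instead of becoming 'A'.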

However, you should avoid using this encoding altogether if possible.
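As an illustration of the standard alternative, the sketch below round-trips a string through UTF-8 percent-encoding with urllib.parse (Python 3). It assumes the producing side can be switched to standard percent-encoding, for example encodeURIComponent() in JavaScript; the sample string is made up.

```python
from urllib.parse import quote, unquote

original = '<br> café'

# safe='' also escapes '/', so every reserved character is percent-encoded.
encoded = quote(original, safe='')

# unquote() decodes %XX sequences as UTF-8 by default.
decoded = unquote(encoded)

print(encoded)
print(decoded)
```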

Answer 1 (9 votes, accepted, 2014-04-18 17:15:19)
