urllib.unquote_plus(s) does not convert plus signs to spaces

Question

According to the documentation, urllib.unquote_plus should replace plus signs with spaces. But when I tried the code below in IDLE for Python 2.7, it did not.

>>> import urllib
>>> s = 'http://stackoverflow.com/questions/?q1=xx%2Bxx%2Bxx'
>>> urllib.unquote_plus(s)
'http://stackoverflow.com/questions/?q1=xx+xx+xx'

I also tried things like urllib.unquote_plus(s).decode('utf-8'). Is there a proper way to decode URL components?

Answer 1

%2B is the escape code for a literal +; it is being unescaped entirely correctly.

Don't confuse this with an unescaped + in a URL, which is the escape character for a space:

>>> s = 'http://stackoverflow.com/questions/?q1=xx+xx+xx'
>>> urllib.unquote_plus(s)
'http://stackoverflow.com/questions/?q1=xx xx xx'

unquote_plus() turns only an unescaped + into a space ('+' -> ' '). An escaped plus sign, %2B, is decoded to a literal + ('%2B' -> '+') and is not turned into a space.
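As a minimal sketch of the two cases side by side: the import fallback is an assumption added here so the snippet also runs on Python 3, where these functions live in urllib.parse, and the variable names are purely illustrative.

```python
try:
    from urllib.parse import unquote_plus  # Python 3 location
except ImportError:
    from urllib import unquote_plus  # Python 2.7 location

# An unescaped '+' stands for a space in form-encoded data.
plain_plus = unquote_plus('q1=xx+xx+xx')

# '%2B' is an escaped literal plus sign, so it stays a plus after decoding.
escaped_plus = unquote_plus('q1=xx%2Bxx%2Bxx')

# '%20' is the percent-escape for a space and decodes to a space as well.
escaped_space = unquote_plus('q1=xx%20xx%20xx')

print(plain_plus)     # q1=xx xx xx
print(escaped_plus)   # q1=xx+xx+xx
print(escaped_space)  # q1=xx xx xx
```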

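If the input you need to decode uses %2B where you expected + (meaning a space), then those values were probably quoted twice, and you'd need to unquote them twice. In that case you would see the % escapes themselves encoded as well: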

>>> urllib.quote_plus('Hello world!')
'Hello+world%21'
>>> urllib.quote_plus(urllib.quote_plus('Hello world!'))
'Hello%2Bworld%2521'

where %25 is the quoted % character.
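If the input really was quoted twice, unquoting it twice should recover the original string. A small sketch under that assumption (the import fallback for Python 3 and the variable names are additions for illustration, not part of the original answer):

```python
try:
    from urllib.parse import quote_plus, unquote_plus  # Python 3 location
except ImportError:
    from urllib import quote_plus, unquote_plus  # Python 2.7 location

# Simulate a value that was quoted twice somewhere upstream.
double_quoted = quote_plus(quote_plus('Hello world!'))

# The first pass undoes the outer layer: %2B -> '+', %25 -> '%'.
once = unquote_plus(double_quoted)

# The second pass undoes the inner layer: '+' -> ' ', %21 -> '!'.
twice = unquote_plus(once)

print(double_quoted)  # Hello%2Bworld%2521
print(once)           # Hello+world%21
print(twice)          # Hello world!
```

Only unquote twice when you know the data was double-quoted. Applied to a value that was quoted once, the second pass would turn a genuine literal + into a space.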

Answer 2

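Those aren't spaces, those are actual plus signs. A space is %20, which in that part of the URL is indeed equivalent to +, but %2B means a literal plus.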
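To illustrate the difference, here is a hedged sketch comparing unquote() and unquote_plus() on the same made-up sample string. The import fallback for Python 3 is an assumption added so the snippet runs on either version.

```python
try:
    from urllib.parse import unquote, unquote_plus  # Python 3 location
except ImportError:
    from urllib import unquote, unquote_plus  # Python 2.7 location

# Illustrative sample containing a bare '+', a '%20' and a '%2B'.
sample = 'xx+xx%20xx%2Bxx'

# unquote() only resolves percent-escapes; a bare '+' is left alone.
generic = unquote(sample)

# unquote_plus() additionally treats a bare '+' as a space,
# which is the convention for form-encoded query strings.
form_style = unquote_plus(sample)

print(generic)     # xx+xx xx+xx
print(form_style)  # xx xx xx+xx
```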