How to convert a URL with Unicode escape sequences to Python unicode?

Question

What is the right way to handle a URL that contains Unicode characters and was escaped on the client side with JavaScript's escape(text)? For example, if my URL is:

domain.com/?text=%u05D0%u05D9%u05DA%20%u05DE%u05DE%u05D9%u05E8%u05D9%u05DD%20%u05D0%u05EA%20%u05D4%u05D8%u05E7%u05E1%u05D8%20%u05D4%u05D6%u05D4

I tried:

text = urllib.unquote(request.GET.get('text'))

but I got the exact same string back (%u05D0%u05D9%u05DA%20%u05DE ... )

Answer 1

Eventually I changed the client side from escape(text) to encodeURIComponent(text), and then on the Python side used:

request.encoding = 'UTF-8'
text = unicode(request.GET.get('text', None))

I'm not sure this is the best approach, but it works for English and Hebrew.
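For reference, here is a minimal sketch of why switching to encodeURIComponent helps. It assumes Python 3 and the standard urllib.parse module, whereas the answer above targets Python 2 and a framework request object. encodeURIComponent percent-encodes the UTF-8 bytes of the text, which is the form unquote already understands. The sample text and the query string below are illustrative values, not taken from a real request.

```python
from urllib.parse import parse_qs, quote, unquote

# Hebrew sample text, written with escapes so the source stays ASCII.
original = "\u05d0\u05d9\u05da \u05de\u05de\u05d9\u05e8\u05d9\u05dd"

# For this text, quote(..., safe="") produces the same output that
# encodeURIComponent would: the UTF-8 bytes, each written as %XX.
encoded = quote(original, safe="")

# unquote decodes %XX sequences as UTF-8 by default.
decoded = unquote(encoded)

# A hypothetical query string, parsed roughly the way a web framework would.
query = "text=" + encoded
params = parse_qs(query)
```

Because the client and the server now agree on UTF-8 percent-encoding, no custom handling of %uXXXX sequences is needed.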

Answer 2

Your %uxxxx escapes are not in Python's standard form, which is \uxxxx, so you need a small transform first: replace '%u' with '\u', and decode the remaining %XX escapes (such as %20) with urllib.unquote. Replacing every '%' with a backslash would turn %20 into \20, which unicode-escape reads as an octal escape for the control character U+0010 rather than as a space. For example, in a Python 2 shell:

>>> import urllib
>>> text = '%u05D0%u05D9%u05DA%20%u05DE%u05DE%u05D9%u05E8%u05D9%u05DD%20%u05D0%u05EA%20%u05D4%u05D8%u05E7%u05E1%u05D8%20%u05D4%u05D6%u05D4'
>>> # Convert only the %uXXXX escapes; urllib.unquote handles %20 below.
>>> text = text.replace('%u', '\\u')
>>> text_u = urllib.unquote(text).decode('unicode-escape')
>>> print text_u
איך ממירים את הטקסט הזה

Once the text is a unicode object, you can encode it to whatever encoding you like, for example:

>>> text_utf8 = text_u.encode('utf8')
>>> text_utf8
'\xd7\x90\xd7\x99\xd7\x9a \xd7\x9e\xd7\x9e\xd7\x99\xd7\xa8\xd7\x99\xd7\x9d \xd7\x90\xd7\xaa \xd7\x94\xd7\x98\xd7\xa7\xd7\xa1\xd7\x98 \xd7\x94\xd7\x96\xd7\x94'
>>> print text_utf8
איך ממירים את הטקסט הזה
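If the client cannot be changed and you are on Python 3, the same idea can be sketched without the unicode-escape codec. This is a hedged sketch rather than a definitive implementation: the helper name js_unescape is hypothetical, and it assumes the input was produced by JavaScript's escape(), which writes code units below 256 as %XX and all others as %uXXXX.

```python
import re

# Matches %uXXXX (four hex digits) or %XX (two hex digits).
_ESCAPE_RE = re.compile(r"%u([0-9A-Fa-f]{4})|%([0-9A-Fa-f]{2})")


def js_unescape(text):
    """Decode a string produced by JavaScript's escape() (hypothetical helper)."""

    def replace(match):
        # Exactly one of the two groups is set, depending on which form matched.
        hex_digits = match.group(1) or match.group(2)
        return chr(int(hex_digits, 16))

    decoded = _ESCAPE_RE.sub(replace, text)
    # escape() emits UTF-16 code units, so characters outside the BMP arrive
    # as two %uXXXX surrogates; a UTF-16 round trip joins them into one character.
    return decoded.encode("utf-16-le", "surrogatepass").decode("utf-16-le")


sample = "%u05D0%u05D9%u05DA%20%u05DE%u05DE%u05D9%u05E8%u05D9%u05DD"
result = js_unescape(sample)
```

Unlike a blanket replacement of '%' with a backslash, this only touches well-formed %uXXXX and %XX sequences, so %20 becomes a space and any literal backslashes in the input are left alone.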

Answer 1
3 2010-12-22 20:13:30

Answer 2
0 2015-07-30 15:26:25
