
Convert escaped unicode sequence to human readable format

I've been using this Python code:

pattern = u'丨フ丨ノ一丨ノ丶フノ一ノ丨フ一一ノフフ丶'
result = [u'<span id="z_i_t2_bis" title="\u7ad6\u6298\u7ad6\u6487\u6a2a\u7ad6\u6487\u637a\u6298\u6487\u6a2a\u6487\u7ad6\u6298\u6a2a\u6a2a\u6487\u6298\u6298\u637a">\u4e28\u30d5\u4e28\u30ce\u4e00\u4e28\u30ce\u4e36\u30d5\u30ce\u4e00\u30ce\u4e28\u30d5\u4e00\u4e00\u30ce\u30d5\u30d5\u4e36</span>']

if pattern in result[0]:
    print('found')

But this is cumbersome and, moreover, doesn't really do what I want, which is to get the escaped gobbledygook back into something comprehensible, as in that pattern. Is there some simple Unix tool or command to perform this task quickly and efficiently?

It seems that the following would work, but I tried it and it did not, i.e.:

result = "\u4e28\u30d5\u4e28\u30ce\u4e00\u4e28\u30ce\u4e36\u30d5\u30ce\u4e00\u30ce\u4e28\u30d5\u4e00\u4e00\u30ce\u30d5\u30d5\u4e36"

result.decode('utf-8')

which generated the error: AttributeError: 'str' object has no attribute 'decode'

If you simply print(result) then you'll get the "gobbledygook", because that is the form Python uses when it gives you an unambiguous representation of a string as an element of a list or tuple. But if you print the string directly, print(result[0]), it will try to print the natural characters as they were intended.
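As a minimal sketch of that difference, assuming Python 3 and a terminal that can display these characters: the built-in ascii() stands in here for the escaped, unambiguous form, while str() gives the text that print(result[0]) writes. The variable names are only illustrative.

```python
# The \u escapes in the source are only notation; the string holds real characters.
result = ["\u4e28\u30d5\u4e28"]

as_text = str(result[0])       # the characters themselves, as print(result[0]) shows them
as_escaped = ascii(result[0])  # quoted, ASCII-only form with \u escapes

print(as_text)
print(as_escaped)

# A pattern typed with the real characters matches the escaped literal directly.
found = "\u4e28\u30d5" in result[0]
print(found)
```

Because both spellings denote the same string, the membership test from the question needs no conversion step.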

If you want to convert the characters to UTF-8 yourself, use encode rather than decode. encode converts a Unicode string to bytes; decode produces a Unicode string from bytes.
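The two directions can be sketched as follows, assuming Python 3. The last part goes beyond the case above and is an assumption: it covers input that really contains literal backslash-u text (for example, read from a file), for which the standard unicode_escape codec is one possible approach when the input is ASCII-only.

```python
text = "\u4e28\u30d5"          # str with two characters

data = text.encode("utf-8")    # str -> bytes
back = data.decode("utf-8")    # bytes -> str

# Assumption: the input holds literal backslashes rather than real characters.
raw = r"\u4e28\u30d5"          # 12 ASCII characters: backslash, u, 4, e, 2, 8, ...
readable = raw.encode("ascii").decode("unicode_escape")

print(data)
print(back == text, readable == text)
```

Calling decode on a str fails in Python 3 because decode exists only on bytes, which is the AttributeError shown in the question.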
