How to convert UTF-8 characters to "normal" characters in a string in Python 3.10?
I have raw data that looks like this:
25023,Zwerg+M%C3%BCtze,0,1,986,3780
25871,red+earth,0,1,38,8349
25931,K4m%21k4z3,90,1,1539,2530
It is saved as a .txt file: https://de205.die-staemme.de/map/player.txt
The "characters" starting with % are Unicode, as far as I can tell.
I found the following table about it: https://www.i18nqa.com/debug/utf8-debug.html
Here is my code so far:
import urllib.request
urllib.request.urlretrieve(url, pfad + "player.txt")
f = open(pfad + "player.txt", "r", encoding="utf-8")
raw = f.read()
raw = raw.split("\n")
f.close()
Python does not convert the %-characters. They are read as if they were separate characters.
Is there a way to convert these characters without calling .replace something like 200 times?
Thank you very much in advance for help and/or useful hints!
The % sequences are URL encoding (percent-encoding), not raw Unicode: %C3%BC is the two UTF-8 bytes of ü written in hex. Use urllib.parse.unquote to decode the string.
>>> raw = """25023,Zwerg+M%C3%BCtze,0,1,986,3780
... 25871,red+earth,0,1,38,8349
... 25931,K4m%21k4z3,90,1,1539,2530"""
>>> import urllib.parse
>>> print(urllib.parse.unquote(raw))
25023,Zwerg+Mütze,0,1,986,3780
25871,red+earth,0,1,38,8349
25931,K4m!k4z3,90,1,1539,2530
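Note that unquote leaves the + signs untouched. If the + in this file stands for a space (an assumption, based on names like red+earth looking form-encoded), urllib.parse.unquote_plus handles both at once. The sketch below also splits each line on commas before decoding, so that a name containing an encoded comma (%2C) would not break the columns. The helper name parse_player_line is made up for illustration.

```python
from urllib.parse import unquote_plus


def parse_player_line(line):
    """Split one comma-separated record, then URL-decode each field."""
    # Split first: a decoded field could itself contain a comma (%2C).
    return [unquote_plus(field) for field in line.split(",")]


raw = """25023,Zwerg+M%C3%BCtze,0,1,986,3780
25871,red+earth,0,1,38,8349
25931,K4m%21k4z3,90,1,1539,2530"""

# Skip empty lines, e.g. a trailing newline at the end of the file.
players = [parse_player_line(line) for line in raw.splitlines() if line]
for player in players:
    print(player)
```

If the + signs should stay as they are, swapping unquote_plus for unquote in the helper keeps the rest unchanged.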