UnicodeDecodeError: 'utf-8' codec can't decode byte 0x96 in position 35: invalid start byte
Why am I getting SyntaxError: (unicode error) 'utf-8' codec can't decode byte 0x96 in position 0: invalid start byte
I got some JSON data from an API. I ran json.loads on it and printed the result to the REPL, as shown below.
{'warnings': {'query': {'*': "Formatting of continuation data will be changing soon. To continue using the current formatting, use the 'rawcontinue' parameter. To begin using the new format, pass an empty string for 'continue' in the initial query."}}, 'query-continue': {'links': {'plcontinue': '25618423|10|R_from_other_capitalisation', 'gplcontinue': "15095968|0|1991_US_Open_-_Women's_Doubles"}}, 'query': {'pages': {'32203010': {'pageid': 32203010, 'title': "1988 Australian Open - Women's Doubles", 'ns': 0}, '25618558': {'pageid': 25618558, 'title': "1984 Wimbledon Championships - Women's Singles", 'ns': 0}, '29486043': {'pageid': 29486043, 'title': "1984 Wimbledon Championships - Women's Doubles", 'ns': 0}, '25618819': {'pageid': 25618819, 'title': "1986 US Open - Women's Singles", 'ns': 0}, '25619314': {'pageid': 25619314, 'title': "1989 US Open - Women's Singles", 'ns': 0}, '25618668': {'pageid': 25618668, 'title': "1985 US Open - Women's Singles", 'ns': 0}, '25618857': {'pageid': 25618857, 'title': "1987 Australian Open - Women's Singles", 'ns': 0}, '25618423': {'links': [{'title': "1983 Wimbledon Championships – Women's Singles", 'ns': 0}, {'title': 'Wikipedia:Mainspace', 'ns': 4}, {'title': 'Template:R from long name', 'ns': 10}], 'pageid': 25618423, 'title': "1983 Wimbledon Championships - Women's Singles", 'ns': 0}, '23826062': {'links': [{'title': "1984 French Open – Women's Singles", 'ns': 0}, {'title': 'Wikipedia:Mainspace', 'ns': 4}, {'title': 'Template:R from long name', 'ns': 10}, {'title': 'Template:R from other capitalisation', 'ns': 10}, {'title': 'Template:R from plural', 'ns': 10}, {'title': 'Template:R from short name', 'ns': 10}, {'title': 'Category:Redirects from modifications', 'ns': 14}], 'pageid': 23826062, 'title': "1984 French Open - Women's Singles", 'ns': 0}, '25619177': {'pageid': 25619177, 'title': "1989 Australian Open - Women's Singles", 'ns': 0}}}}
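I then copied that data from the REPL into a .py module and assigned it to a variable so that I could run some unit tests. But I keep getting this error: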
SyntaxError: (unicode error) 'utf-8' codec can't decode byte 0x96 in position 0: invalid start byte
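What is going on?
Update: exactly how I get the error. Using Visual Studio, I ran a script that fetches the data with Requests and reads the content through .text. I then applied json.loads and printed the result to the Visual Studio Python 3.4 Interactive window (aka the REPL). Finally, I used the mouse to copy from this REPL and paste into a .py file in Visual Studio.
Update 2: When I fetch the data, I use Requests and then the text attribute. When I print that without json.loads, it is fine. But if I copy this "more raw" data from the REPL and paste it, it is no longer a string but an object, and json.loads will not work on it. Is the Python 3 print function printing an object even though it should be JSON?
Here is the raw output from the API as returned by Requests' .text, before json.loads: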
{"warnings":{"query":{"*":"Formatting of continuation data will be changing soon. To continue using the current formatting, use the 'rawcontinue' parameter. To begin using the new format, pass an empty string for 'continue' in the initial query."}},"query-continue":{"links":{"plcontinue":"25618423|10|R_from_other_capitalisation","gplcontinue":"15095968|0|1991_US_Open_-_Women's_Doubles"}},"query":{"pages":{"25618423":{"pageid":25618423,"ns":0,"title":"1983 Wimbledon Championships - Women's Singles","links":[{"ns":0,"title":"1983 Wimbledon Championships \u2013 Women's Singles"},{"ns":4,"title":"Wikipedia:Mainspace"},{"ns":10,"title":"Template:R from long name"}]},"23826062":{"pageid":23826062,"ns":0,"title":"1984 French Open - Women's Singles","links":[{"ns":0,"title":"1984 French Open \u2013 Women's Singles"},{"ns":4,"title":"Wikipedia:Mainspace"},{"ns":10,"title":"Template:R from long name"},{"ns":10,"title":"Template:R from other capitalisation"},{"ns":10,"title":"Template:R from plural"},{"ns":10,"title":"Template:R from short name"},{"ns":14,"title":"Category:Redirects from modifications"}]},"29486043":{"pageid":29486043,"ns":0,"title":"1984 Wimbledon Championships - Women's Doubles"},"25618558":{"pageid":25618558,"ns":0,"title":"1984 Wimbledon Championships - Women's Singles"},"25618668":{"pageid":25618668,"ns":0,"title":"1985 US Open - Women's Singles"},"25618819":{"pageid":25618819,"ns":0,"title":"1986 US Open - Women's Singles"},"25618857":{"pageid":25618857,"ns":0,"title":"1987 Australian Open - Women's Singles"},"32203010":{"pageid":32203010,"ns":0,"title":"1988 Australian Open - Women's Doubles"},"25619177":{"pageid":25619177,"ns":0,"title":"1989 Australian Open - Women's Singles"},"25619314":{"pageid":25619314,"ns":0,"title":"1989 US Open - Women's Singles"}}}}
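The text contains EN DASH (U+2013) characters. In the `Windows-1252` codec they map to the byte `\x96`. You have an encoding problem, but the exact cause depends on the steps you used to copy the text into the `.py` file. I cut and pasted the text from the question into Notepad++, set the encoding to `ANSI`, assigned the text to a variable, and got: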
File "C:\temp.py", line 1
SyntaxError: unknown decode error
However, selecting `UTF-8` or `UTF-8 without BOM` as the encoding works correctly. Without a `#coding:` comment declaring the source encoding, Python 3 assumes UTF-8.
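The byte mapping described above can be checked directly. The following is a minimal sketch, not taken from the original answer; the variable names are illustrative only. It encodes an EN DASH with both codecs and then tries to decode the Windows-1252 byte as UTF-8, which is what Python 3 attempts by default on a source file with no encoding declaration.

```python
# EN DASH (U+2013), the character found in some of the page titles.
en_dash = "\u2013"

# The same character is stored as different bytes under each codec.
cp1252_bytes = en_dash.encode("windows-1252")
utf8_bytes = en_dash.encode("utf-8")
print(cp1252_bytes)  # b'\x96'
print(utf8_bytes)    # b'\xe2\x80\x93'

# Reading the Windows-1252 byte as UTF-8 fails, because 0x96 can only
# appear as a continuation byte in UTF-8, never as the start of a character.
decode_error = None
try:
    cp1252_bytes.decode("utf-8")
except UnicodeDecodeError as exc:
    decode_error = str(exc)
    print(decode_error)
```

The message printed in the `except` branch has the same "invalid start byte" wording as the error in the question.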
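Note that `ANSI` on my US Windows system is actually `Windows-1252`. Using `ANSI` and adding `#coding:windows-1252` also works correctly. Python needs to be told the source encoding whenever it differs from the default (`ascii` on Python 2 and `utf-8` on Python 3).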
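The effect of the encoding declaration can be reproduced without an editor. This is a hedged sketch rather than the answerer's procedure: it assumes that passing raw bytes to the built-in `compile()` is a fair stand-in for Python reading a saved `.py` file, and the `try_compile` helper and the `<pasted>` file name are hypothetical.

```python
# One line of source code containing an EN DASH, as in the pasted data.
source = 'title = "1983 Wimbledon Championships \u2013 Women\'s Singles"\n'

def try_compile(raw_bytes):
    """Return a code object, or None if Python rejects the bytes as source."""
    try:
        return compile(raw_bytes, "<pasted>", "exec")
    except (SyntaxError, UnicodeDecodeError):
        return None

# Saved as "ANSI" (Windows-1252) with no declaration: rejected,
# because Python 3 assumes UTF-8.
ansi_bytes = source.encode("windows-1252")
ansi_result = try_compile(ansi_bytes)

# The same bytes with a coding declaration on the first line: accepted.
declared_result = try_compile(b"# coding: windows-1252\n" + ansi_bytes)

# Saved as UTF-8 with no declaration: accepted.
utf8_result = try_compile(source.encode("utf-8"))

namespace = {}
exec(declared_result, namespace)
print(ansi_result is None, declared_result is not None, utf8_result is not None)
print(namespace["title"])
```

Either route works; saving the file as UTF-8 avoids the declaration entirely and matches Python 3's default.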