簡體   English   中英

為什么我收到SyntaxError:(Unicode錯誤)“ utf-8”編解碼器無法解碼位置0的字節0x96:無效的起始字節

[英]Why am I getting SyntaxError: (unicode error) 'utf-8' codec can't decode byte 0x96 in position 0: invalid start byte

我從API獲得了一些json數據。 我使用了json.loads,然后將其打印到REPL,如下所示。

  {'warnings': {'query': {'*': "Formatting of continuation data will be changing soon. To continue using the current formatting, use the 'rawcontinue' parameter. To begin using the new format, pass an empty string for 'continue' in the initial query."}}, 'query-continue': {'links': {'plcontinue': '25618423|10|R_from_other_capitalisation', 'gplcontinue': "15095968|0|1991_US_Open_-_Women's_Doubles"}}, 'query': {'pages': {'32203010': {'pageid': 32203010, 'title': "1988 Australian Open - Women's Doubles", 'ns': 0}, '25618558': {'pageid': 25618558, 'title': "1984 Wimbledon Championships - Women's Singles", 'ns': 0}, '29486043': {'pageid': 29486043, 'title': "1984 Wimbledon Championships - Women's Doubles", 'ns': 0}, '25618819': {'pageid': 25618819, 'title': "1986 US Open - Women's Singles", 'ns': 0}, '25619314': {'pageid': 25619314, 'title': "1989 US Open - Women's Singles", 'ns': 0}, '25618668': {'pageid': 25618668, 'title': "1985 US Open - Women's Singles", 'ns': 0}, '25618857': {'pageid': 25618857, 'title': "1987 Australian Open - Women's Singles", 'ns': 0}, '25618423': {'links': [{'title': "1983 Wimbledon Championships – Women's Singles", 'ns': 0}, {'title': 'Wikipedia:Mainspace', 'ns': 4}, {'title': 'Template:R from long name', 'ns': 10}], 'pageid': 25618423, 'title': "1983 Wimbledon Championships - Women's Singles", 'ns': 0}, '23826062': {'links': [{'title': "1984 French Open – Women's Singles", 'ns': 0}, {'title': 'Wikipedia:Mainspace', 'ns': 4}, {'title': 'Template:R from long name', 'ns': 10}, {'title': 'Template:R from other capitalisation', 'ns': 10}, {'title': 'Template:R from plural', 'ns': 10}, {'title': 'Template:R from short name', 'ns': 10}, {'title': 'Category:Redirects from modifications', 'ns': 14}], 'pageid': 23826062, 'title': "1984 French Open - Women's Singles", 'ns': 0}, '25619177': {'pageid': 25619177, 'title': "1989 Australian Open - Women's Singles", 'ns': 0}}}}

然后,我將該數據從repl復制到.py模塊並分配給變量,以便執行一些單元測試。 但我不斷收到此錯誤:

SyntaxError: (unicode error) 'utf-8' codec can't decode byte 0x96 in position 0: invalid start byte

到底是怎么回事?

更新:我得到錯誤的確切方法。 使用Visual Studio,我運行了一個腳本,該腳本使用Requests和.text獲取數據以獲取內容。 然后,我應用了json.loads。 我將其打印到Visual Studio Python 3.4 Interactive(又名REPL)。 然后,我使用鼠標從此REPL復制並粘貼到Visual Studio中的.py文件中。

更新2:因此,當我獲取數據時,我使用Requests,然后使用text屬性。 當我打印不帶json.loads的時候很好。 但是,如果我從REPL復制此“更多原始數據”並不再粘貼它不再是字符串,而是粘貼對象和JSON,將無法正常工作。 python 3打印功能是否可以打印對象,即使它應該是json?

這是使用Requests.text從API輸出的原始json.loads:

{"warnings":{"query":{"*":"Formatting of continuation data will be changing soon. To continue using the current formatting, use the 'rawcontinue' parameter. To begin using the new format, pass an empty string for 'continue' in the initial query."}},"query-continue":{"links":{"plcontinue":"25618423|10|R_from_other_capitalisation","gplcontinue":"15095968|0|1991_US_Open_-_Women's_Doubles"}},"query":{"pages":{"25618423":{"pageid":25618423,"ns":0,"title":"1983 Wimbledon Championships - Women's Singles","links":[{"ns":0,"title":"1983 Wimbledon Championships \u2013 Women's Singles"},{"ns":4,"title":"Wikipedia:Mainspace"},{"ns":10,"title":"Template:R from long name"}]},"23826062":{"pageid":23826062,"ns":0,"title":"1984 French Open - Women's Singles","links":[{"ns":0,"title":"1984 French Open \u2013 Women's Singles"},{"ns":4,"title":"Wikipedia:Mainspace"},{"ns":10,"title":"Template:R from long name"},{"ns":10,"title":"Template:R from other capitalisation"},{"ns":10,"title":"Template:R from plural"},{"ns":10,"title":"Template:R from short name"},{"ns":14,"title":"Category:Redirects from modifications"}]},"29486043":{"pageid":29486043,"ns":0,"title":"1984 Wimbledon Championships - Women's Doubles"},"25618558":{"pageid":25618558,"ns":0,"title":"1984 Wimbledon Championships - Women's Singles"},"25618668":{"pageid":25618668,"ns":0,"title":"1985 US Open - Women's Singles"},"25618819":{"pageid":25618819,"ns":0,"title":"1986 US Open - Women's Singles"},"25618857":{"pageid":25618857,"ns":0,"title":"1987 Australian Open - Women's Singles"},"32203010":{"pageid":32203010,"ns":0,"title":"1988 Australian Open - Women's Doubles"},"25619177":{"pageid":25619177,"ns":0,"title":"1989 Australian Open - Women's Singles"},"25619314":{"pageid":25619314,"ns":0,"title":"1989 US Open - Women's Singles"}}}}

文本中包含EN DASH (U + 2013)字符。 Windows-1252編解碼器中,它們映射到字節\\x96 您遇到了編碼問題,但是確切的原因取決於您將文本復制到.py文件的步驟。 我將問題中的文本剪切並粘貼到Notepad ++中,並將編碼設置為ANSI並將其分配給變量,然后得到:

  File "C:\temp.py", line 1
SyntaxError: unknown decode error

但是,選擇UTF-8 without BOM UTF-8UTF-8 without BOM作為編碼可以正常工作。 如果沒有#coding:聲明源編碼的注釋,Python 3將假定為UTF-8。

請注意,我的美國Windows系統上的ANSI實際上是Windows-1252 使用ANSI並添加#coding:windows-1252也可以正常工作。 如果Python編碼與默認編碼不同(Python 2上的ascii和Python 3上的utf-8 ),則Python需要知道源編碼。

暫無
暫無

聲明:本站的技術帖子網頁,遵循CC BY-SA 4.0協議,如果您需要轉載,請注明本站網址或者原文地址。任何問題請咨詢:yoyou2525@163.com.

 
粵ICP備18138465號  © 2020-2024 STACKOOM.COM