
UnicodeDecodeError: 'utf8' codec can't decode byte 0x92 in position 377826: invalid start byte

I get the following error when running the code snippet below, exactly at the line if uID in repo.git.log(): . The problem is in repo.git.log(). I have looked at the similar questions on Stack Overflow, which suggest using decode("utf-8").

How do I apply decode("utf-8") to the output of repo.git.log()?

UnicodeDecodeError: 'utf8' codec can't decode byte 0x92 in position 377826: invalid start byte 

Relevant code:

..................
uID = gerritInfo['id'].decode("utf-8")                                            
if uID in repo.git.log():
        inwslist.append(gerritpatch)      
.....................


Traceback (most recent call last):
  File "/prj/host_script/script.py", line 1417, in <module>
    result=main()
  File "/prj/host_script/script.py", line 1028, in main
    if uID in repo.git.log():
  File "/usr/local/lib/python2.7/dist-packages/git/cmd.py", line 431, in <lambda>
    return lambda *args, **kwargs: self._call_process(name, *args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/git/cmd.py", line 802, in _call_process
    return self.execute(make_call(), **_kwargs)
  File "/usr/local/lib/python2.7/dist-packages/git/cmd.py", line 610, in execute
    stdout_value = stdout_value.decode(defenc)
  File "/usr/lib/python2.7/encodings/utf_8.py", line 16, in decode
    return codecs.utf_8_decode(input, errors, True)
UnicodeDecodeError: 'utf8' codec can't decode byte 0x92 in position 377826: invalid start byte

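0x92 is a smart quote (’) in Windows-1252, i.e. U+2019 RIGHT SINGLE QUOTATION MARK. The character exists in Unicode, but the single byte 0x92 is not valid UTF-8: it is a continuation byte and cannot start a character, therefore it can't be decoded as UTF-8.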

Maybe your file was edited on a Windows machine, which caused this problem?
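To see the behaviour described above in isolation, here is a minimal sketch in Python 3. The byte string is a made-up example (not taken from the question's repository) of what a Windows-1252 editor would write for a word containing a smart quote.

```python
# Hypothetical sample: "don’t" as a Windows-1252 editor would store it.
raw = b"don\x92t"

# Decoding as UTF-8 fails, because 0x92 cannot start a UTF-8 sequence.
try:
    raw.decode("utf-8")
    utf8_error = None
except UnicodeDecodeError as exc:
    utf8_error = exc

# Decoding as Windows-1252 succeeds and yields U+2019.
text = raw.decode("cp1252")

print(utf8_error.reason)   # invalid start byte
print(utf8_error.start)    # 3
print(hex(ord(text[3])))   # 0x2019
```

The error position reported by Python (377826 in the traceback) is the offset of the offending byte in the decoded input, the same way `start` is 3 here.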

Using encoding='cp1252' will solve this problem.
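As a rough illustration of that suggestion when the data comes from a file, the sketch below writes a small temporary file with a Windows-1252 smart quote and reads it back with encoding='cp1252'. The file and its contents are invented for the example. It assumes Python 3; in Python 2.7, as in the traceback, the built-in open() has no encoding parameter, but io.open() accepts one.

```python
import os
import tempfile

# Create a hypothetical file as a Windows-1252 editor would save it.
fd, path = tempfile.mkstemp(suffix=".txt")
with os.fdopen(fd, "wb") as handle:
    handle.write(b"It\x92s a test\n")

# Reading with the matching encoding avoids the UnicodeDecodeError.
with open(path, encoding="cp1252") as handle:
    content = handle.read()

os.remove(path)
print(repr(content))
```

This only helps when the whole file really is Windows-1252; a file that mixes UTF-8 and Windows-1252 bytes will decode without error but may show garbled non-ASCII characters.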

After good research, I got the solution. In my case, the datadump.json file had the issue.

Now you can try running the command. You are good to go :)

The byte 0x92 on its own is not valid UTF-8. As Exceen stated in his answer, 0x92 is used in Windows-1252 as a smart quote. The way to resolve this is to decode with the Windows-1252 encoding or to replace the smart quote with a normal quote.
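Both options can be combined in a small helper. This is only a sketch under assumptions: decode_log and SMART_QUOTES are hypothetical names, the input is assumed to be the raw bytes of the log (for example the stdout of running git log through subprocess), and Python 3 is assumed. It is not how GitPython itself decodes output.

```python
# Map common Windows-1252 smart quotes (after decoding) to plain ASCII quotes.
SMART_QUOTES = str.maketrans({
    "\u2018": "'",
    "\u2019": "'",
    "\u201c": '"',
    "\u201d": '"',
})


def decode_log(raw):
    """Decode raw log bytes, falling back to Windows-1252 if UTF-8 fails."""
    try:
        text = raw.decode("utf-8")
    except UnicodeDecodeError:
        # A few bytes are undefined in cp1252, so replace those instead of failing.
        text = raw.decode("cp1252", errors="replace")
    return text.translate(SMART_QUOTES)


# Hypothetical log bytes containing a Windows-1252 smart quote.
sample = b"Change-Id: I1234\nuser\x92s fix"
print("I1234" in decode_log(sample))
print(decode_log(sample))
```

With the log decoded this way, a membership test such as uID in decode_log(raw_bytes) no longer raises. The fallback decodes the entire input as Windows-1252, so genuine UTF-8 characters elsewhere in the log may come out garbled; for searching an ASCII ID that does not matter.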

