UnicodeDecodeError: 'utf-8' codec can't decode byte 0xe2 in position 434852: invalid continuation byte

I am using hfcca to calculate the cyclomatic complexity of some C++ code. hfcca is a simple Python script ( https://code.google.com/p/headerfile-free-cyclomatic-complexity-analyzer/ ). When I run the script to generate its output as an XML file, I get the following error:

Traceback (most recent call last):
    File "./hfcca.py", line 802, in <module>
    main(sys.argv[1:])
    File "./hfcca.py", line 798, in main
    print(xml_output([f for f in r], options))
    File "./hfcca.py", line 798, in <listcomp>
    print(xml_output([f for f in r], options))
    File "/x/home06/smanchukonda/PREFIX/lib/python3.3/multiprocessing/pool.py", line 652, in next
    raise value
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xe2 in position 434852: invalid continuation byte

Please help me with this.

It looks like the file contains bytes that are not valid UTF-8; most likely it was saved as latin1 (or a similar single-byte encoding), where a byte such as 0xe2 is a complete character on its own. The file utility can be useful for figuring out what encoding a file should be treated as, e.g.:

monk@monk-VirtualBox:~$ file foo.txt 
foo.txt: UTF-8 Unicode text

Here's what that byte means in latin1:

>>> b'\xe2'.decode('latin1')
'â'
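In UTF-8, on the other hand, 0xe2 is the lead byte of a three-byte sequence, so it has to be followed by two continuation bytes (0x80 to 0xbf). When it is followed by an ordinary ASCII byte instead, the decoder reports "invalid continuation byte". Here is a small sketch for locating the first offending byte in some data; `find_invalid_utf8` is a name I made up for illustration, it is not part of hfcca:

```python
def find_invalid_utf8(raw):
    """Return (position, reason) for the first invalid UTF-8 byte, or None.

    Hypothetical helper for illustration only.
    """
    try:
        raw.decode("utf-8")
    except UnicodeDecodeError as exc:
        # exc.start is the byte offset where decoding failed,
        # exc.reason is the message shown at the end of the traceback.
        return exc.start, exc.reason
    return None


# 0xe2 followed by a space (0x20) is not a valid UTF-8 sequence.
print(find_invalid_utf8(b"caf\xe2 au lait"))
# The same text properly encoded as UTF-8 decodes without error.
print(find_invalid_utf8("caf\u00e2 au lait".encode("utf-8")))
```

To check a real file you could pass it the result of `open(path, "rb").read()`, which should point at the same position the traceback reports if the file is the one hfcca choked on.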

The easiest fix is probably to convert the files to utf8.
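A minimal sketch of such a conversion in Python, assuming the source files really are latin1 (that is a guess on my part, so check with file first). The function names and the file names in the usage note are only illustrative. Note that decoding as latin1 never fails, because every byte maps to some character, so a wrong guess produces garbled text silently rather than an error:

```python
from pathlib import Path


def reencode_to_utf8(raw, source_encoding="latin-1"):
    """Decode bytes using the assumed source encoding and re-encode as UTF-8."""
    return raw.decode(source_encoding).encode("utf-8")


def convert_file(src, dst, source_encoding="latin-1"):
    """Write a UTF-8 copy of src to dst.

    Works on raw bytes so that line endings are left untouched.
    """
    Path(dst).write_bytes(reencode_to_utf8(Path(src).read_bytes(), source_encoding))
```

For example, `convert_file("foo.cpp", "foo.utf8.cpp")` would leave the original untouched and write a converted copy next to it, which you can then feed to hfcca.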

I had the same problem when rendering Markup("""yyyyyy"""), but I solved it using an online tool which removed the 'bad' characters: https://pteo.paranoiaworks.mobi/diacriticsremover/

It is a nice tool and even works offline.
