
How do I decode Hangul in utf-8?

I opened the file with UTF-8 encoding, but if the file contains Korean, its output cannot be decoded. What should I do?

import subprocess

generated_file = open("runCode.py", "w", encoding='utf-8')
outputData = subprocess.check_output("python runCode.py", shell=True)
# This decode raises UnicodeDecodeError when the output contains Korean text
outputData = outputData.decode('utf-8')

Example output:

b'20\xba\xb8\xb4\xd9 \xc0\xdb\xc0\xbd\r\n'

The bytes in the question are not valid UTF-8, but they can be decoded with several of the standard encodings listed in the Python codecs documentation.

>>> bs = b'20\xba\xb8\xb4\xd9 \xc0\xdb\xc0\xbd\r\n'
>>> print(bs.decode('cp949'))
20보다 작음

>>> print(bs.decode('euc_kr'))
20보다 작음

>>> print(bs.decode('johab'))
20줮얯 첕챻

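Note that the outputs are not always the same: the same bytes may encode different characters in different encodings. Here cp949 and euc_kr both produce readable Korean, while johab decodes without raising an error but yields nonsense. You may need to experiment on larger samples to determine which encoding is actually being used in your environment.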
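One way to do that experimenting is to try a short list of candidate encodings in order and keep the first one that decodes without error. The following is only a minimal sketch: the helper name decode_first_match and the default candidate list are assumptions made for illustration, not something from the question or from any library.

```python
def decode_first_match(data, candidates=("utf-8", "cp949")):
    """Return (text, encoding) for the first candidate that decodes data strictly."""
    for encoding in candidates:
        try:
            return data.decode(encoding), encoding
        except UnicodeDecodeError:
            continue  # not valid in this encoding; try the next candidate
    raise ValueError("none of the candidate encodings could decode the data")


# Sample bytes taken from the question
sample = b'20\xba\xb8\xb4\xd9 \xc0\xdb\xc0\xbd\r\n'
text, used = decode_first_match(sample)
print(used, repr(text))
```

With the sample bytes, UTF-8 is rejected and cp949 is used. Keep in mind that a successful decode does not prove the encoding is right (johab above is the counterexample), so list the most likely encodings first and check the result by eye.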
