How do I decode Hangul in utf-8?

Question

I run a script with subprocess and decode its output as utf-8, but when the output contains Korean the decoding fails. What should I do?

import subprocess

generated_file = open("runCode.py", "w", encoding='utf-8')
outputData = subprocess.check_output("python runCode.py", shell=True)
outputData = outputData.decode('utf-8')  # fails when the output contains Korean

Example output (raw bytes):

b'20\xba\xb8\xb4\xd9 \xc0\xdb\xc0\xbd\r\n'

Answer 1

The bytes in the question are not valid UTF-8 (0xba cannot start a UTF-8 sequence), but they can be decoded with several of the Korean codecs in the standard encodings table of the Python codecs documentation.

>>> bs = b'20\xba\xb8\xb4\xd9 \xc0\xdb\xc0\xbd\r\n'
>>> print(bs.decode('cp949'))
20보다 작음

>>> print(bs.decode('euc_kr'))
20보다 작음

>>> print(bs.decode('johab'))
20줮얯 첕챻

Note that the outputs are not always the same: the same bytes can map to different characters in different encodings. Here cp949 and euc_kr agree and produce readable Korean, so one of them is the likely candidate, but you may need to experiment on larger samples to determine which encoding is being used in your environment.
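If the output may be either UTF-8 or a legacy Korean encoding, one option is to try the candidates in order. The following is only a minimal sketch: decode_with_fallback and run_and_decode are hypothetical helper names, and the candidate list ("utf-8", then "cp949") is an assumption that you should adjust to your own environment.

```python
import subprocess


def decode_with_fallback(data, encodings=("utf-8", "cp949")):
    """Decode bytes with the first candidate encoding that succeeds.

    The candidate list is an assumption, not something detected
    from the system.
    """
    for name in encodings:
        try:
            return data.decode(name)
        except UnicodeDecodeError:
            continue
    # No candidate matched: substitute undecodable bytes instead of raising.
    return data.decode(encodings[0], errors="replace")


def run_and_decode(command):
    """Run a shell command and decode its stdout with the fallback above."""
    raw = subprocess.check_output(command, shell=True)
    return decode_with_fallback(raw)


# The bytes from the question: UTF-8 fails, so cp949 is used.
sample = b'20\xba\xb8\xb4\xd9 \xc0\xdb\xc0\xbd\r\n'
text = decode_with_fallback(sample)
```

With the question's setup, run_and_decode("python runCode.py") would replace the two outputData lines. Trying utf-8 first is a deliberate choice: arbitrary legacy-encoded Korean text rarely happens to be valid UTF-8, whereas UTF-8 bytes can often be decoded by a legacy codec into wrong characters without raising an error.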
