Convert string to utf-16

Question

I have a text file with Japanese characters. I read a line from it and want to convert it to UTF-16 specifically. How can I do this in Python? My code looks like this:

with open("C:\\Users\\badri\\jap.txt", 'rb') as f:
    for line in f:
        u = line.decode(encoding='utf-16',errors='strict')

I get this error: "LookupError: unknown encoding: utf-16"

The reason I want it in UTF-16 is that I expect words to be separated by spaces, so it wouldn't matter what language the text file is in. I could then use the space as a delimiter and count the number of words in the file.

Once the words are separated, I can easily print them this way:

u1 = u'\u0048\u0065\u006c\u006c\u006f'
u2 = u'\u0077\u006f\u0072\u006c\u0064'
u3 = u'\u3053\u3093\u306b\u3061\u306f\u4e16\u754c'
print u1
print u2
print u3

Hello
world
こんにちは世界

Answer 1

This depends entirely on the encoding of the file.

Either way, you need to decode the line first, using the encoding the file was actually saved in, and then re-encode the result as UTF-16.

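# Open in binary mode so each line is raw bytes on both Python 2 and 3.
with open(file_path, "rb") as fh:
    for line in fh:
        # Decode with the file's real encoding (UTF-8 is assumed here), then encode as UTF-16.
        string = line.decode("utf-8").encode("utf-16")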
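
As a rough sketch of the decode-then-encode idea applied to the word count from the question: the helper names `count_words` and `to_utf16` are hypothetical, and the source encoding defaults to UTF-8 purely as an assumption, so pass whatever encoding the file really uses. Counting is done on the decoded Unicode text rather than on the UTF-16 bytes.

```python
import io


def count_words(path, source_encoding="utf-8"):
    """Count whitespace-separated words in a text file (hypothetical helper)."""
    total = 0
    # io.open decodes each line to Unicode text on both Python 2 and 3.
    # source_encoding is an assumption; it must match how the file was saved.
    with io.open(path, "r", encoding=source_encoding) as fh:
        for line in fh:
            # split() with no arguments splits on any Unicode whitespace,
            # including the ideographic space U+3000.
            total += len(line.split())
    return total


def to_utf16(text):
    """Encode Unicode text as UTF-16 bytes (Python prepends a byte order mark)."""
    return text.encode("utf-16")
```

For example, `count_words(file_path)` would return the number of space-delimited tokens in the file, and `to_utf16(line)` gives the UTF-16 bytes for a decoded line if that form is needed afterwards. Text written without spaces between words would still count as a single token per run.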
