Convert a string to UTF-16

Question

I have a text file with Japanese characters. I read a line from it and want to convert it to UTF-16 specifically. How can I do this in Python? My code looks like this:

with open("C:\\Users\\badri\\jap.txt", 'rb') as f:
    for line in f:
        u = line.decode(encoding='utf-16',errors='strict')

I get this error: "LookupError: unknown encoding: utf-16"

The reason I want it in UTF-16 is that words are separated by spaces, so it doesn't matter what language the text file is in. I would be able to use the space as a delimiter and count the number of words in the file.

Once separated, I can easily print them this way:

u1 = u'\u0048\u0065\u006c\u006c\u006f'
u2 = u'\u0077\u006f\u0072\u006c\u0064'
u3 = u'\u3053\u3093\u306b\u3061\u306f\u4e16\u754c'
print u1
print u2
print u3

Hello
world
こんにちは世界

Answer 1

This depends entirely on the encoding of the file.

Either way, you need to decode the line first, and then re-encode it as UTF-16.

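# file_path is a placeholder; "utf-8" assumes the file is UTF-8 encoded.
with open(file_path, "rb") as fh:  # binary mode yields bytes to decode
    for line in fh:
        string = line.decode("utf-8").encode("utf-16")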
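
As a self-contained illustration of the decode-then-encode step, here is a minimal sketch. It assumes the source bytes are UTF-8, which may not be true for your file. The helper names `to_utf16` and `count_words` and the sample string are made up for this example. Note that the word count is done on the decoded text, so it does not depend on the target encoding.

```python
# -*- coding: utf-8 -*-

def to_utf16(raw_line, source_encoding="utf-8"):
    """Decode raw bytes with the assumed source encoding, then encode as UTF-16."""
    text = raw_line.decode(source_encoding)   # bytes -> unicode text
    return text.encode("utf-16")              # unicode text -> UTF-16 bytes (with BOM)

def count_words(raw_line, source_encoding="utf-8"):
    """Count whitespace-separated words in the decoded text."""
    return len(raw_line.decode(source_encoding).split())

# Hypothetical sample line standing in for one line read from the file.
sample_text = u"Hello world \u3053\u3093\u306b\u3061\u306f\u4e16\u754c"
sample = sample_text.encode("utf-8")

encoded = to_utf16(sample)
word_count = count_words(sample)
```

Each call to `encode("utf-16")` prepends a byte order mark, so if you write many lines to one output file it is probably better to open that file with `io.open(path, "w", encoding="utf-16")` and write the decoded text, which emits the mark only once.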

Answer 1 (accepted), 2018-05-16 22:12:33