Hello, I have a text file that I want to write out as UTF-16 Unicode using Python.
For example, I read this from file 1:
ch="Hello world, this a stackoverflow example"
I then write it to file 2 as UTF-16 Unicode, and the output must look like this:
output="\u0048\u0065\u006c\u006c\u006f \u0077\u006f\u0072\u006c\u0064\u002c \u0074\u0068\u0069\u0073 \u0061 \u0073\u0074\u0061\u0063\u006b\u006f\u0076\u0065\u0072 \u0066\u006c\u006f\u0077 \u0065\u0078\u0061\u006d\u0070\u006c\u0065"
I found how to decode or read UTF-16, but not how to do the conversion.
Just pass encoding="utf-16" to open when you open the output file:
ch="Hello world, this a stackoverflow example"
with open("utf_16.txt", "w", encoding="utf-16") as f:
    f.write(ch)
$ file utf_16.txt
utf_16.txt: Little-endian UTF-16 Unicode text, with no line terminators
$ hexdump -Cv utf_16.txt
00000000 ff fe 48 00 65 00 6c 00 6c 00 6f 00 20 00 77 00 |..H.e.l.l.o. .w.|
00000010 6f 00 72 00 6c 00 64 00 2c 00 20 00 74 00 68 00 |o.r.l.d.,. .t.h.|
...
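The same check can be done from Python if file and hexdump are not available. This is only a minimal sketch: it encodes the string in memory instead of reading utf_16.txt back, on the assumption that for this single write the bytes match what open(..., encoding="utf-16") produces. The variable names data and bom are illustrative.

```python
import codecs

ch = "Hello world, this a stackoverflow example"

# Encode in memory; for this string these should be the bytes written to the file.
data = ch.encode("utf-16")

# The first two bytes are the byte order mark. Which of the two values
# appears depends on the byte order of the machine running the code.
bom = data[:2]
print(bom in (codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE))

# Every character here is in the Basic Multilingual Plane,
# so the rest is exactly two bytes per character.
print(len(data) == 2 + 2 * len(ch))
```

On a little-endian machine the BOM is ff fe, which is what the hexdump above shows.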
Note that the utf-16 encoding includes a byte order mark (BOM). If you don't want that, include the endianness in the encoding name (for example utf-16le):
ch="Hello world, this a stackoverflow example"
with open("utf_16.txt", "w", encoding="utf-16le") as f:
    f.write(ch)
$ file utf_16.txt
utf_16.txt: data
$ hexdump -Cv utf_16.txt
00000000 48 00 65 00 6c 00 6c 00 6f 00 20 00 77 00 6f 00 |H.e.l.l.o. .w.o.|
00000010 72 00 6c 00 64 00 2c 00 20 00 74 00 68 00 69 00 |r.l.d.,. .t.h.i.|
...
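As a rough illustration of what dropping the BOM means in practice, the sketch below encodes the same string with utf-16le in memory and decodes it again. It does not touch utf_16.txt; the name data is illustrative. Because nothing in the bytes records the byte order any more, the reader has to be given the same encoding name explicitly.

```python
import codecs

ch = "Hello world, this a stackoverflow example"

# Explicit little-endian encoding: no BOM is emitted.
data = ch.encode("utf-16le")

print(data[:2])                               # b'H\x00', the first character itself
print(data.startswith(codecs.BOM_UTF16_LE))   # False

# Decoding with the same explicit byte order recovers the original text.
print(data.decode("utf-16le") == ch)          # True
```

This is presumably also why file reports the second file as plain "data": without the BOM it has nothing to identify the encoding by.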