Python program will not read a txt file in 16-bit characters

My question is: how do I get Python to read a file whose text is in 16-bit characters? The rest of the post describes the situation.

I have a text file which is a playlist export from iTunes. Here is a short section, including the header:

Name    Artist  Composer    Album   Grouping    Work    Movement Number Movement Count  Movement Name   Genre   Size    Time    Disc Number Disc Count  Track Number    Track Count Year    Date Modified   Date Added  Bit Rate    Sample Rate Volume Adjustment   Kind    Equalizer   Comments    Plays   Last Played Skips   Last Skipped    My Rating
Keyboard Works of the Masters   Randolph Hokanson       Pan125b                         2054816 64                      03/11/2017, 18:00   03/11/2017, 17:01   256 44100       MPEG audio file         1   03/11/2017, 17:02   4   08/03/2018, 16:07   
08 Traccia 08                                       11159905    464                     03/11/2017, 17:39   03/11/2017, 16:59   192 48000       MPEG audio file                 1   03/11/2017, 16:59   
09 Traccia 09                                       17787361    741                     03/11/2017, 17:39   03/11/2017, 16:58   192 48000       MPEG audio file                 5   08/03/2018, 10:58   
10 Traccia 10                                       10128290    421                     03/11/2017, 17:39   03/11/2017, 16:58   192 48000       MPEG audio file                 1   03/11/2017, 16:58   

When I use this code to read it, the program hangs. (The variable i holds the number of lines in the file.) The hex dumps that follow seem to show that the export from iTunes is in 16-bit characters.

The complete code for reading the text file is:

file_name="full path to file goes here"
f = open(file_name, "r")
i=227
for x in range(0, i):
        line = f.readline()

When I opened the file in TextWrangler, selected all the text, and pasted it into a new document, the code worked fine on the new file.

A hex dump of the start of the original file looks like this, with the new file following:

00000000: FF FE 4E 00 61 00 6D 00 65 00 09 00 41 00 72 00   ..N.a.m.e...A.r.
00000010: 74 00 69 00 73 00 74 00 09 00 43 00 6F 00 6D 00   t.i.s.t...C.o.m.
00000020: 70 00 6F 00 73 00 65 00 72 00 09 00 41 00 6C 00   p.o.s.e.r...A.l.
00000030: 62 00 75 00 6D 00 09 00 47 00 72 00 6F 00 75 00   b.u.m...G.r.o.u.
00000040: 70 00 69 00 6E 00 67 00 09 00 57 00 6F 00 72 00   p.i.n.g...W.o.r.
00000050: 6B 00 09 00 4D 00 6F 00 76 00 65 00 6D 00 65 00   k...M.o.v.e.m.e.
00000060: 6E 00 74 00 20 00 4E 00 75 00 6D 00 62 00 65 00   n.t. .N.u.m.b.e.
00000070: 72 00 09 00 4D 00 6F 00 76 00 65 00 6D 00 65 00   r...M.o.v.e.m.e.
00000080: 6E 00 74 00 20 00 43 00 6F 00 75 00 6E 00 74 00   n.t. .C.o.u.n.t.
00000090: 09 00 4D 00 6F 00 76 00 65 00 6D 00 65 00 6E 00   ..M.o.v.e.m.e.n.
000000A0: 74 00 20 00 4E 00 61 00 6D 00 65 00 09 00 47 00   t. .N.a.m.e...G.
000000B0: 65 00 6E 00 72 00 65 00 09 00 53 00 69 00 7A 00   e.n.r.e...S.i.z.
000000C0: 65 00 09 00 54 00 69 00 6D 00 65 00 09 00 44 00   e...T.i.m.e...D.
000000D0: 69 00 73 00 63 00 20 00 4E 00 75 00 6D 00 62 00   i.s.c. .N.u.m.b.
000000E0: 65 00 72 00 09 00 44 00 69 00 73 00 63 00 20 00   e.r...D.i.s.c. .
000000F0: 43 00 6F 00 75 00 6E 00 74 00 09 00 54 00 72 00   C.o.u.n.t...T.r.

New file:

0000: 4E 61 6D 65 09 41 72 74 69 73 74 09 43 6F 6D 70   Name.Artist.Comp
0010: 6F 73 65 72 09 41 6C 62 75 6D 09 47 72 6F 75 70   oser.Album.Group
0020: 69 6E 67 09 57 6F 72 6B 09 4D 6F 76 65 6D 65 6E   ing.Work.Movemen
0030: 74 20 4E 75 6D 62 65 72 09 4D 6F 76 65 6D 65 6E   t Number.Movemen
0040: 74 20 43 6F 75 6E 74 09 4D 6F 76 65 6D 65 6E 74   t Count.Movement
0050: 20 4E 61 6D 65 09 47 65 6E 72 65 09 53 69 7A 65    Name.Genre.Size

The beginning of your file looks like UTF-16: it starts with the bytes FF FE, which is the UTF-16 little-endian byte order mark. See Byte order mark (Wikipedia).
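One way to check this from Python is to look at the first bytes in binary form and compare them with the BOM constants in the standard `codecs` module. This is only a minimal sketch: the helper name `sniff_utf16_bom` is made up for illustration, and the sample bytes are copied from the first hex dump in the question rather than read from the real file.

```python
import codecs

def sniff_utf16_bom(first_bytes):
    """Return a codec name if the bytes start with a UTF-16 BOM, else None."""
    if first_bytes.startswith(codecs.BOM_UTF16_LE):   # b'\xff\xfe'
        return "utf-16-le"
    if first_bytes.startswith(codecs.BOM_UTF16_BE):   # b'\xfe\xff'
        return "utf-16-be"
    return None

# First bytes of the original file, taken from the hex dump in the question.
header = bytes.fromhex("FF FE 4E 00 61 00 6D 00 65 00")
print(sniff_utf16_bom(header))
```

With a real file, the same check could be run on `open(file_name, "rb").read(2)`.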

Specify the encoding when opening the file:

import io

file_name = "full path to file goes here"

with io.open(file_name, 'r', encoding='utf-16-le') as f:
    for line in f:
        # do something with line
        pass
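One detail that may matter: the `utf-16-le` codec decodes the byte order mark as an ordinary character, so the first line starts with U+FEFF, whereas the generic `utf-16` codec reads the BOM to pick the byte order and then drops it. The sketch below illustrates the difference on a small temporary file that stands in for the iTunes export; the sample content is an assumption, not the real playlist.

```python
import codecs
import io
import os
import tempfile

# Build a small stand-in for the export: BOM followed by UTF-16-LE text.
sample = "Name\tArtist\n08 Traccia 08\t\n"
fd, path = tempfile.mkstemp(suffix=".txt")
with os.fdopen(fd, "wb") as out:
    out.write(codecs.BOM_UTF16_LE + sample.encode("utf-16-le"))

# 'utf-16-le' keeps the BOM as the first character of the first line.
with io.open(path, "r", encoding="utf-16-le") as f:
    first_le = f.readline()

# 'utf-16' uses the BOM to detect the byte order and strips it.
with io.open(path, "r", encoding="utf-16") as f:
    first_auto = f.readline()

os.remove(path)

print(repr(first_le))
print(repr(first_auto))
```

If the first column name is compared against "Name" later on, reading with `encoding='utf-16'` avoids having to strip the leading U+FEFF by hand.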

There is no need to use range() or readline() when reading line by line. If you really need the line numbers, use:

    for lineNr,line in enumerate(f):
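As a slightly fuller sketch of that loop, the example below numbers the lines from 1 and splits each one on tabs, since the export shown in the question is tab-separated. Here `io.StringIO` stands in for the opened file object `f`, and the sample rows are shortened placeholders.

```python
import io

# io.StringIO stands in for the file object returned by io.open(...).
f = io.StringIO("Name\tArtist\n08 Traccia 08\t\n09 Traccia 09\t\n")

numbered = []
for line_nr, line in enumerate(f, start=1):
    # Drop the trailing newline, then split the tab-separated fields.
    fields = line.rstrip("\n").split("\t")
    numbered.append((line_nr, fields[0]))

print(numbered)
```

Passing `start=1` makes the numbering match what a text editor shows; without it, `enumerate` counts from 0.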
