
Python program will not read a txt file in 16-bit characters

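My question is how to make Python read a file whose text is stored as 16-bit characters. The rest of the post describes the situation.

I have a text file that is a playlist exported from iTunes. Here is a short excerpt, including the header row: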

Name    Artist  Composer    Album   Grouping    Work    Movement Number Movement Count  Movement Name   Genre   Size    Time    Disc Number Disc Count  Track Number    Track Count Year    Date Modified   Date Added  Bit Rate    Sample Rate Volume Adjustment   Kind    Equalizer   Comments    Plays   Last Played Skips   Last Skipped    My Rating
Keyboard Works of the Masters   Randolph Hokanson       Pan125b                         2054816 64                      03/11/2017, 18:00   03/11/2017, 17:01   256 44100       MPEG audio file         1   03/11/2017, 17:02   4   08/03/2018, 16:07   
08 Traccia 08                                       11159905    464                     03/11/2017, 17:39   03/11/2017, 16:59   192 48000       MPEG audio file                 1   03/11/2017, 16:59   
09 Traccia 09                                       17787361    741                     03/11/2017, 17:39   03/11/2017, 16:58   192 48000       MPEG audio file                 5   08/03/2018, 10:58   
10 Traccia 10                                       10128290    421                     03/11/2017, 17:39   03/11/2017, 16:58   192 48000       MPEG audio file                 1   03/11/2017, 16:58   

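When I read it with the code below, the program hangs. (i holds the number of lines in the file.) The hex dump further down seems to show that the iTunes export uses 16-bit characters.

The complete code that reads the text file is: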

file_name="full path to file goes here"
f = open(file_name, "r")
i=227
for x in range(0, i):
        line = f.readline()

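When I opened the file in a text editor, selected all the text, and pasted it into a new document, the code worked fine on the new file.

A hex dump of part of the original file looks like this, followed by the same dump for the new file: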

00000000: FF FE 4E 00 61 00 6D 00 65 00 09 00 41 00 72 00   ..N.a.m.e...A.r.
00000010: 74 00 69 00 73 00 74 00 09 00 43 00 6F 00 6D 00   t.i.s.t...C.o.m.
00000020: 70 00 6F 00 73 00 65 00 72 00 09 00 41 00 6C 00   p.o.s.e.r...A.l.
00000030: 62 00 75 00 6D 00 09 00 47 00 72 00 6F 00 75 00   b.u.m...G.r.o.u.
00000040: 70 00 69 00 6E 00 67 00 09 00 57 00 6F 00 72 00   p.i.n.g...W.o.r.
00000050: 6B 00 09 00 4D 00 6F 00 76 00 65 00 6D 00 65 00   k...M.o.v.e.m.e.
00000060: 6E 00 74 00 20 00 4E 00 75 00 6D 00 62 00 65 00   n.t. .N.u.m.b.e.
00000070: 72 00 09 00 4D 00 6F 00 76 00 65 00 6D 00 65 00   r...M.o.v.e.m.e.
00000080: 6E 00 74 00 20 00 43 00 6F 00 75 00 6E 00 74 00   n.t. .C.o.u.n.t.
00000090: 09 00 4D 00 6F 00 76 00 65 00 6D 00 65 00 6E 00   ..M.o.v.e.m.e.n.
000000A0: 74 00 20 00 4E 00 61 00 6D 00 65 00 09 00 47 00   t. .N.a.m.e...G.
000000B0: 65 00 6E 00 72 00 65 00 09 00 53 00 69 00 7A 00   e.n.r.e...S.i.z.
000000C0: 65 00 09 00 54 00 69 00 6D 00 65 00 09 00 44 00   e...T.i.m.e...D.
000000D0: 69 00 73 00 63 00 20 00 4E 00 75 00 6D 00 62 00   i.s.c. .N.u.m.b.
000000E0: 65 00 72 00 09 00 44 00 69 00 73 00 63 00 20 00   e.r...D.i.s.c. .
000000F0: 43 00 6F 00 75 00 6E 00 74 00 09 00 54 00 72 00   C.o.u.n.t...T.r.

New file

0000: 4E 61 6D 65 09 41 72 74 69 73 74 09 43 6F 6D 70   Name.Artist.Comp
0010: 6F 73 65 72 09 41 6C 62 75 6D 09 47 72 6F 75 70   oser.Album.Group
0020: 69 6E 67 09 57 6F 72 6B 09 4D 6F 76 65 6D 65 6E   ing.Work.Movemen
0030: 74 20 4E 75 6D 62 65 72 09 4D 6F 76 65 6D 65 6E   t Number.Movemen
0040: 74 20 43 6F 75 6E 74 09 4D 6F 76 65 6D 65 6E 74   t Count.Movement
0050: 20 4E 61 6D 65 09 47 65 6E 72 65 09 53 69 7A 65    Name.Genre.Size

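The start of your file looks like UTF-16: the first two bytes, FF FE, are the little-endian byte order mark. See "Byte order mark" on Wikipedia.

Use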
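To check this from Python rather than from a hex editor, something like the following could be used. It is only a minimal sketch: the helper name `sniff_utf16_bom` is made up for illustration, and the sample bytes are copied from the first row of the hex dump in the question. It does not try to tell UTF-16 apart from UTF-32, whose little-endian byte order mark also begins with FF FE.

```python
import codecs


def sniff_utf16_bom(first_bytes):
    """Return a codec name if the bytes start with a UTF-16 BOM, else None.

    Hypothetical helper for illustration; it ignores UTF-32 and files
    that carry no byte order mark at all.
    """
    if first_bytes.startswith(codecs.BOM_UTF16_LE):  # b'\xff\xfe'
        return "utf-16-le"
    if first_bytes.startswith(codecs.BOM_UTF16_BE):  # b'\xfe\xff'
        return "utf-16-be"
    return None


# First bytes of the original file, taken from the hex dump in the question.
sample = bytes.fromhex("FF FE 4E 00 61 00 6D 00 65 00 09 00")
detected = sniff_utf16_bom(sample)
print(detected)
```

On a real file you would pass in the result of `open(file_name, "rb").read(2)` instead of the hard-coded sample.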

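import io

file_name = "full path to file goes here"

# 'utf-16' reads the FF FE byte order mark to pick the byte order and strips it;
# 'utf-16-le' would also decode the text but leave '\ufeff' on the first line.
with io.open(file_name, 'r', encoding='utf-16') as f:
    for line in f:
        print(line.rstrip('\n'))  # do something with line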

when opening the file.

When you read line by line there is no need for range() or readlines(). If you really do need the line number, use:
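One detail worth knowing is how the codec name treats the byte order mark. The sketch below writes a tiny stand-in for the iTunes export (the file name and its contents are invented for the demonstration) and reads it back twice: `'utf-16'` consumes the mark, while `'utf-16-le'` decodes it as the character `'\ufeff'` at the start of the first line.

```python
import codecs
import io
import os
import tempfile

# Hypothetical sample file: BOM followed by UTF-16 little-endian text,
# mimicking the layout shown in the hex dump.
path = os.path.join(tempfile.mkdtemp(), "playlist_sample.txt")
with open(path, "wb") as f:
    f.write(codecs.BOM_UTF16_LE + u"Name\tArtist\n".encode("utf-16-le"))

# 'utf-16' uses the BOM to choose the byte order and drops it from the text.
with io.open(path, "r", encoding="utf-16") as f:
    first_auto = f.readline()

# 'utf-16-le' fixes the byte order up front, so the BOM stays in the text.
with io.open(path, "r", encoding="utf-16-le") as f:
    first_le = f.readline()

print(repr(first_auto))
print(repr(first_le))
```

If the first column name is later compared against `"Name"`, the stray `'\ufeff'` in the second variant would make that comparison fail, so it is worth stripping it or using `'utf-16'`.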

    for lineNr,line in enumerate(f):
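A slightly fuller sketch of that loop, assuming the columns are tab-separated as the 09 bytes in the hex dump suggest. Here `io.StringIO` stands in for the opened playlist file so the example runs without the real export, and the sample rows are shortened to two columns.

```python
import io

# Stand-in for the file object returned by io.open(...); contents are
# shortened, made-up rows in the same tab-separated layout.
f = io.StringIO(u"Name\tArtist\n08 Traccia 08\t\n09 Traccia 09\t\n")

rows = []
# start=1 makes the counter match the line numbers a text editor shows.
for line_nr, line in enumerate(f, start=1):
    fields = line.rstrip("\n").split("\t")
    rows.append((line_nr, fields[0]))

print(rows)
```

The loop ends by itself when the file is exhausted, so the number of lines does not have to be known in advance.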

