
Python not properly reading in text file

I'm trying to read in a text file that looks something like this:

Date, StartTime, EndTime 
6/8/14, 1832, 1903
6/8/14, 1912, 1918
6/9/14, 1703, 1708
6/9/14, 1713, 1750

and this is what I have:

g = open('Observed_closure_info.txt', 'r')
closure_date=[]
closure_starttime=[]
closure_endtime=[]
file_data1 = g.readlines()
for line in file_data1[1:]:
    data1=line.split(', ')
    closure_date.append(str(data1[0]))
    closure_starttime.append(str(data1[1]))
    closure_endtime.append(str(data1[2]))

I did it this way for a previous file that was very similar to this one, and everything worked fine. However, this file isn't being read in properly. First, it gives me a "list index out of range" error for closure_starttime.append(str(data1[1])), and when I print what it has for data1 or closure_date, I get something like:

['\x006\x00/\x008\x00/\x001\x004\x00,\x00 \x001\x008\x003\x002\x00,\x00 \x001\x009\x000\x003\x00\r\x00\n']

I've tried rewriting the text file in case that particular file was corrupt, and it still does the same thing. I'm not sure why, because last time this worked fine.

Any suggestions? Thanks!

This looks like a comma-separated file with UTF-16 encoding (hence the \x00 null bytes). You'll have to decode the input from UTF-16, like so:

import codecs

closure_date=[]
closure_starttime=[]
closure_endtime=[]
with codecs.open('Observed_closure_info.txt', 'r', 'utf-16-le') as g:
    next(g)  # skip header line (works on both Python 2 and 3)
    for line in g:
        date, start, end = line.strip().split(', ')
        closure_date.append(date)
        closure_starttime.append(start)
        closure_endtime.append(end)
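
On Python 3, where file objects accept an encoding argument directly, the same idea might look like the sketch below. It assumes the file begins with a byte-order mark so that the generic 'utf-16' codec can work out the byte order; if it does not, 'utf-16-le' as above may be the safer choice. The helper name read_closures is made up for illustration.

```python
import io


def read_closures(path, encoding='utf-16'):
    """Return (dates, start_times, end_times) from a comma-separated file.

    read_closures is a hypothetical helper; the default encoding assumes
    the file starts with a UTF-16 byte-order mark.
    """
    dates, starts, ends = [], [], []
    # io.open is the same function as Python 3's built-in open
    # and is also available on Python 2.
    with io.open(path, 'r', encoding=encoding) as f:
        next(f)  # skip the header line
        for line in f:
            if not line.strip():
                continue  # ignore blank lines
            # Split on the comma alone and strip each field, so that
            # inconsistent spacing around commas does not matter.
            date, start, end = [field.strip() for field in line.split(',')]
            dates.append(date)
            starts.append(start)
            ends.append(end)
    return dates, starts, ends
```

Calling read_closures('Observed_closure_info.txt') would then give the three lists that the original loop builds by hand.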

Try this:

# Read raw bytes and decode the whole file at once; decoding line by line
# fails because each UTF-16 newline is two bytes wide.
with open('Observed_closure_info.txt', 'rb') as g:
    file_data1 = g.read().decode('utf-16').splitlines()
closure_date=[]
closure_starttime=[]
closure_endtime=[]
for line in file_data1[1:]:
    data1=line.split(', ')
    closure_date.append(str(data1[0]))
    closure_starttime.append(str(data1[1]))
    closure_endtime.append(str(data1[2]))
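
One caveat when decoding by hand: in UTF-16 a newline occupies two bytes, so splitting the raw bytes on '\n' leaves a stray '\x00' at the start of the following line. That matches the printed list in the question, and it is also why splitting on ', ' finds nothing there (a '\x00' sits between the comma and the space), which leads to the "list index out of range" error. A small sketch with made-up sample data, assuming little-endian UTF-16:

```python
# Sample data mirroring the question's file, encoded as little-endian UTF-16
# (an assumption based on the '\x00' bytes in the printed output).
raw = u"Date, StartTime, EndTime\r\n6/8/14, 1832, 1903\r\n".encode('utf-16-le')

# Splitting the raw bytes on the newline byte cuts a 2-byte character in half:
# the second piece begins with the '\x00' that belonged to the newline.
broken = raw.split(b'\n')
starts_with_null = broken[1][:1] == b'\x00'

# The two-byte separator b', ' never occurs in the raw bytes, so the
# split returns a single element and index 1 does not exist.
unsplit = broken[1].split(b', ')

# Decoding first and splitting the resulting text avoids both problems.
rows = [line.split(', ') for line in raw.decode('utf-16-le').splitlines()[1:]]
```

Here rows holds one list of three fields per data line, with the header skipped.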

