Python 3.3 readlines truncating text file

I am working with Python 3.3 using PyDev for Eclipse. This is my code:

countdata = open(countfilename, 'r')
countlist = countdata.readlines()
print(len(countlist))
genecountline = wordlist(countlist[-1])
print(genecountline)

countfilename refers to a rather lengthy text file of 7847 lines. It is generated from another text file by a script that the instructor of my machine learning class gave me (I did have to convert that script to Python 3 using 2to3).

wordlist is a simple function I built that takes a line of text and returns the words in it as a list.

I pull the whole file into a list of lines so that I can refer to specific lines at will for my calculation. Whether I read them in all at once with readlines or iterate over the file and add the lines to the list one by one like this:

countdata = open(countfilename, 'r')
countlist = []
for line in countdata:
    countlist.append(line)

doesn't matter. Either way, print(len(countlist)) gives me approximately 7630. I say approximately because it is sometimes as low as 7628 or as high as 7633. The specific line returned by countlist[-1] is different every time (the file is built using a generator object; as mentioned, my instructor wrote that script and I am not entirely sure how it works).

genecountline = wordlist(countlist[-1])
print(genecountline)

I put that in just to see what Python thinks the last line of the file is. When I open the file in TextPad, the line it returns is in fact at the line number returned by len(countlist). In other words, it appears to be ignoring approximately the last 210 lines of my file. So my question is: how do I fix this, and how do I prevent it from happening again?

If you're not reading from a static text file but from one that is generated each time you run your program, it could be that you don't close that file after writing it (in which case not everything may have been written to it yet). If you don't want to close it, you could flush it instead (the .flush() method).
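As a rough illustration of the point above, here is a minimal sketch of writing the file so that everything reaches disk before it is read back. The generator `generate_lines`, the helper names, and the temporary path are all hypothetical stand-ins, since the instructor's script was not posted; only the closing and flushing behaviour is the point.

```python
import os
import tempfile


def generate_lines(n):
    """Hypothetical stand-in for the generator in the instructor's script."""
    for i in range(n):
        yield "gene{0}\t{1}\n".format(i, i * 2)


def write_counts(path, n):
    # Leaving the with-block closes the file, which flushes any
    # buffered data to disk before anyone tries to read it.
    with open(path, "w") as out:
        out.writelines(generate_lines(n))


def read_counts(path):
    # Reading inside a with-block also guarantees the handle is released.
    with open(path, "r") as countdata:
        return countdata.readlines()


def write_then_read_while_open(path, n):
    """Alternative: keep the writer open but flush before reading."""
    out = open(path, "w")
    try:
        out.writelines(generate_lines(n))
        out.flush()  # push Python's write buffer out to the file
        return read_counts(path)
    finally:
        out.close()


path = os.path.join(tempfile.mkdtemp(), "counts.txt")
write_counts(path, 7847)
countlist = read_counts(path)
print(len(countlist))  # 7847
```

If the writing script opens the file with a plain open() and never calls close() or flush(), the tail of the data can still be sitting in a buffer when the reader opens the file, which would be consistent with a line count that falls short by a varying amount.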

You should post the code that generates the file.
