
Python doesn't read whole txt file

I'm learning Python fundamentals and have an exercise where I need to read a txt file and print its contents line by line. Here is my code:

t = open('mbox-short.txt')

for line in t:
    print(line)

And here is the file: https://www.py4e.com/code3/mbox-short.txt

The problem is that when I run the script, the output doesn't show the first few lines of the file. The first line of the original file mentioned above is:

From stephen.marquard@uct.ac.za Sat Jan  5 09:14:16 2008

and each time I run my script, the output begins from these lines:

Received: from nakamura.uits.iupui.edu (localhost [127.0.0.1])

        by nakamura.uits.iupui.edu (8.12.11.20060308/8.12.11) with ESMTP id m04GA5LR007211

Please help me understand what I'm doing wrong and how I can fix it. As far as I understand, it has something to do with the txt file encoding. I've tried downloading and copying the file several times and changed the encoding from ANSI to UTF-8 via Notepad, but every time I run the script it prints the same output, skipping the first few lines and starting only from the line:

Received: from nakamura.uits.iupui.edu (localhost [127.0.0.1])

I'd also like to mention that I've tried reading random robots.txt files downloaded from the web, and the script reads everything as it should without skipping any lines. I'm using Windows 8.1 64-bit and the latest Python 3.6.5. Thank you.

I didn't have any issue doing this:

>>> with open('./mbox-short.txt', 'r') as f:
...     txt = f.read()
...
>>> print(txt.splitlines()[0])  # display the first line
From stephen.marquard@uct.ac.za Sat Jan  5 09:14:16 2008

So I'd suggest you slightly modify your code: first read() the text file, and then use splitlines() to iterate over the lines.

Perhaps something like this to just print the lines?

with open("./mbox-short.txt", "r") as ins:
    for line in ins:
        print(line)
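To check whether the loop itself really starts at the top of the file, it can help to number each line as it is read. The following is only a minimal sketch: the helper name numbered_lines and the temporary demo file are assumptions made for illustration, and the demo file stands in for mbox-short.txt.

```python
import os
import tempfile


def numbered_lines(path):
    """Return each line of the file prefixed with its 1-based line number."""
    result = []
    with open(path, "r") as handle:
        for number, line in enumerate(handle, start=1):
            # Drop the trailing newline so print() does not add a blank line
            result.append(f"{number}: " + line.rstrip("\n"))
    return result


# Small temporary file used as a stand-in for mbox-short.txt (assumption)
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("first\nsecond\nthird\n")
    demo_path = tmp.name

demo_result = numbered_lines(demo_path)
for entry in demo_result:
    print(entry)

os.remove(demo_path)
```

If the earliest line still visible in the terminal carries a number well above 1, the file was read from the start and the earlier output has simply scrolled out of view.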

You can read the file and pick out the first item with positive indexing, [0]. Use split() if you want the first word, or splitlines() if you want the first line:

with open('mbox-short.txt', 'r') as fh:
    f = fh.read()
print(f.split()[0])  # first word, using positive indexing

Output:

>>> print(f.split()[0])
From
>>>

Now we will find the first line:

with open('mbox-short.txt', 'r') as fh:
    f = fh.read()
print(f.splitlines()[0])  # first line, using positive indexing

Output:

>>> print(f.splitlines()[0])
From stephen.marquard@uct.ac.za Sat Jan  5 09:14:16 2008
>>>

Or you can also do it with readline():

with open('mbox-short.txt', 'r') as fh:
    f = fh.readline()
print(f, end='')  # readline() keeps the trailing newline

Output:

>>> f = open('mbox-short.txt', 'r').readline()
>>> print(f, end='')
From stephen.marquard@uct.ac.za Sat Jan  5 09:14:16 2008
>>>

Thanks!

@halfelf posted the right answer in the comments, so I'll copy it here:

I guess it's just that your cmd/powershell buffer can't hold the 1910 lines of that file, and the beginning lines have been scrolled off

I just increased the buffer size in the cmd properties and now it shows all lines. Thank you all for your answers, I appreciate it.
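One way to confirm that the script reads the whole file, independent of how much the console can display, is to print a short summary instead of every line. This is a minimal sketch only: the helper name summarize and the temporary demo file are assumptions for illustration, with the demo file built to have 1910 lines like the file discussed here.

```python
import os
import tempfile


def summarize(path):
    """Return (line_count, first_line) for the text file at path."""
    with open(path, "r") as handle:
        lines = handle.read().splitlines()
    first_line = lines[0] if lines else ""
    return len(lines), first_line


# Temporary stand-in for mbox-short.txt (assumption): 1910 lines in total
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("From someone@example.com\n")
    tmp.write("filler line\n" * 1909)
    demo_path = tmp.name

line_count, first_line = summarize(demo_path)
print(line_count, "lines; first line:", first_line)

os.remove(demo_path)
```

Another option that needs no code change is to redirect the output to a file from the command line, for example python script.py > out.txt (the script name here is a placeholder), and open out.txt in an editor.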

To get all the lines, you can do it like this:

with open('mbox-short.txt', 'r') as fh:
    t = fh.readlines()
for n in t:
    line = n.strip()  # remove the newline and surrounding whitespace
    print(line)
