
How to get a specific number of lines from a file using Python?

This is my code for getting 20 lines at a time, but f1.tell() returns the last position in the file, so I cannot get the next 20 lines the next time around. Can anyone help me do this, please?

f1=open("sample.txt","r")
last_pos=0
while true:
    f1.seek(last_pos)
    for line,i in enumerate(f1.readlines()):
        if line == 20:
            last_pos=f1.tell()
            break
        else:
            print i
The sample.txt file contains the following data:
1
2
3
4
.
.
.
.
40
I want output like this:
1
2
3
4
.
.
.
20
20
21
22
23
24
25
.
.
.
.
39
39
40

readlines reads the whole file, so you reach the end of it; that is why f1.tell() returns the end-of-file position.

A loop with readline works, though, and gives you the position of the end of the 20th line (or rather, the start of the 21st).

Code (I removed the infinite loop, by the way):

f1=open("sample.txt","r")
last_pos=0
line=0
while True:
    l=f1.readline()
    if l=="":
        break
    line+=1
    if line == 20:
        last_pos=f1.tell()
        print(last_pos)
        break
f1.close()

You could iterate with for i,l in enumerate(f1): but file iteration and tell are not compatible (you get: OSError: telling position disabled by next() call).
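A minimal sketch of that incompatibility, assuming Python 3 and a file opened in text mode. The temporary file and the names iteration_error and pos_after_first_line are only there for illustration:

```python
import os
import tempfile

# Create a small throwaway file so the example is self-contained.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False, newline="\n") as tmp:
    tmp.write("".join(f"{n}\n" for n in range(1, 41)))
    path = tmp.name

iteration_error = None
with open(path, "r") as f1:
    try:
        for i, l in enumerate(f1):
            f1.tell()  # text-mode tell() is disabled while iterating
    except OSError as exc:
        iteration_error = exc

with open(path, "r") as f1:
    f1.readline()
    pos_after_first_line = f1.tell()  # allowed after an explicit readline()

os.remove(path)
print(iteration_error)
print(pos_after_first_line)
```

The first block hits the OSError on the very first tell(), while the readline version reports a usable position.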

Then, to seek to a given position, just call f1.seek(last_pos).
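Putting readline, tell and seek together, one possible sketch for fetching 20 lines per call could look like this. The helper name read_batch and its parameters are hypothetical, and it assumes Python 3 text mode, where a value returned by tell() can be passed back to seek():

```python
import os
import tempfile

def read_batch(path, start_pos=0, count=20):
    """Return up to `count` lines starting at `start_pos`, plus the new position."""
    lines = []
    with open(path, "r") as f1:
        f1.seek(start_pos)
        for _ in range(count):
            l = f1.readline()
            if l == "":  # empty string means end of file
                break
            lines.append(l.rstrip("\n"))
        return lines, f1.tell()

# Demo on a throwaway file holding the numbers 1..40, one per line.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False, newline="\n") as tmp:
    tmp.write("".join(f"{n}\n" for n in range(1, 41)))
    sample_path = tmp.name

first, pos = read_batch(sample_path)        # lines 1..20
second, pos = read_batch(sample_path, pos)  # lines 21..40
third, pos = read_batch(sample_path, pos)   # nothing left
os.remove(sample_path)
```

Each call reopens the file, so the only state the caller has to keep between calls is the returned position.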

EDIT: if you need to print the line twice every 20 lines, you don't even need seek; just print the last line again when you have counted 20 lines.
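As a rough sketch of that seek-free idea (the generator name repeat_every and the sample list are made up for illustration), every 20th line is simply yielded twice:

```python
def repeat_every(lines, every=20):
    """Yield each line once, and every `every`-th line a second time."""
    for count, l in enumerate(lines, start=1):
        yield l
        if count % every == 0:
            yield l  # repeat the 20th, 40th, ... line

# Stand-in for the file contents: the numbers 1..40 as strings.
sample = [str(n) for n in range(1, 41)]
result = list(repeat_every(sample))
```

Passing an open file object instead of the list would work the same way, since the generator never calls tell() or seek().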

BUT if you really want to do it this way, here's how:

f1=open("sample.txt","r")
line=0
rew=False

while True:
    start_line_pos=f1.tell()
    l=f1.readline()
    if l=="":
        break
    print(l.strip())
    line+=1
    if rew:
        rew = False   # avoid re-printing the 20th line again and again
    elif line % 20 == 0:
        f1.seek(start_line_pos)
        rew = True
        line-=1   # pre-correct line counter
f1.close()

Note the bit of logic that avoids getting stuck on line 20. It works fine: on reaching line 20, 40, ... it seeks back to the previously stored file position and reads one line again. The rew flag prevents that from happening more than once.

Jean-François Fabre already explained that readlines reads the whole file.

But in any case you should never mix line-level input (readline) with byte-level input (tell, seek or read). Because the low-level OS system calls can only read a byte count, readline actually reads a buffer and searches it for a newline. The file pointer reported by tell is then positioned at the end of the buffer, not at the end of the line.

There is no way to set the file pointer at the end of a line except by processing the file one character at a time and detecting the end of line manually.
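A small sketch of that character-at-a-time approach, under the assumptions that the data is read in binary mode and lines end with "\n". The function name offset_after_lines is hypothetical, and an in-memory io.BytesIO stands in for a file opened with "rb":

```python
import io

def offset_after_lines(stream, n):
    """Read a binary stream one byte at a time and return the
    offset just past the n-th newline (or end of data if shorter)."""
    seen = 0
    while seen < n:
        ch = stream.read(1)
        if ch == b"":   # ran out of data before n lines
            break
        if ch == b"\n":  # manual end-of-line detection
            seen += 1
    return stream.tell()

# Stand-in for the file: the numbers 1..40, one per line.
data = io.BytesIO(b"".join(b"%d\n" % n for n in range(1, 41)))
pos = offset_after_lines(data, 20)
data.seek(pos)
next_line = data.readline()
```

Here pos is a plain byte offset, so seeking to it always lands on the start of the 21st line.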
