Use readline() from Python to read a specific line

When using readline() in Python, is it possible to specify which line to read? When I run the following code I get lines 1, 2, 3, but I would like to read lines 2, 6, 10.

def print_a_line(line, f):
    print f.readline(line)

current_file = open("file.txt")

for i in range(1, 12):
    if(i%4==2):
        print_a_line(i, current_file)

No, you can't use readline that way: its argument is a size limit on how much to read, not a line number. Instead, skip over the lines you don't want. You have to read through the file because you can't know ahead of time where to seek to in order to read a specific line (unless the newlines appear at regular offsets). You can use enumerate to keep track of which line you're on, so you only have to read the file once and can stop after the last line you care about.

with open('my_file') as f:
    for i, line in enumerate(f, start=1):
        if i > 12:
            break  # past the last line of interest
        if i % 4 == 2:  # lines 2, 6, 10
            print(i, line, end='')
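
When the wanted line numbers don't follow a regular pattern, the same single-pass idea can be wrapped in a helper. This is a minimal sketch, not part of the original answer; the name read_lines and its return shape are assumptions chosen for illustration.

```python
def read_lines(path, wanted):
    """Return {line_number: text} for the 1-based line numbers in `wanted`."""
    wanted = set(wanted)
    if not wanted:
        return {}
    last = max(wanted)
    found = {}
    with open(path) as f:
        for i, line in enumerate(f, start=1):
            if i in wanted:
                found[i] = line.rstrip("\n")
            if i >= last:
                break  # nothing after the largest wanted line is needed
    return found
```

Calling read_lines("my_file", [2, 6, 10]) reads the file once and stops at line 10; line numbers beyond the end of the file are simply absent from the result.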

If you know that every line has the same length in bytes, you can seek directly to the position of a given line instead of iterating over the lines.

line_len = 20  # bytes per line, including the newline

with open('my_file', 'rb') as f:
    for i in (2, 6, 10):  # 1-based line numbers
        f.seek((i - 1) * line_len)
        print(f.read(line_len).decode())
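
To see the offset arithmetic end to end, here is a small self-contained sketch. It assumes a demo file whose lines are all exactly 5 bytes (four digits plus a newline); the helper name read_fixed_line and the temporary file are illustrative only.

```python
import os
import tempfile

LINE_LEN = 5  # bytes per line, including the trailing newline (assumed fixed)

def read_fixed_line(path, n, line_len=LINE_LEN):
    """Return 1-based line n from a file whose lines are all line_len bytes."""
    with open(path, "rb") as f:
        f.seek((n - 1) * line_len)  # line n starts after n - 1 full lines
        return f.read(line_len).decode()

# Build a demo file in binary mode: b"0001\n", b"0002\n", ... (5 bytes each).
fd, demo_path = tempfile.mkstemp()
with os.fdopen(fd, "wb") as f:
    for n in range(1, 13):
        f.write(f"{n:04d}\n".encode())

fixed_lines = [read_fixed_line(demo_path, n) for n in (2, 6, 10)]
os.remove(demo_path)
print(fixed_lines)  # ['0002\n', '0006\n', '0010\n']
```

Opening the file in binary mode keeps the byte count honest: in text mode, newline translation or multi-byte characters could make the on-disk length differ from the length of the decoded string.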

You can use the consume recipe from the itertools documentation, which is one of the fastest ways to skip lines:

from itertools import islice
from collections import deque

def consume(iterator, n):
    "Advance the iterator n-steps ahead. If n is None, consume entirely."
    # Use functions that consume iterators at C speed.
    if n is None:
        # feed the entire iterator into a zero-length deque
        deque(iterator, maxlen=0)
    else:
        # advance to the empty slice starting at position n
        next(islice(iterator, n, n), None)

with open("in.txt") as f:
    l = []
    sm = 0
    for i in (2, 6, 10):
        i -= sm
        consume(f, i-1)
        l.append(next(f, ""))
        sm += i

We just need to subtract what we have already consumed so that the lines keep matching each i. You can put the code in a function and yield each line:

def get_lines(fle,*args):
    with open(fle) as f:
        consumed = 0
        for i in args:
            i -= consumed
            consume(f, i-1)
            yield next(f, "")
            consumed += i

To use it, just pass the filename and the line numbers:

test.txt:

1
2
3
4
5
6
7
8
9
10
11
12

Output:

In [4]: list(get_lines("test.txt",2, 6, 10))
Out[4]: ['2\n', '6\n', '10\n']
In [5]: list(get_lines("stderr.txt",3, 5, 12))
Out[5]: ['3\n', '5\n', '12']

If you only want a single line, you could also use linecache:

import linecache

linecache.getline("test.txt", 10)

with open('file.txt', 'r') as f:
    next(f)
    for line in f:
        print(line.rstrip('\n'))
        for skip in range(3):
            try:
                next(f)
            except StopIteration:
                break

File:

1
2
3
4
5
6
7
8
9
10

Result:

2
6
10

This works in a script or function, but the interactive shell echoes the skipped lines; to hide them there, assign the result of each next(f) call to a temporary variable.
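
For evenly spaced lines like these, itertools.islice can express the same skip pattern in one call. The following is a sketch rather than the answerer's code; the function name every_nth_line and its parameters are assumptions.

```python
from itertools import islice

def every_nth_line(path, first, step, last=None):
    """Yield lines first, first + step, ... (1-based), up to and including last."""
    with open(path) as f:
        # islice counts from 0, so line `first` is index first - 1.
        # The stop index is exclusive, so passing `last` keeps line `last`.
        yield from islice(f, first - 1, last, step)
```

With the ten-line file shown above, list(every_nth_line("file.txt", 2, 4, 10)) gives the lines 2, 6 and 10, each with its trailing newline if the file has one.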

Reading a file starts from the first character unless you seek to a known offset. The reader has no knowledge of the content, so it doesn't know where lines begin and end; readline just reads until it encounters a newline character. This is true for any language, not only Python. If you want to get the nth line, you can skip n-1 lines:

def my_readline(file_path, n):
    with open(file_path, "r") as file_handle:
        for _ in range(1, n):
            file_handle.readline()
        return file_handle.readline()

Note that this solution reopens the file and rereads it from the start on every call, which can seriously hurt performance if you call it many times.
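
If many lookups are needed on the same file, one possible way around that cost is to scan the file once, record the byte offset at which each line starts, and then seek directly. The sketch below is an illustration under assumptions (the function names are made up, the file is assumed not to change between indexing and reading, and the content is assumed to be UTF-8).

```python
def build_line_index(path):
    """Return a list where entry k is the byte offset at which line k + 1 starts."""
    offsets = []
    pos = 0
    with open(path, "rb") as f:  # binary mode so lengths are true byte counts
        for line in f:
            offsets.append(pos)
            pos += len(line)
    return offsets

def read_indexed_line(f, offsets, n):
    """Return 1-based line n from an open binary file using the offset index."""
    f.seek(offsets[n - 1])
    return f.readline().decode()  # assumes UTF-8 content
```

The index is built once with offsets = build_line_index("file.txt"); after that, a single open(path, "rb") handle can serve lines in any order, for example read_indexed_line(f, offsets, 10) followed by read_indexed_line(f, offsets, 2).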
