
Read the previous line of a file in Python

I need to get the value of the previous line in a file and compare it with the current line as I iterate through the file. The file is HUGE, so I can't read it whole, and I can't randomly access a line by number with linecache either, because that library function still reads the whole file into memory.

EDIT: I'm so sorry, I forgot to mention that I have to read the file backwards.

EDIT2:

I have tried the following:

import linecache

f = open("filename", "r")
# This doesn't work: there are too many lines to read into memory.
for line in reversed(f.readlines()):
    pass

# This also doesn't work, due to the same problem as above.
line = linecache.getline("filename", num_line)

Just save the previous line when you iterate to the next one:

prevLine = ""
for line in file:
    # do some work here
    prevLine = line

This will store the previous line in prevLine while you are looping.

Edit: apparently the OP needs to read this file backwards.

After about an hour of research, I failed multiple times to do it within the memory constraints.

Here you go, Lim. That guy knows what he's doing, and here is his best idea:

General approach #2: Read the entire file, store the position of each line

With this approach, you also read through the entire file once, but instead of storing the entire file (all the text) in memory, you only store the binary positions inside the file where each line starts. You can store these positions in a data structure similar to the one that stores the lines in the first approach.

Whenever you want to read line X, you have to re-read the line from the file, starting at the position you stored for the start of that line.

Pros: Almost as easy to implement as the first approach
Cons: Can take a while to read large files
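A minimal sketch of the approach quoted above, applied to reading backwards. The function names (index_line_offsets, iter_lines_backward, backward_pairs) are made up for illustration and do not come from the quoted answer. It assumes the file is opened in binary mode so that positions are plain byte offsets, and it keeps the offsets in an array of 64-bit integers, so memory use is roughly 8 bytes per line rather than the size of the text.

```python
from array import array


def index_line_offsets(path):
    """Return an array with the byte offset at which each line starts."""
    offsets = array("q")  # signed 64-bit integers, 8 bytes per line
    with open(path, "rb") as f:
        pos = 0
        for line in f:
            offsets.append(pos)
            pos += len(line)  # next line starts right after this one
    return offsets


def iter_lines_backward(path, offsets):
    """Yield the lines of the file from last to first, one at a time."""
    with open(path, "rb") as f:
        for start in reversed(offsets):
            f.seek(start)
            yield f.readline()  # bytes, including the trailing newline if any


def backward_pairs(path):
    """Yield (previous_line, current_line) while walking the file backwards."""
    lines = iter_lines_backward(path, index_line_offsets(path))
    prev = next(lines, None)
    if prev is None:  # empty file: nothing to compare
        return
    for line in lines:
        yield prev, line
        prev = line
```

Here "previous" means the line that was visited just before the current one, which in a backward pass is the line that follows it in the file. The lines come back as bytes; call .decode() on them if you need text. The indexing pass still has to read the whole file once, which is the cost named in the cons above.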

@Lim, here's how I would write it (in reply to the comments):

def do_stuff_with_two_lines(previous_line, current_line):
    print("--------------")
    print(previous_line)
    print(current_line)

with open('my_file.txt', 'r') as my_file:
    # Prime the loop with the first line; readline() returns '' for an empty file.
    current_line = my_file.readline()

    for line in my_file:
        previous_line = current_line
        current_line = line

        do_stuff_with_two_lines(previous_line, current_line)

I'd write a simple generator for the task:

def pairwise(fname):
    with open(fname) as fin:
        # The default keeps StopIteration from escaping on an empty file.
        prev = next(fin, None)
        for line in fin:
            yield prev, line
            prev = line

Or, you can use the pairwise recipe from itertools:

import itertools

def pairwise(iterable):
    "s -> (s0,s1), (s1,s2), (s2, s3), ..."
    a, b = itertools.tee(iterable)
    next(b, None)
    return zip(a, b)  # use itertools.izip on Python 2
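For illustration, here is a small sketch of how a pairwise helper could be used to compare each line with the one before it. On Python 3.10 and later, itertools.pairwise is part of the standard library, so the recipe is only needed as a fallback. The function consecutive_duplicates and the in-memory sample are made up for this example; a real file object opened with open() can be passed in the same way, since pairwise only ever holds two lines at a time.

```python
import io
import itertools

try:
    from itertools import pairwise  # available on Python 3.10+
except ImportError:
    def pairwise(iterable):
        # Fallback: the itertools recipe, written for Python 3.
        a, b = itertools.tee(iterable)
        next(b, None)
        return zip(a, b)


def consecutive_duplicates(lines):
    """Yield each line that is identical to the line right before it."""
    for previous, current in pairwise(lines):
        if current == previous:
            yield current


# io.StringIO stands in for an open text file here.
sample = io.StringIO("a\na\nb\nc\nc\n")
duplicates = list(consecutive_duplicates(sample))
```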
