简体   繁体   中英

Read previous line in a file python

I need to get the value of the previous line in a file and compare it with the current line as I'm iterating through the file. The file is HUGE so I can't read it whole or randomly accessing a line number with linecache because the library function still reads the whole file into memory anyway.

EDIT I'm so sorry I forgot the mention that I have to read the file backwardly.

EDIT2

I have tried the following:

 f = open("filename", "r")
 for line in reversed(f.readlines()): # this doesn't work because there are too many lines to read into memory

 line = linecache.getline("filename", num_line) # this also doesn't work due to the same problem above. 

Just save the previous when you iterate to the next

prevLine = ""
for line in file:
    # do some work here
    prevLine = line

This will store the previous line in prevLine while you are looping

edit apparently OP needs to read this file backwards:

aaand after like an hour of research I failed multiple times to do it within memory constraints

Here you go Lim, that guy knows what he's doing, here is his best Idea:

General approach #2: Read the entire file, store position of lines

With this approach, you also read through the entire file once, but instead of storing the entire file (all the text) in memory, you only store the binary positions inside the file where each line started. You can store these positions in a similar data structure as the one storing the lines in the first approach.

Whever you want to read line X, you have to re-read the line from the file, starting at the position you stored for the start of that line.

Pros: Almost as easy to implement as the first approach Cons: can take a while to read large files

@Lim, here's how I would write it (reply to the comments)

def do_stuff_with_two_lines(previous_line, current_line):
    print "--------------"
    print previous_line
    print current_line

my_file = open('my_file.txt', 'r')

if my_file:
    current_line = my_file.readline()

for line in my_file:

    previous_line = current_line
    current_line = line

    do_stuff_with_two_lines(previous_line, current_line)

I'd write a simple generator for the task:

def pairwise(fname):
    with open(fname) as fin:
        prev = next(fin)
        for line in fin:
            yield prev,line
            prev = line

Or, you can use the pairwise recipe from itertools :

def pairwise(iterable):
    "s -> (s0,s1), (s1,s2), (s2, s3), ..."
    a, b = itertools.tee(iterable)
    next(b, None)
    return itertools.izip(a, b)

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM