Printing to the penultimate line of a file

I am wondering if there is a simple way to get to the penultimate line of an open file. f.seek is giving me no end of trouble. I can easily get to the final line, but I can't figure out how to get to the line above that.

Assuming the file isn't too big and memory isn't a concern:

open('file.txt').readlines()[-2]
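If memory does matter, one possible variation (a minimal sketch, not part of the answer above) is to stream the file and keep only the last two lines with collections.deque. The helper name penultimate_line is hypothetical, and returning None for files with fewer than two lines is an assumption made for the example.

```python
from collections import deque


def penultimate_line(path):
    """Return the second-to-last line of a text file, or None if there is none."""
    with open(path) as f:
        # A deque with maxlen=2 discards older lines as the file is iterated,
        # so only two lines are held in memory at any time.
        last_two = deque(f, maxlen=2)
    return last_two[0] if len(last_two) == 2 else None
```

This still reads the whole file once, but it never holds more than two lines at a time.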

You can seek from the end of the file and count the newlines you encounter; as soon as you hit the second '\n', stop and call readline():

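with open('foo.txt', 'rb') as f:    # binary mode: Python 3 text files can't seek relative to the end
    end_count = 0
    n = -1
    while end_count != 2:
        f.seek(n, 2)                # whence=2: offset is relative to the end of the file
        if f.read(1) == b'\n':
            end_count += 1
        n -= 1
    # The file position is now just past the second newline from the end.
    print(repr(f.readline().decode()))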

For a file like this (with no newline after the last line):

first line
second line
third line
fourth line
I want this line
last line

The output will be:

'I want this line\n'

Files are a single long string of bytes on most systems (some have forks, extents or records), leaving the concept of lines to a higher level. Complicating matters further, line endings differ between platforms. This means you have to read the lines to identify them, and specifically for text files you can only seek() to positions you obtained from tell().

If we're just reading the penultimate line, it's simple:

alllines=fileobject.readlines()
penultimateline=alllines[-2]

That approach loads the entire file into memory. If we want to replace the end of the file, starting with the penultimate line, things get hairier:

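# fileobject must be opened for reading and writing, e.g. open(path, 'r+')
pen, last = 0, 0
while True:
  pos = fileobject.tell()    # offset where the next line starts
  line = fileobject.readline()
  if not line:
    break
  pen, last = last, pos      # pen trails one line behind last
# back up to the penultimate line
fileobject.seek(pen)    # Note: seek is *required* to switch read/write
fileobject.truncate()
fileobject.write("Ate last two lines, this is a new line.\n")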

If you merely want to read lines in an arbitrary order, linecache might be helpful.
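As a rough illustration of the linecache suggestion: linecache.getline(filename, lineno) returns a line by its 1-based number and caches the file's lines in memory, so the first call still reads the whole file. The temporary file and its contents below are made up for the example.

```python
import linecache
import os
import tempfile

# Build a small throwaway file so the example is self-contained.
with tempfile.NamedTemporaryFile('w', delete=False, suffix='.txt') as tmp:
    tmp.write("first line\nsecond line\nthird line\n")

# Line numbers are 1-based, and lines can be requested in any order.
third = linecache.getline(tmp.name, 3)
first = linecache.getline(tmp.name, 1)
missing = linecache.getline(tmp.name, 99)  # out of range: returns '' instead of raising

linecache.clearcache()  # drop the cached lines before deleting the file
os.remove(tmp.name)
```

Note that getline needs a line number, so on its own it does not tell you which line is the penultimate one; you still need the line count from somewhere.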

Each of these scans the entire file. Tools like tail can make another optimization: read data near the end of the file until you've found enough newlines to identify the lines you need. This gets more complicated because seeking only works predictably in binary mode, while line parsing only works predictably in text mode. That in turn means our guess that the lines are separated by os.linesep could be wrong; Python's universal newline support only operates in text mode.

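import os

# fileobject must be opened in binary mode ('rb') to seek relative to the end
fileobject.seek(0, os.SEEK_END)
filesize = fileobject.tell()
backsearch = 0
lines = []
# stop once enough lines are found or the whole file has been scanned backwards
while len(lines) <= 2 and backsearch < filesize:
  backsearch = min(backsearch + 200, filesize)   # never seek before the start
  fileobject.seek(-backsearch, os.SEEK_END)
  lines = fileobject.read().split(os.linesep.encode())
fileobject.seek(-backsearch, os.SEEK_END)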
# Now repeat the earlier method, knowing you're only processing
# the final part of the file.

def penultimate(file_path):
    with open(file_path) as f:
        return f.read().splitlines()[-2]
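The tail-style idea described earlier (read backwards from the end in binary mode until enough newlines have been seen) could be sketched as a self-contained function. This is only an illustration under stated assumptions: the name penultimate_line_from_end and the chunk_size parameter are hypothetical, the result is returned as bytes without its line ending, and lines are assumed to end in '\n' or '\r\n'.

```python
import os


def penultimate_line_from_end(path, chunk_size=200):
    """Return the second-to-last line as bytes (no line ending), or None."""
    with open(path, 'rb') as f:  # binary mode: arbitrary seeks are allowed
        f.seek(0, os.SEEK_END)
        file_size = f.tell()
        read_back = 0
        data = b''
        # Grow the window until it holds three newlines (enough even when
        # the file ends with one) or until it covers the whole file.
        while read_back < file_size and data.count(b'\n') < 3:
            read_back = min(read_back + chunk_size, file_size)
            f.seek(file_size - read_back)
            data = f.read(read_back)
    lines = data.splitlines()
    return lines[-2] if len(lines) >= 2 else None
```

For simplicity the window is re-read from scratch each time it grows; a more careful version would read only the new chunk and prepend it. Decode the result with the file's encoding if you need a str.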
