
Printing to the penultimate line of a file

I am wondering if there is a simple way to get to the penultimate line of an open file. f.seek is giving me no end of trouble. I can easily get to the final line, but I can't figure out how to get to the line above that.

Assuming the file isn't too large and memory isn't a concern:

open('file.txt').readlines()[-2]
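
If memory does become a concern, one possible variation is to stream the file and keep only the last two lines. This is a minimal sketch; the function name `penultimate_line` and the choice to return None for short files are assumptions made for illustration, not part of the original answer.

```python
from collections import deque


def penultimate_line(path):
    """Return the second-to-last line of the file, or None if it has fewer than two lines."""
    with open(path) as f:
        # A deque with maxlen=2 discards older lines as new ones arrive,
        # so at most two lines are held in memory at any time.
        tail = deque(f, maxlen=2)
    return tail[0] if len(tail) == 2 else None
```

This still reads the whole file once, but it avoids building a list of every line.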

You can seek backward from the end of the file and count the newlines you encounter; as soon as you hit the second '\n', stop and call readline():

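import os

with open('foo.txt', 'rb') as f:   # binary mode: Python 3 only allows end-relative seeks there
    size = f.seek(0, os.SEEK_END)
    pos = size - 1
    newlines = 0
    while pos >= 0:                # stop at the start of the file instead of looping forever
        f.seek(pos)
        # a newline that terminates the file does not separate two lines, so skip it
        if f.read(1) == b'\n' and pos != size - 1:
            newlines += 1
            if newlines == 2:
                break
        pos -= 1
    f.seek(pos + 1)                # first byte of the penultimate line
    print(repr(f.readline().decode()))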

For a file like:

first line
second line
third line
fourth line
I want this line
last line

The output will be:

'I want this line\n'

Files are a single long string of bytes on most systems (some have forks, extents or records), leaving the concept of lines to a higher level. Complicating matters further, line endings do not look the same on all platforms. This means you have to read the lines to identify them, and for text files specifically, you can only seek() to positions you obtained from tell().
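
To make the seek()/tell() pairing concrete, here is a small sketch that records the offset of every line start and then jumps back to the penultimate one. The helper names `line_offsets` and `read_penultimate` are hypothetical. Note that it uses readline() rather than `for line in f`, because in Python 3 calling tell() during iteration raises OSError.

```python
def line_offsets(f):
    """Return the tell() position of the start of each line in an open text file."""
    offsets = []
    f.seek(0)
    while True:
        pos = f.tell()           # position before reading the next line
        if not f.readline():     # empty string means end of file
            break
        offsets.append(pos)
    return offsets


def read_penultimate(path):
    """Return the second-to-last line, or None if the file has fewer than two lines."""
    with open(path) as f:
        offsets = line_offsets(f)
        if len(offsets) < 2:
            return None
        f.seek(offsets[-2])      # only seek to a value that tell() handed out
        return f.readline()
```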

If we're just reading the penultimate line, it's simple:

alllines=fileobject.readlines()
penultimateline=alllines[-2]

That approach loads the entire file into memory. If we want to replace the end of the file, starting with the penultimate line, things get hairier:

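# fileobject must be open for both reading and writing, e.g. open(path, 'r+')
pen, last = 0, 0
while True:
  pos = fileobject.tell()   # offset where the next line starts
  line = fileobject.readline()
  if not line:
    break
  pen, last = last, pos     # start offsets of the two most recent lines
# back up to the penultimate line
fileobject.seek(pen)    # Note: seek is *required* to switch read/write
fileobject.truncate()
fileobject.write("Ate last two lines, this is a new line.\n")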

If you merely want to read lines in an arbitrary order, linecache might be helpful.
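
As a rough illustration of the linecache route, the sketch below looks up the penultimate line by number. The function name `penultimate_with_linecache` is an assumption for this example. Keep in mind that linecache was designed for reading Python source files, it still loads and caches the whole file, and linecache.checkcache() may be needed if the file changes on disk.

```python
import linecache


def penultimate_with_linecache(path):
    """Return the second-to-last line using linecache, or None for short files."""
    lines = linecache.getlines(path)   # reads and caches the entire file
    if len(lines) < 2:
        return None
    # linecache line numbers start at 1, so the penultimate of n lines is n - 1
    return linecache.getline(path, len(lines) - 1)
```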

Each of these scans the entire file. Tools like tail may make another optimization: read data near the end of the file until you've found enough newlines to identify the lines you need. This gets more complicated because seeking only works predictably in binary mode, while line parsing only works predictably in text mode. That in turn means our guess that the file is separated by os.linesep could be wrong; Python's universal newline support only operates in text mode.

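import os

# fileobject must be opened in binary mode ('rb'); text-mode files
# do not support seeking relative to the end in Python 3.
size = fileobject.seek(0, os.SEEK_END)
backsearch = 0
lines = []
# Four pieces guarantee two complete lines even when the first piece is
# partial and the last piece is the empty string after a trailing newline.
while len(lines) < 4 and backsearch < size:
  backsearch = min(backsearch + 200, size)   # never seek before the start of the file
  fileobject.seek(-backsearch, os.SEEK_END)
  # assumption: lines end with os.linesep, which may be wrong for this file
  lines = fileobject.read().split(os.linesep.encode())
fileobject.seek(-backsearch, os.SEEK_END)
# Now repeat the earlier method, knowing you're only processing
# the final part of the file.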
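
For completeness, here is one way the tail-style idea could be finished as a self-contained function. It is a sketch under stated assumptions, not how tail itself is implemented: the names `last_lines` and `penultimate_from_end`, the 4096-byte default block size, and the decision to return bytes are all choices made for this example. It counts b'\n' bytes, so files using '\n' or '\r\n' endings are handled; decoding is left to the caller.

```python
import os


def last_lines(path, count=2, block_size=4096):
    """Return the last `count` lines of a file as bytes, reading blocks from the end."""
    with open(path, 'rb') as f:
        pos = f.seek(0, os.SEEK_END)
        data = b''
        # count + 1 newlines are enough: one per wanted line (the final one may
        # be a trailing newline) plus one proving the earliest wanted line is whole.
        while pos > 0 and data.count(b'\n') <= count:
            step = min(block_size, pos)
            pos -= step
            f.seek(pos)
            data = f.read(step) + data
    return data.splitlines(keepends=True)[-count:]


def penultimate_from_end(path):
    """Return the second-to-last line as bytes, or None if there are fewer than two lines."""
    lines = last_lines(path, 2)
    return lines[0] if len(lines) == 2 else None
```

For large files this touches only the last few blocks instead of scanning from the beginning.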

Another short option, which also reads the whole file into memory:

def penultimate(file_path):
    # read the file once and close it promptly
    with open(file_path) as f:
        return f.read().splitlines()[-2]
