
How to print the first n lines of file?

I'm sure I'm missing something obvious, and this has probably been asked before, but I can't seem to find the right combination of keywords to get an answer.

How can I write out the first n lines of a file (in effect, the opposite of file.readlines()[0:10] )?

E.g., I have a function that takes an input file and needs to process information from the latter part, throwing out the header. However, I want to keep the multi-line header so it can be put back into an output file.

def readInfile(infile):

    with open(infile, 'r') as ifh:
        # Skip extra info at top of file
        header = ifh.readline()[0:10] # Keep the header for later?

        noheader = ifh.readlines()[11:]
        for line in noheader:
            # Do the useful stuff
            usefulstuff = foo()

    return usefulstuff, header

Then later I want to write out in the format of the input file, using their header:

print(header)
for thing in usefulstuff:
   print(thing)

Is there a method I'm missing, or is readlines no good for this as it returns a list?

I assumed

for line in header:
     print(line)

would work, but it doesn't seem to in this case - so I must be doing something wrong?

EDIT

Why does trying to use readlines()[] twice fail for the second range?

I fixed the code as @pbuck pointed out (the header line should have used readlines(), not readline()), but now the noheader variable is empty. Do I really have to open the file twice?!

Careful there: readline() returns a string, so ifh.readline()[0:10] gives you the first 10 characters of the first line, not the first 10 lines. The following ifh.readlines()[11:] then reads the rest of the file and drops the first 11 of those remaining lines.

What you could do is use loops like so:

header = ""
for i in range(10):
  header += ifh.readline()

Or as @pbuck suggests in their comment, use readlines() (note the s), which returns a list containing each line in your file, which looks more like what you were trying to do.
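On the follow-up in the question's edit: a second readlines() call comes back empty because the first call has already read the file to the end, so there is nothing left to return. A minimal sketch of reading once and slicing twice, assuming a 10-line header (split_header and the throwaway demo file are made up for illustration):

```python
import os
import tempfile

def split_header(path, n_header=10):
    """Read the file once, then slice the list into header and body."""
    with open(path, "r") as ifh:
        lines = ifh.readlines()        # reads everything up to end of file
        second_read = ifh.readlines()  # position is already at EOF, so this is []
    return lines[:n_header], lines[n_header:], second_read

# Small demo on a throwaway 15-line file.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.writelines("line %d\n" % i for i in range(15))
    demo_path = tmp.name

header, body, second_read = split_header(demo_path)
print(len(header), len(body), second_read)  # prints: 10 5 []
os.remove(demo_path)
```

So there is no need to open the file twice; keeping the list from the single readlines() call is enough.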

Literally: read the first n lines, then stop.

def read_first_lines(filename, limit):
  result = []
  with open(filename, 'r') as input_file:
    # files are iterable, you can have a for-loop over a file.
    for line_number, line in enumerate(input_file):
      if line_number >= limit:  # line_number starts at 0, so this keeps exactly `limit` lines.
        break
      result.append(line)
  return result
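The same early-stopping idea can probably be written more compactly with itertools.islice. Since a file object remembers its position, iterating over it again afterwards picks up right after the header. A rough sketch, where read_header_and_body is a hypothetical name and io.StringIO stands in for a real open file:

```python
import io
from itertools import islice

def read_header_and_body(handle, n_header):
    """Split an open, iterable file handle into header and body lines."""
    header = list(islice(handle, n_header))  # takes at most n_header lines
    body = list(handle)                      # continues where islice stopped
    return header, body

# io.StringIO behaves like an open text file for this purpose.
sample = io.StringIO("h1\nh2\nh3\ndata1\ndata2\n")
header, body = read_header_and_body(sample, 3)
print(header)  # ['h1\n', 'h2\n', 'h3\n']
print(body)    # ['data1\n', 'data2\n']
```

With a real file you would pass the handle from `with open(filename) as handle:` instead of the StringIO object.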

There aren't two readlines() calls. Initially you call readline(), which reads a single line from the file. Next you call readlines() and, with [11:], discard the first 11 lines of the list it returns.

This would be a better way to do it:

def foo(lines):
    return ['foo: ' + line for line in lines]

def readInfile(infile):
    with open(infile, 'r') as ifh:
        lines = ifh.read().splitlines(False)  # read in the whole file, separate into lines
        header = lines[:10]
        usefulstuff = foo(lines[10:])

        return usefulstuff, header

usefulstuff, header = readInfile('name_of_file.txt')

for line in header:
    print(line)

for line in usefulstuff:
    print(line)
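If the goal is an output file rather than the console, the same two lists can be written back out. A rough sketch, assuming the lines came from splitlines() and so have lost their line endings (write_outfile and the demo data are made up for illustration):

```python
import os
import tempfile

def write_outfile(outfile, header, usefulstuff):
    """Write the kept header first, then the processed lines."""
    with open(outfile, "w") as ofh:
        # splitlines() dropped the line endings, so add them back here
        ofh.writelines(line + "\n" for line in header)
        ofh.writelines(line + "\n" for line in usefulstuff)

# Demo with made-up data standing in for the real header and results.
demo_header = ["# title", "# columns: a b"]
demo_useful = ["foo: 1 2", "foo: 3 4"]
out_path = os.path.join(tempfile.mkdtemp(), "out.txt")
write_outfile(out_path, demo_header, demo_useful)

with open(out_path) as check:
    written = check.read()
print(written)
```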

I have checked your solution and it seems you are on the right track. Consider this alternative using Python's mmap module ( https://docs.python.org/2/library/mmap.html ), which lets you treat the file like a string as well as like a file object. Here is my solution:

import mmap

def main(offset):
    with open("pks.txt", "r+b") as fd:
        try:
            # Read the header lines to skip (bytes, because of binary mode)
            skip = fd.readlines()[0:offset]
            # Total byte length of the header, i.e. where the body starts
            header_size = sum(len(x) for x in skip)
            rfile = mmap.mmap(fd.fileno(), 0)
            rfile.seek(header_size)
            print("Header: %s" % skip)
            print("Other lines:")
            line = rfile.readline()
            usefulStuff = list()
            while len(line) > 0:
                usefulStuff.append(line.rstrip())  # Remove the trailing newline
                line = rfile.readline()
            return usefulStuff, skip
        except (TypeError, ValueError) as e:
            # TypeError: offset is not an integer
            # ValueError: the file is empty and cannot be memory-mapped
            print("Error: %s" % str(e))
    return None, None

if __name__ == '__main__':
    footer, header = main(3)
    print("Header: %s\nFooter: %s" % (header, footer))
