
Python - Reading chunks and breaking when there are no more lines

This piece of code reads a large file line by line, processes each line, and ends the process once no new entries appear:

import time

file = open('logFile.txt', 'r')
count = 0

while True:
    where = file.tell()
    line = file.readline()
    if not line:
        # No new data: retry up to 10 times, one second apart
        count = count + 1
        if count >= 10:
            break
        time.sleep(1)
        file.seek(where)
    else:
        pass  # process line

In my experience, reading line by line takes a very long time, so I tried to improve the code by reading a chunk of lines at a time:

import time
from itertools import islice

N = 100000
count = 0
with open('logFile.txt', 'r') as file:
    while True:
        where = file.tell()
        next_n_lines = list(islice(file, N)).__iter__()
        if not next_n_lines:
            count = count + 1
            if count >= 10:
                break
            time.sleep(1)
            file.seek(where)
        for line in next_n_lines:
            pass  # process line

This works fine except for the ending: it doesn't end the process (break out of the while loop) even when there are no more lines in the file. Any suggestions?

The original code already reads the file in large chunks: the file object buffers its reads and simply hands the data back one line at a time. You've only added a redundant iterator layer that takes N lines at a time, still using the line-reading functionality of the file object.
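
As for why the chunked loop in the question never breaks: list(islice(file, N)).__iter__() returns an iterator object, and iterator objects are truthy even when they have nothing left to yield, so the "if not next_n_lines" check is never true. A minimal demonstration, using an empty in-memory iterator (empty_source, a stand-in introduced here) in place of an exhausted file:

```python
from itertools import islice

# Stand-in for a file that has no more lines to read (hypothetical name).
empty_source = iter([])

as_list = list(islice(empty_source, 5))  # what islice produced: nothing
as_iterator = as_list.__iter__()         # what the question's code tests

print(bool(as_list))      # an empty list is falsy
print(bool(as_iterator))  # an iterator object is truthy, even when exhausted
```

Dropping the .__iter__() call, so that next_n_lines stays a list, makes the emptiness check behave as intended.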

With only a few exceptions, the best way to iterate over the lines in a file is as follows.

with open('filename.txt') as f:
    for line in f:
        ...

If you need to iterate over a fixed number of lines at a time, then try the following:

from itertools import islice, chain

def to_chunks(iterable, chunksize):
    it = iter(iterable)
    while True:
        try:
            first = next(it)
        except StopIteration:
            # No items left. Since Python 3.7 (PEP 479) a StopIteration that
            # escapes a generator body becomes RuntimeError, so return instead.
            return
        rest = islice(it, chunksize-1)
        yield chain((first,), rest)


with open('filename.txt') as f:
    for chunk in to_chunks(f, 10):
        for line in chunk:
            ...
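
If the goal is the question's behaviour (process lines in batches and stop once the file has stayed quiet for a while), one possible sketch is below. The function name follow_in_chunks and its parameters are made up for illustration. It uses readline() rather than mixing iteration with tell()/seek(), and it resets the idle counter whenever new lines arrive, which is an assumption about the intent.

```python
import time


def follow_in_chunks(path, chunk_size=1000, max_idle_polls=10, poll_interval=1.0):
    """Yield lists of up to chunk_size lines from path.

    Stops after max_idle_polls consecutive polls that find no new lines.
    All names and defaults here are illustrative, not from the original post.
    """
    idle_polls = 0
    with open(path, 'r') as f:
        while True:
            chunk = []
            while len(chunk) < chunk_size:
                line = f.readline()
                if not line:  # nothing more to read right now
                    break
                chunk.append(line)

            if chunk:
                idle_polls = 0  # new data arrived, so reset the idle counter
                yield chunk
            else:
                idle_polls += 1
                if idle_polls >= max_idle_polls:
                    return  # the file stayed quiet long enough: stop
                time.sleep(poll_interval)
```

It would be used as "for chunk in follow_in_chunks('logFile.txt'):" with an inner loop over the lines in chunk. Note that if the writer is caught mid-line, readline() can return a partial line without a trailing newline; this sketch does not handle that case.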


 