
Python - Reading chunks and breaking when there are no more lines

This piece of code reads a large file line by line, processes each line, then ends the process when no new entries arrive:

import time

file = open('logFile.txt', 'r')
count = 0

while True:
    where = file.tell()
    line = file.readline()
    if not line:
        # No new line yet: count the empty read and stop after 10 of them
        count = count + 1
        if count >= 10:
            break
        time.sleep(1)
        file.seek(where)
    else:
        # process line
        pass

In my experience, reading line by line takes a very long time, so I tried to improve this code to read a chunk of lines each time:

import time
from itertools import islice

N = 100000
count = 0
with open('logFile.txt', 'r') as file:
    while True:
        where = file.tell()
        next_n_lines = list(islice(file, N)).__iter__()
        if not next_n_lines:
            count = count + 1
            if count >= 10:
                break
            time.sleep(1)
            file.seek(where)
        for line in next_n_lines:
            # process line
            pass

This works fine except for the ending: it doesn't end the process (break out of the while loop) even when there are no more lines in the file. Any suggestions?

The original code already reads large chunks of the file at a time; it just returns the data one line at a time. You've just added a redundant generator that takes a fixed number of lines at a time, using the file object's line-reading functionality.
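As a side note, not part of the original answer, here is a small check of why the `if not next_n_lines` test in the question never fires. An iterator object is truthy even when it has nothing left to yield, whereas an empty list is falsy. The name `empty_source` is a made-up stand-in for a file that has reached its end.

```python
from itertools import islice

# Hypothetical stand-in for a file object that is already at end-of-file.
empty_source = iter([])

# What the question builds: a list of the next lines, then an iterator over it.
as_list = list(islice(empty_source, 5))
as_iterator = as_list.__iter__()

print(bool(as_list))      # False: an empty list is falsy
print(bool(as_iterator))  # True: an iterator is truthy even when exhausted
```

Testing the list itself, rather than the iterator made from it, would let the empty-read branch run.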

With only a few exceptions, the best way to iterate over the lines in a file is as follows.

with open('filename.txt') as f:
    for line in f:
        ...

If you need to iterate over a preset number of lines at a time, then try the following:

from itertools import islice, chain

def to_chunks(iterable, chunksize):
    it = iter(iterable)
    while True:
        try:
            first = next(it)
        except StopIteration:
            # No items left. Since Python 3.7 (PEP 479) a StopIteration that
            # escapes a generator becomes a RuntimeError, so return explicitly.
            return
        rest = islice(it, chunksize-1)
        yield chain((first,), rest)


with open('filename.txt') as f:
    for chunk in to_chunks(f, 10):
        for line in chunk:
            ...
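
For the "stop when nothing new arrives" behaviour the question asks about, a minimal sketch could combine chunked reads with an idle counter. This is only one possible arrangement, not the answerer's code: the function name `follow_in_chunks` and its parameters `max_idle_polls` and `poll_interval` are invented for illustration, and it assumes the writer only ever appends complete lines to the file.

```python
import time
from itertools import islice


def follow_in_chunks(path, chunk_size=100000, max_idle_polls=10, poll_interval=1.0):
    """Yield lists of up to chunk_size lines from the file at path.

    Hypothetical helper: stops after max_idle_polls consecutive reads
    that return no lines, sleeping poll_interval seconds between them.
    """
    idle_polls = 0
    with open(path, 'r') as f:
        while True:
            # Keep the list itself so that emptiness can be tested directly.
            chunk = list(islice(f, chunk_size))
            if chunk:
                idle_polls = 0  # new data arrived, so reset the idle counter
                yield chunk
                continue
            idle_polls += 1
            if idle_polls >= max_idle_polls:
                break
            time.sleep(poll_interval)
```

It would be used as `for chunk in follow_in_chunks('logFile.txt'):` followed by an inner loop over `chunk`. Unlike the code in the question, the counter here resets whenever data arrives, so the loop ends only after ten empty polls in a row.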
