
Python 2.7: how to read only a few lines at a time from a file?

For example, I have a file with 2,000 lines, and I want to read 500 lines at a time, doing something with each batch of 500 before reading the next. Could anyone write some quick code for me to learn from? Thanks!

You could use a generator to group the lines together, and yield them in a way that is convenient to use in a simple for loop. This might get you started:

def chunks_of(iterable, chunk_size=500):
    out = []
    for item in iterable:
        out.append(item)
        if len(out) >= chunk_size:
            yield out
            out = []
    if out:
        yield out

You can then use this like:

with open('/path/to/file') as f:
    for chunk_of_lines in chunks_of(f, chunk_size=500):
        # chunk_of_lines is a list of 500 or fewer lines from the file
        process(chunk_of_lines)  # process() is a placeholder for your own code

(Why "500 or fewer"? Because the last chunk will be shorter than 500 lines if the number of lines in the file is not an exact multiple of 500.)
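To see the short final chunk on a small scale, here is a minimal sketch that runs the same generator over seven numbers instead of a file, with a chunk size of three. The sample input and the chunks variable are only for illustration.

```python
def chunks_of(iterable, chunk_size=500):
    out = []
    for item in iterable:
        out.append(item)
        if len(out) >= chunk_size:
            yield out
            out = []
    if out:
        yield out


# Seven items in chunks of three: two full chunks, then one leftover item.
chunks = list(chunks_of(range(7), chunk_size=3))
print(chunks)
```

The last list holds only the single leftover item, which is the same thing that happens to the last few lines of a file.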

Edit: Always check the docs first. Here's a recipe from the itertools docs:

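from itertools import izip_longest

def grouper(n, iterable, fillvalue=None):
    "grouper(3, 'ABCDEFG', 'x') --> ABC DEF Gxx"
    # n references to the *same* iterator, so zipping them consumes n items per group
    args = [iter(iterable)] * n
    return izip_longest(fillvalue=fillvalue, *args)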

This creates a list holding n references to a single iterator over the iterable (in this case, the file object). Because they all share the same underlying iterator, advancing one advances the rest as well, so zipping them yields consecutive groups of n items. izip_longest works like izip, but when the input runs out partway through a group it pads that group with fillvalue instead of dropping it. My chunks_of function, by contrast, simply yields a shorter final chunk.
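If the padding is unwanted, it can be filtered back out of each group. The following is only a sketch: line_groups is a hypothetical helper name, io.StringIO stands in for a real file, and the try/except import is an assumption added so the same code runs on Python 3, where izip_longest is called zip_longest.

```python
import io

try:
    from itertools import izip_longest  # Python 2
except ImportError:
    from itertools import zip_longest as izip_longest  # Python 3


def grouper(n, iterable, fillvalue=None):
    args = [iter(iterable)] * n
    return izip_longest(fillvalue=fillvalue, *args)


def line_groups(file_obj, n):
    """Yield lists of up to n lines, dropping the None padding."""
    for group in grouper(n, file_obj):
        # Lines read from a file are never None, so this only removes padding.
        yield [line for line in group if line is not None]


# io.StringIO stands in for a real file so the sketch runs on its own.
sample = io.StringIO(u"a\nb\nc\nd\ne\n")
groups = list(line_groups(sample, 2))
print(groups)
```

With five lines and n=2, the last group comes back with one line rather than a line plus None.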

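You could also use itertools.islice to read 500 lines at a time. Each call consumes up to the next 500 lines from the file object, so call it repeatedly until it returns nothing:

import itertools
lines = list(itertools.islice(file_obj, 500))  # empty list once the file is exhausted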
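A single islice call only consumes the next batch of lines, so covering the whole file means repeating it until nothing comes back. Here is a minimal sketch of that loop, assuming a hypothetical helper name read_in_chunks and using io.StringIO in place of a real file:

```python
import io
from itertools import islice


def read_in_chunks(file_obj, chunk_size=500):
    """Yield successive lists of up to chunk_size lines from file_obj."""
    while True:
        chunk = list(islice(file_obj, chunk_size))
        if not chunk:
            # islice returned nothing, so the file is exhausted.
            break
        yield chunk


# Seven sample lines read three at a time; io.StringIO stands in for a file.
sample = io.StringIO(u"".join(u"line %d\n" % i for i in range(7)))
sizes = [len(chunk) for chunk in read_in_chunks(sample, chunk_size=3)]
print(sizes)
```

Only one chunk is held in memory at a time, which is the point of reading the file in batches.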

Correct me if I'm wrong, but I think this very basic sample will work too:

linesToProceed = 500
with open(filename, 'r') as f:
    lines = []
    for line in f:
        lines.append(line)
        if len(lines) == linesToProceed:
            process(lines)  # placeholder: do something with these 500 lines
            lines = []
    if lines:
        # leftover lines when the total is not a multiple of 500
        process(lines)
