Python: how can I reset the index from enumerate to 0 while reading the same file again and again?

with open("...txt") as fp: 
    for i, line in enumerate(fp): 
        if some condition : 
            i=0
            fp.seek(0)

The text is huge (GBs of data), so I use enumerate. I need to process this file several thousand times, so I decided to open it only once for efficiency. However, although this code runs, i does not become 0; it just keeps incrementing. I need it to be zero because I need the line position i. It would be inefficient to let i grow to billions times several thousand and recover the position with modular arithmetic every time.

So my question is: how can I set i back to zero when I go back to the beginning of the file? Thanks in advance (I use Python 3.6).

You could always make your own resettable enumerator, but there are probably better ways to do what you really want to do.

Still, here's what a resettable enumerator looks like:

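def reset_enumerate(thing, start=0):
    # Like enumerate(), but the counter can be changed with send().
    x = start
    for t in thing:
        # Yield (index, item), the same order enumerate() uses.
        val = yield x, t
        if val is not None:
            # send(val) was called: the next item gets index val.
            x = val
        else:
            x += 1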

Then you would use it like this:

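r = reset_enumerate(range(10))
for i, num in r:
    print('i:', i, 'num:', num)
    if i == 5:
        # send(0) resets the counter and also advances the generator,
        # so it returns the next pair, which has to be handled here.
        i, num = r.send(0)
        print('i:', i, 'num:', num)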
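
One quirk of the send-based approach is that send() itself advances the generator, so the pair it returns has to be handled at the call site. A class-based variant avoids that. This is only a sketch: the name ResettableEnumerate and its reset() method are made up for illustration, and io.StringIO stands in for the real file.

```python
import io


class ResettableEnumerate:
    """Iterator yielding (index, item) pairs whose counter can be reset."""

    def __init__(self, iterable, start=0):
        self._it = iter(iterable)
        self._start = start
        self._next_index = start

    def __iter__(self):
        return self

    def __next__(self):
        item = next(self._it)  # StopIteration propagates at the end
        index = self._next_index
        self._next_index += 1
        return index, item

    def reset(self):
        # Only the counter is reset; rewinding the file is the caller's job.
        self._next_index = self._start


fp = io.StringIO("a\nb\nc\n")  # stand-in for the real file
counter = ResettableEnumerate(fp)
seen = []
rewinds = 0
for i, line in counter:
    seen.append((i, line.strip()))
    if line.strip() == "b" and rewinds < 1:  # hypothetical restart condition
        fp.seek(0)
        counter.reset()
        rewinds += 1
```

Because reset() does not consume an element, the for loop simply picks up the first line again with index 0 on its next iteration.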

Here is an example of how you can emulate a scenario like yours.

Assuming I have a file called input.txt with this data:

1
2
3

Code:

j = 0
with open('input.txt', 'r') as f:
    for k in f:
        # A break condition
        # If not we'll face an infinite loop
        if j > 4:
            break
        if k.strip() == '2':
            f.seek(0)
            print("Return to position 0")
            # Don't forget to increment j 
            # Otherwise, we'll end up with an infinite loop
            j += 1
        print(k.strip())

Will output:

1
Return to position 0
2
1
Return to position 0
2
1
Return to position 0
2
1
Return to position 0
2
1
Return to position 0
2
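
The example above rewinds the file but does not track the line position the question asks for. A plain counter kept next to the loop can be reset together with the seek. The following is a minimal sketch, with io.StringIO standing in for input.txt and the rewind limit of 2 chosen arbitrarily:

```python
import io

f = io.StringIO("1\n2\n3\n")  # stand-in for open('input.txt', 'r')
i = 0          # line index within the current pass
rewinds = 0    # guards against an infinite loop
records = []
for line in f:
    records.append((i, line.strip()))
    if line.strip() == "2" and rewinds < 2:
        f.seek(0)
        rewinds += 1
        i = 0      # the next line read is line 0 again
        continue   # skip the increment below
    i += 1
```

Here records ends up holding the (index, line) pairs of each pass, with the index starting from 0 after every rewind.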

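As stated in the comments, enumerate returns an iterator that keeps its own internal counter. Assigning i = 0 inside the loop only rebinds the loop variable; the next iteration overwrites it with the iterator's next count, and that counter cannot be changed from outside. This is why you can't just "reset" it. PEP 279, which introduced enumerate, explains how it works.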
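
Since the counter lives inside the enumerate object, one way around this is to create a fresh enumerate after every seek. The sketch below assumes a restart condition and a restart limit made up for the example, and again uses io.StringIO in place of the real file:

```python
import io

fp = io.StringIO("1\n2\n3\n")  # stand-in for the real file
restarts_left = 2              # arbitrary limit so the example terminates
visited = []
while True:
    fp.seek(0)
    # A new enumerate object is created here, so i starts at 0 again.
    for i, line in enumerate(fp):
        visited.append((i, line.strip()))
        if line.strip() == "2" and restarts_left > 0:
            restarts_left -= 1
            break  # leave the inner loop; the while loop rewinds the file
    else:
        break      # reached the end of the file without a restart
```

The file is still opened only once; only the cheap enumerate wrapper is rebuilt on each pass.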

Furthermore, as also indicated in the comments, this post provides the typical way to handle large files.
