
Deleting elements of a python list during iteration

I have a very large list, and I need to perform many operations on each of its elements. Essentially, each element of the list is appended to in various ways and then used to generate an object. These objects are then used to generate another list.

Unfortunately, doing this in a naive way takes up all available memory.

I would therefore like to do the following:

for a in b:
    # Do many things with a
    c.append(C(modified_a))
    b[b.index(a)] = None # < Herein lies the rub

This seems to violate the idea that a list should not be modified during iteration. Is there a better way to do this kind of manual garbage collecting?

This shouldn't be a problem, since you're just assigning new values to list elements, not really deleting them.

But instead of searching for a with the index method, you should probably use enumerate.

See also here: http://unspecified.wordpress.com/2009/02/12/thou-shalt-not-modify-a-list-during-iteration/ "Firstly, let me be clear that in this article, when I say “modify”, I mean inserting or removing items from the list. Merely updating or mutating the list items is fine."
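A minimal sketch of that suggestion, assuming a placeholder `process` function that stands in for the real work (it is not from the question). `enumerate` supplies the index directly, so no search is needed, and each slot is overwritten rather than removed, so the list keeps its length during the loop.

```python
def process(item):
    # Hypothetical stand-in for "do many things with a".
    return item * 2

b = [3, 1, 3, 7]
c = []
for i, a in enumerate(b):
    c.append(process(a))
    b[i] = None  # drop the list's reference; the list keeps its length

print(c)  # [6, 2, 6, 14]
print(b)  # [None, None, None, None]
```

Note that an element is only freed once nothing else refers to it, including the loop variable `a`, which is rebound on the next iteration.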

Your best bet is a generator:

def gen(b):
    for a in b:
        # Do many things with a
        yield a

Done properly, this requires no additional memory, because the results are produced one at a time instead of being collected into a second list.
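One possible way to combine a generator with the memory goal from the question is to have it remove each element from the source list as it is handed out. The sketch below is only an illustration under that assumption; `consume` and `build` are hypothetical names, and `build` stands in for constructing `C(modified_a)`.

```python
def consume(items):
    """Yield items in their original order, removing each from the list."""
    items.reverse()        # so pop() from the end returns the original first item
    while items:
        yield items.pop()  # O(1); the list no longer references the item

def build(item):
    # Hypothetical stand-in for constructing C(modified_a).
    return str(item).upper()

b = ["x", "y", "z"]
c = [build(a) for a in consume(b)]

print(c)  # ['X', 'Y', 'Z']
print(b)  # []
```

Popping from the end avoids the linear cost of `pop(0)`; the price is that `b` is emptied, so this only fits when the original list is no longer needed.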

There are several issues with your code.

First, assigning None to a list element does not delete it:

>>> l=[1,2,3,4,5,6,6,7,8,9]
>>> len(l)
10
>>> l[l.index(5)]=None
>>> l
[1, 2, 3, 4, None, 6, 6, 7, 8, 9]
>>> len(l)
10

Second, using index to find the element that you want to change is inefficient: every call is a linear search from the start of the list.
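To make the cost concrete: `list.index` scans from the beginning on every call and returns the position of the first equal element, so duplicate values report the same index. A small illustration with made-up values:

```python
b = [5, 7, 5]

# Position reported by a search for each element's value.
searched = [b.index(a) for a in b]

# Position each element actually occupies.
actual = [i for i, a in enumerate(b)]

print(searched)  # [0, 1, 0]
print(actual)    # [0, 1, 2]
```

With n elements, calling `index` once per element performs on the order of n*n comparisons in total, while `enumerate` gives each position for free.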

You can use enumerate, but you would still need to loop through to delete the None values.

for i,a in enumerate(b):
    # Do many things with a
    b[i]=C(modified_a)   # or b[i]=None if this element should be dropped
c=[e for e in b if e is not None]

You could use a list comprehension to just copy the new 'a' values to the c list then delete b:

c=[do_many_things(a) for a in b]
del b                              # will still occupy memory if not deleted...

Or, if you want b to be modified in place, you can use slice assignment:

b[:]=[do_many_things(a) for a in b]
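The practical difference between `b[:] = ...` and `b = ...` shows up when another name refers to the same list. The sketch below uses a trivial placeholder for `do_many_things` (an assumption, since the real function is not shown) and an extra name `alias` added purely for illustration.

```python
def do_many_things(x):
    # Hypothetical stand-in for the real per-element work.
    return x + 1

b = [1, 2, 3]
alias = b  # a second name for the same list object

b[:] = [do_many_things(a) for a in b]  # replaces the contents of the same object
print(alias)       # [2, 3, 4]
print(alias is b)  # True

b = [do_many_things(a) for a in b]     # rebinds b to a brand new list
print(alias)       # [2, 3, 4]
print(alias is b)  # False
```

In both cases the right-hand side is built as a complete temporary list first, so slice assignment keeps the object identity but does not by itself reduce peak memory.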

Slice assignment works this way:

# shorten a list
>>> b=[1,2,3,4,5,6,7,8,9]
>>> b[2:7]=[None]
>>> b
[1, 2, None, 8, 9]

# expand a list
>>> c=[1,2,3]
>>> c[1:1]=[22,33,44]
>>> c
[1, 22, 33, 44, 2, 3]

# modify in place
>>> c=[1,2,3,4,5,6,7]
>>> c[0:7]=[11,12,13,14,15,16,17]
>>> c
[11, 12, 13, 14, 15, 16, 17]

You can combine it with a list comprehension like so:

>>> c=list(range(int(1e6)))
>>> c[:]=[e for e in c if e<10]
>>> c
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

One of the comments pointed out that slice assignment does not modify in place exactly; that a temp list is generated. That is true. However, let's look at the total timings here:

import time
import random
fmt='\t{:25}{:.5f} seconds' 
count=int(1e5)
a=[random.random() for i in range(count)]
b=[e for e in a]

t1=time.time()
for e in b:
    if e<0.5: b[b.index(e)]=None  
c=[e for e in b if e is not None]    
print(fmt.format('index, None',time.time()-t1))

b=[e for e in a]
t1=time.time()
for e in b[:]:
    if e<0.5: del b[b.index(e)]  
print(fmt.format('index, del',time.time()-t1))

b=[e for e in a]
t1=time.time()
for i,e in enumerate(b[:]):
    if e<0.5: b[i]=None
c=[e for e in b if e is not None]    
print(fmt.format('enumerate, copy',time.time()-t1))

t1=time.time()
c=[e for e in a if e<.5]
# a is not deleted here because the next test still copies from it
print(fmt.format('c=',time.time()-t1))

b=[e for e in a]
t1=time.time()
b[:]=[e for e in b if e<0.5]
print(fmt.format('a[:]=',time.time()-t1))

On my computer, this prints:

index, None              87.30604 seconds
index, del               28.02836 seconds
enumerate, copy          0.02923 seconds
c=                       0.00862 seconds
a[:]=                    0.00824 seconds
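For anyone who wants to repeat the comparison, here is a hedged sketch using the standard `timeit` module instead of manual `time.time()` calls. The function names, the seed, and the small list size are my own choices for a quick run, not part of the measurements above, and absolute numbers will differ from machine to machine.

```python
import random
import timeit

random.seed(0)
data = [random.random() for _ in range(2000)]  # small size chosen for a quick run

def with_index():
    b = list(data)
    for e in b:
        if e < 0.5:
            b[b.index(e)] = None  # linear search per element: quadratic overall
    return [e for e in b if e is not None]

def with_enumerate():
    b = list(data)
    for i, e in enumerate(b):
        if e < 0.5:
            b[i] = None           # direct index: linear overall
    return [e for e in b if e is not None]

def with_slice_assignment():
    b = list(data)
    b[:] = [e for e in b if e >= 0.5]
    return b

for fn in (with_index, with_enumerate, with_slice_assignment):
    seconds = timeit.timeit(fn, number=5)
    print('{:25}{:.5f} seconds'.format(fn.__name__, seconds))
```

All three functions keep the same elements (those not below 0.5), so the timings compare like with like.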

If this still does not help, consider numpy, which offers more optimized array operations.


 