Python: Remove items from list while iterating, which have non-trivial removal conditions

This is not a dupe of other questions I have found such as:

Remove items from a list while iterating

Python: Removing list element while iterating over list

The problem is this: given a list of objects, such as abstracted sockets, what is the most Pythonic way to remove some of them when the condition for removal is non-trivial?

sockets = [ socket1, socket2, socket3 ] # whatever
for sock in sockets:
    try:
        sock.close()
    except:
        pass
    else:
        ...  # TODO: remove the socket from the list here!

I cannot use the solution from either link. The "best" solution I can think of with my limited Python knowledge is to build a new list containing only the sockets whose close() raised an exception.

sockets = [ socket1, socket2, socket3 ] # whatever
newsockets = []
for sock in sockets:
    try:
        sock.close()
    except:
        newsockets.append(sock)
sockets = newsockets

This still feels wrong, however. Is there a better way?

EDIT for moderator who ignored my explicit statement that the question this was marked as a dupe of is not a dupe.

To the first link I posted, you cannot use try/except in a list comprehension. To the second link (the one it was marked as a dupe of), as the comments say, that is a bad solution. remove(non-hashable item or item that doesn't have __eq__) does not work.

Algorithmically, the usual approach is to build a new list and then either rebind the name to the new list or assign it into the original list through a slice. This is the better choice if you expect to remove a substantial number of the sockets:

sockets = [socket1, socket2, socket3] # whatever
new_sockets = []
for sock in sockets:
    try:
        sock.close()
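    except Exception:
        # close() failed, so keep this socket
        new_sockets.append(sock)

# replace the contents of the original list in place
sockets[:] = new_sockets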

The reason is that removing an item at an arbitrary position in a list of n items (e.g. with sockets.remove) takes O(n) time, so removing k items costs O(kn) in total. Building a new list and replacing the original with it takes O(n) time, i.e. the cost does not depend on the number of sockets removed.
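The question notes that try/except cannot appear inside a list comprehension. One way around that, sketched below, is to move the try/except into a small helper so the single-pass rebuild can still be written as a comprehension. FakeSocket and close_failed are hypothetical names made up for this illustration; they are not part of any library.

```python
class FakeSocket:
    """Hypothetical stand-in for a socket-like object, for illustration only."""

    def __init__(self, name, fails=False):
        self.name = name
        self.fails = fails

    def close(self):
        # Simulate a close() that fails for some objects.
        if self.fails:
            raise OSError(f"could not close {self.name}")


def close_failed(sock):
    """Try to close sock; return True if close() raised, False otherwise."""
    try:
        sock.close()
    except Exception:
        return True
    return False


sockets = [FakeSocket("a"), FakeSocket("b", fails=True), FakeSocket("c")]

# Keep only the sockets whose close() raised: one pass, assigned in place.
sockets[:] = [sock for sock in sockets if close_failed(sock)]

remaining = [sock.name for sock in sockets]
```

The comprehension has a side effect (it closes sockets), which some people consider poor style; the explicit loop above does the same work and may be easier to read.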


Or, as sockets are hashable, perhaps you should store them in a set instead. A set must not change size while you iterate over it, so iterate over a copy (here a list) and remove from the original set:

sockets = {socket1, socket2, socket3}
for sock in list(sockets):
    try:
        sock.close()
    except Exception:
        pass
    else:
        sockets.discard(sock)

Creating the list copy takes O(n) time, but each set.discard call takes only O(1) time on average.
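As a small illustration of why the copy is needed, the sketch below removes items from a set while iterating over it directly, which raises RuntimeError in CPython, and then repeats the loop over a list snapshot. Plain integers stand in for sockets here.

```python
items = {1, 2, 3}
error = None
try:
    for item in items:
        items.discard(item)  # mutates the set being iterated over
except RuntimeError as exc:
    # The set changed size during iteration.
    error = exc

# Iterating over a snapshot of the set avoids the problem.
items = {1, 2, 3}
for item in list(items):
    items.discard(item)
```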

Another way, which avoids copying the data structure, is to collect the items to be removed in a second set:

sockets = {socket1, socket2, socket3}
to_remove = set()   # create an initially empty set
for sock in sockets:
    try:
        sock.close()
    except Exception:
        pass
    else:
        to_remove.add(sock)

# remove every socket in to_remove from sockets in one operation
sockets.difference_update(to_remove)

This runs faster than the previous set example when few or no items need to be removed, because it never copies the whole set.

This answer only adds a little explanation as to why you might want to assign the result to a slice of your original list. @Antti Happala has already given some nice alternatives for potentially improving the runtime.

When you write

sockets = newsockets

you are assigning the new list to the variable named sockets, whereas if you write

sockets[:] = newsockets

you are essentially replacing the entries of the same list object with the entries of the new list.

The difference becomes clearer if you previously had another reference to your sockets list, like so:

s = sockets

Then, after assigning without slicing, s would still point to the old list with none of the sockets removed, while sockets would refer to the new list.

With the slice assignment, both s and sockets would refer to the same, updated list after some elements are removed.

Usually this in-place replacement is what you want. However, there are times when you specifically want s and sockets to refer to different versions of the list, in which case you should not assign to a list slice. It all depends.
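The difference can be checked directly with a second reference to the list. In this sketch, short strings stand in for socket objects and ["b"] plays the role of the filtered list; both are placeholders for illustration.

```python
# Case 1: plain assignment rebinds the name; the alias keeps the old list.
sockets = ["a", "b", "c"]
s = sockets
sockets = ["b"]
rebound_same_object = s is sockets   # False: two different list objects
old_contents = list(s)               # the alias still sees all three items

# Case 2: slice assignment mutates the existing list; the alias sees it.
sockets = ["a", "b", "c"]
s = sockets
sockets[:] = ["b"]
sliced_same_object = s is sockets    # True: still one list object
new_contents = list(s)               # the alias sees the filtered items
```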