Okay, I have two lists, List 1 and List 2. I want to find all of the items that are in both list 1 and list 2, and remove them from list 1. The first way I've thought about doing this is looping through list 1 and then looping through list 2 to see if it is in list 2 but that seems slow and inefficient when scaled up. Is there a more efficient way of doing this?
Also, these lists will be ordered alphabetically (they're strings), if that helps anything.
I'm using python, but I'm also wondering from a general programming perspective.
list1 = ['bar','foo','hello','hi']
list2 = ['alpha','bar','hello','xam']
list1 would become ['foo','hi']
In python, you'd probably want to use a set:
intersection = set(list1).intersection(list2)
This will return a set
which destroys the order (among other things), but you can always use that set to filter list1
afterward:
list1 = [x for x in list1 if x not in intersection]
The intersection is most useful if you actually want to use the set. As pointed out in the comments, it's not actually necessary if you don't want a set at all:
set2 = set(list2)
list1 = [x for x in list1 if x not in set2]
Use a set
to get the difference between the two:
list1 = ['bar','foo','hello','hi']
list2 = ['alpha','bar','hello','xam']
set1 = set(list1)
set2 = set(list2)
set1 - set2
Outputs:
set(['hi', 'foo'])
As noted by @chepner, using set.difference, only the first needs to be converted to a set
set1.difference(list2)
If order is important, make one of them a set, and compare the other against it:
set2 = set(list2)
[x for x in list1 if x not in set2]
Outputs:
['foo', 'hi']
Here's a solution using a general programming approach, not using sets, and not particularly optimized. It relies on the two lists being sorted.
list1 = ['a', 'b', 'd', 'f', 'k']
list2 = ['c', 'd', 'i']
result = []
i1 = 0
i2 = 0
while i1 < len(list1) and i2 < len(list2):
# invariants:
# list1[i1] not in list2[:i2], and
# result == (list1[:i1] with elements of list2[:i2] omitted)
#
if list1[i1] < list2[i2]:
# By assumption, list1[i1] not in list2[:i2],
# and because list2 is sorted, the true 'if' condition
# implies that list1[i1] isn't in list2[i2:] either;
# that is, it isn't in list2 at all.
result.append(list1[i1])
i1 += 1
elif list1[i1] > list2[i2]:
# can't decide membership of list1[i1] yet;
# advance to next element of list2 and loop again
i2 += 1
else:
# list1[i1] == list2[i2], so omit this element
i1 += 1
i2 += 1
# Add any remaining elements of list1 to tail of result
if i1 < len(list1):
result.extend(list1[i1:])
print(result)
Result: ['a', 'b', 'f', 'k']
The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.