
Python set intersection with object sets

I am working with Amazon boto and I have two lists. List 1 contains Instance objects. List 2 contains InstanceInfo objects. Both kinds of object have an attribute called id. I need to get a list of the Instance objects whose id exists in the InstanceInfo list.

l1 = [Instance:i-04072534, Instance:i-06072536, Instance:i-08072538, Instance:i-0a07253a, Instance:i-e68fa1d6, Instance:i-e88fa1d8, Instance:i-ea8fa1da, Instance:i-ec8fa1dc]

l2 = [InstanceInfo:i-ec8fa1dc, InstanceInfo:i-ea8fa1da, InstanceInfo:i-e88fa1d8, InstanceInfo:i-e68fa1d6]

Wanted result:

l3 = [Instance:i-ec8fa1dc, Instance:i-ea8fa1da, Instance:i-e88fa1d8, Instance:i-e68fa1d6]

Right now I have it working with:

l3 = []
for a in l1:
    for b in l2:
        if a.id == b.id:
            l3.append(a)

However, I have been told that I should replace this with a set intersection. I have been looking at examples and it looks very straightforward, yet I don't see any examples that work with objects.

I've been playing around for a bit and in theory I can see it working, but there might be some 'advanced' syntax that I don't know of. I am still learning Python.

Here is something faster than Marcin's answer (while being similar):

ids_l1 = set(x.id for x in l1)  # All ids in list 1
intersection = [item for item in l2 if item.id in ids_l1]  # Only those elements of l2 with an id in l1

It is important to pre-calculate ids_l1 and not to write if item.id in set(…), because the set would then be rebuilt for every element (the full test expression is re-evaluated for each item).

Python sets give you fast membership tests (in). Such tests are much faster with sets than with lists: a list must be scanned element by element, whereas a set looks elements up by their hash.
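As a minimal sketch of the same idea applied to the question's data, the following uses namedtuple stand-ins for boto's Instance and InstanceInfo classes. These stand-ins are hypothetical; the only thing assumed about the real classes is the id attribute. Here the id set is built from l2 and l1 is filtered, so the result holds Instance objects, as the question asks.

```python
from collections import namedtuple

# Hypothetical stand-ins for boto's classes; only the `id` attribute is assumed.
Instance = namedtuple("Instance", ["id"])
InstanceInfo = namedtuple("InstanceInfo", ["id"])

l1 = [Instance(i) for i in ("i-04072534", "i-06072536", "i-e68fa1d6", "i-e88fa1d8")]
l2 = [InstanceInfo(i) for i in ("i-e88fa1d8", "i-e68fa1d6")]

# Build the set of ids once, outside the comprehension.
ids_l2 = {info.id for info in l2}

# Keep the Instance objects whose id appears in l2; the order follows l1.
l3 = [inst for inst in l1 if inst.id in ids_l2]
```

Note that the result follows the order of l1. If the order of l2 matters, as in the wanted result shown in the question, a lookup keyed by id would be needed instead.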

Your method may be relatively efficient for smallish lists.

With sets, you would have to extract the ids, calculate the intersection of the ids, and then collect the matching items into your new list. Something like:

set1 = set(x.id for x in l1)
set2 = set(x.id for x in l2)
intersection_ids = set1 & set2
intersection_list = [item for item in l2 if item.id in intersection_ids]

You can make this a little more efficient by scanning over the shorter list, or by storing your objects in a dict.
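The dict suggestion could look roughly like the sketch below. It assumes Python 3, where dict key views support set operators. It also assumes ids are unique within each list, and it uses hypothetical namedtuple stand-ins for the boto classes. Because the intersection is a set, the order of the result is not guaranteed.

```python
from collections import namedtuple

# Hypothetical stand-ins for boto's classes; only the `id` attribute is assumed.
Instance = namedtuple("Instance", ["id"])
InstanceInfo = namedtuple("InstanceInfo", ["id"])

l1 = [Instance(i) for i in ("i-04072534", "i-e68fa1d6", "i-e88fa1d8")]
l2 = [InstanceInfo(i) for i in ("i-e88fa1d8", "i-e68fa1d6")]

# Index each list by id (assumes ids are unique within a list).
instances_by_id = {inst.id: inst for inst in l1}
infos_by_id = {info.id: info for info in l2}

# In Python 3, dict key views behave like sets, so `&` gives the common ids.
common_ids = instances_by_id.keys() & infos_by_id.keys()

# Look up the Instance object for each common id.
l3 = [instances_by_id[i] for i in common_ids]
```

Keeping both dicts around also makes it easy to fetch the matching InstanceInfo for any Instance in the result via infos_by_id[inst.id].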

Try this:

# get ids of elements in second list
l2_ids = [x.id for x in l2]
# get elements from first list that have ids in second
l3 = [x for x in l1 if x.id in l2_ids]
