Fastest way to check if at least one element in a set/list is in each element in a collection of lists/sets

I have the following:

list1 = {"a", "b", "c"}

list2 = [
    {"a", "s", "d", "f"},
    {"q", "w", "e", "c"},
    {"v", "b", "n", "m"},
]

I now want to check that the elements of list1 appear somewhere in list2: each element of list2 MUST contain at least one of the elements in list1.

What I currently do is the following (I found it on Stack Overflow a while ago):

all(list1 & l for l in list2)

This is actually reasonably fast. However, I am now running into the issue that I have billions of different list1 candidates, so I have to come up with a faster solution. I also tried numba, but I am struggling with nested lists, and sets are not supported.

I have a collection of sets (like those in list2), and each set can be represented by the items it contains. For example, the first set in list2 consists of "a", "s", "d" and "f". Each of those characters "describes" the first set in list2.

What I now want to do is find the shortest combination that describes list2. For example:

list2 = [
    {"a", "s", "d", "f"},
    {"q", "w", "e", "c"},
    {"v", "b", "n", "m"},
    {"v", "l", "p", "o"},
]

Here the shortest combination that describes list2 is a, q, v (a describes the first element, q the second, and v elements 3 and 4).

The way I construct list1 would be to take

import itertools

U = set.union(*list2)

# Loop over combinations of increasing size to find the minimum one:
# combinations(U, 2), combinations(U, 3), ...
for list1 in itertools.combinations(U, 3):
    ...

This works really well, even for very large numbers (hundreds of millions of combinations), but it is still somewhat limited. I would like to reduce it as much as I can. Edit: the data structure for list2 is as described above, a collection of sets containing strings (in my case 3-character combinations), so list1 is also a set of strings.
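For reference, the search described above can be sketched as a small function that tries combination sizes in increasing order and stops at the first one that works. This is only an illustration of the brute-force idea, not the asker's actual code: the name `shortest_description` is made up here, and the universe is sorted purely so the result is reproducible. Note that `itertools.combinations` yields tuples, so each candidate is converted to a set before the check.

```python
import itertools

list2 = [
    {"a", "s", "d", "f"},
    {"q", "w", "e", "c"},
    {"v", "b", "n", "m"},
    {"v", "l", "p", "o"},
]

def shortest_description(groups):
    """Return a smallest set that shares an element with every group.

    Hypothetical helper: brute force over combinations of growing size.
    Returns None if no such set exists (for example, if a group is empty).
    """
    # Sorting only makes the iteration order deterministic.
    universe = sorted(set().union(*groups))
    for size in range(1, len(groups) + 1):
        for combo in itertools.combinations(universe, size):
            candidate = set(combo)  # combinations() yields tuples
            # Accept the candidate if no group is disjoint from it.
            if not any(map(candidate.isdisjoint, groups)):
                return candidate
    return None

result = shortest_description(list2)
print(len(result))  # 3
```

Any answer of size 3 must contain "v" (the only element shared by the last two sets) plus one element from each of the first two sets, so a, q, v is one valid result among several.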

Thanks

There is a simple optimization you can make:

not any(map(list1.isdisjoint, list2))

isdisjoint avoids needing to calculate the full intersection, and map is faster than a generator expression when you are just calling a single method.
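As a quick sanity check, the two expressions can be wrapped in functions and compared on the data from the question. The function names `covers_intersection` and `covers_isdisjoint` are invented for this sketch; it only shows that both give the same answer and makes no claim about timings.

```python
list1 = {"a", "b", "c"}
list2 = [
    {"a", "s", "d", "f"},
    {"q", "w", "e", "c"},
    {"v", "b", "n", "m"},
]

def covers_intersection(candidate, groups):
    # Original approach: build each intersection and test whether it is non-empty.
    return all(candidate & group for group in groups)

def covers_isdisjoint(candidate, groups):
    # isdisjoint returns True when two sets share no element, so the
    # candidate covers the groups only if it is disjoint from none of them.
    return not any(map(candidate.isdisjoint, groups))

print(covers_intersection(list1, list2))     # True
print(covers_isdisjoint(list1, list2))       # True
print(covers_isdisjoint({"a", "b"}, list2))  # False: {"q", "w", "e", "c"} has neither
```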

However, if you want a better result, you have to give more detail about what you are trying to do. In particular, what are the sizes of all of the data structures, and what elements do they contain?

"what I now want to do is find the shortest combination to describe list2"

This is the Hitting Set Problem, which is well studied and for which multiple solvers exist, like this one.
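To make the problem statement concrete, here is a minimal sketch of a greedy heuristic for hitting sets: repeatedly pick the element that appears in the most sets not yet hit. This is not the solver referred to above, and it is only a heuristic: it returns a valid hitting set quickly but is not guaranteed to find a smallest one. The function name `greedy_hitting_set` is an assumption made for this example.

```python
def greedy_hitting_set(groups):
    """Heuristic sketch: returns a hitting set, not necessarily a minimum one."""
    remaining = [set(g) for g in groups]
    if any(not g for g in remaining):
        raise ValueError("an empty set cannot be hit by any element")
    chosen = set()
    while remaining:
        # Count in how many not-yet-hit sets each element occurs.
        counts = {}
        for group in remaining:
            for item in group:
                counts[item] = counts.get(item, 0) + 1
        # Pick the most frequent element; sorting makes ties deterministic.
        best = max(sorted(counts), key=counts.get)
        chosen.add(best)
        # Drop every set that the chosen element hits.
        remaining = [g for g in remaining if best not in g]
    return chosen

list2 = [
    {"a", "s", "d", "f"},
    {"q", "w", "e", "c"},
    {"v", "b", "n", "m"},
    {"v", "l", "p", "o"},
]
hitting = greedy_hitting_set(list2)
print(len(hitting))  # 3
```

On this small input the greedy result happens to have the minimum size of 3, but on other inputs it can be larger than the optimum, which is where an exact solver is needed.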
