Python intersection of two lists keeping duplicates

I have two flat lists, one of which contains duplicate values. For example:

array1 = [1,4,4,7,10,10,10,15,16,17,18,20]
array2 = [4,6,7,8,9,10]

I need to find the values in array1 that are also in array2, KEEPING THE DUPLICATES in array1. The desired outcome is

result = [4,4,7,10,10,10]

I want to avoid loops, as the actual arrays will contain millions of values. I have tried various set and intersect combinations, but just couldn't keep the duplicates.

Any help will be greatly appreciated!

What do you mean you don't want to use loops? You're going to have to iterate over it one way or another. Just take each item individually and check whether it's in array2 as you go:

items = set(array2)
found = [i for i in array1 if i in items]

Furthermore, depending on how you are going to use the result, consider using a generator:

found = (i for i in array1 if i in items)

so that you won't have to hold the whole result in memory at once.
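As a rough sketch of how the lazy version might be consumed: the helper name `iter_common` below is made up for illustration, and `itertools.islice` is used only to show that items are produced on demand.

```python
from itertools import islice

def iter_common(values, allowed):
    """Lazily yield items of `values` that are in `allowed`, keeping order and duplicates."""
    lookup = set(allowed)  # built once; membership tests are O(1) on average
    return (v for v in values if v in lookup)

array1 = [1, 4, 4, 7, 10, 10, 10, 15, 16, 17, 18, 20]
array2 = [4, 6, 7, 8, 9, 10]

matches = iter_common(array1, array2)
first_three = list(islice(matches, 3))  # pulls only the first three matches
rest = list(matches)                    # continues from where islice stopped
print(first_three)  # [4, 4, 7]
print(rest)         # [10, 10, 10]
```

Note that a generator can be consumed only once; call the helper again if you need a second pass.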

The following will do it:

array1 = [1,4,4,7,10,10,10,15,16,17,18,20]
array2 = [4,6,7,8,9,10]
set2 = set(array2)
print([el for el in array1 if el in set2])

It keeps the order and repetitions of the elements in array1.

It turns array2 into a set for faster lookups. Note that this is only beneficial if array2 is sufficiently large; if array2 is small, it may be faster to keep it as a list.
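If you want to check which container is faster for your own data, a minimal timing sketch with the standard `timeit` module could look like this. The helper name `keep_common` and the repeat count are arbitrary choices for illustration, and the timings will vary by machine and input size.

```python
import timeit

def keep_common(values, allowed):
    """Return items of `values` found in `allowed`, keeping order and duplicates."""
    return [v for v in values if v in allowed]

array1 = [1, 4, 4, 7, 10, 10, 10, 15, 16, 17, 18, 20]
array2 = [4, 6, 7, 8, 9, 10]
set2 = set(array2)

# Both containers give the same answer; only the lookup cost differs.
assert keep_common(array1, array2) == keep_common(array1, set2)

# Time each variant on this tiny input.
list_time = timeit.timeit(lambda: keep_common(array1, array2), number=10_000)
set_time = timeit.timeit(lambda: keep_common(array1, set2), number=10_000)
print(f"list lookup: {list_time:.4f}s, set lookup: {set_time:.4f}s")
```

Rerun it with array2 at its realistic size before deciding, since a list lookup scans linearly while a set lookup does not.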

Building on @Alex's answer, if you also want to extract the index of each item, here is how:

found = [[index,i] for index,i in enumerate(array1) if i in array2]
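For large inputs, the same idea can be combined with the set lookup from the other answers. This is only a sketch: the names `lookup`, `indices` and `values` are introduced here for illustration, and it uses tuples rather than two-element lists.

```python
array1 = [1, 4, 4, 7, 10, 10, 10, 15, 16, 17, 18, 20]
array2 = [4, 6, 7, 8, 9, 10]

lookup = set(array2)  # avoids scanning array2 for every element of array1

# Pair each matching value with its position in array1.
found = [(index, value) for index, value in enumerate(array1) if value in lookup]
print(found)  # [(1, 4), (2, 4), (3, 7), (4, 10), (5, 10), (6, 10)]

# Split the pairs back into separate lists if that is more convenient.
indices = [index for index, _ in found]
values = [value for _, value in found]
```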
