
Finding the count of how many elements of list A appear earlier than in the similar but shuffled list B

Given A=[2,3,4,1] and B=[1,2,3,4], I need to find how many elements of list A appear at an earlier index than the same element in list B. In this case those are the values 2, 3 and 4, so the expected return value is 3.

def count(a, b):
    muuttuja = 0    
    for i in range(0, len(a)-1):        
        if a[i] != b[i] and a[i] not in  b[:i]:
            muuttuja += 1            
            
    return muuttuja

I have tried this kind of solution, but it is very slow on lists with a large number of values. I would appreciate suggestions for alternative methods that do the same thing more efficiently. Thank you!

If both lists contain unique elements, you can build a map from each element (as key) to its index (as value). In Python this can be done with a dictionary. Since a dictionary lookup takes O(1) time on average, this code has a time complexity of O(n).

A=[2,3,4,1] 
B=[1,2,3,4]
d = {}
count = 0
for i,ele in enumerate(A) :
    d[ele] = i
for i,ele in enumerate(B) :
    if i > d[ele] :
        count+=1
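The same idea can be wrapped in a reusable function. This is only a sketch: the name count_earlier_in_a is hypothetical, it assumes unique, hashable values, and it assumes that values of B missing from A should simply be skipped (the snippet above would raise a KeyError for them).

```python
def count_earlier_in_a(a, b):
    """Count values whose index in a is smaller than their index in b.

    Assumes both lists contain unique, hashable values.
    """
    # Map each value of a to its index for O(1) average-time lookups.
    index_in_a = {value: i for i, value in enumerate(a)}
    count = 0
    for i, value in enumerate(b):
        # Assumption: values of b that never occur in a are skipped.
        if value in index_in_a and index_in_a[value] < i:
            count += 1
    return count


print(count_earlier_in_a([2, 3, 4, 1], [1, 2, 3, 4]))  # prints 3
```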

Use a set of already seen B-values.

def count(A, B):
    result = 0
    seen = set()
    for a, b in zip(A, B):
        seen.add(b)
        if a not in seen:
            result += 1
    return result

You can make a prefix count of A: an array that, for each index, keeps track of the number of occurrences of each element up to and including that index.

You can use this to efficiently look up the prefix counts when looping over B:

import collections

A=[2,3,4,1]
B=[1,2,3,4]

prefix_count = [collections.defaultdict(int) for _ in range(len(A))]
prefix_count[0][A[0]] += 1
for i, n in enumerate(A[1:], start=1):
    prefix_count[i] = collections.defaultdict(int, prefix_count[i-1])
    prefix_count[i][n] += 1

prefix_count_b = sum(prefix_count[i][n] for i, n in enumerate(B))
print(prefix_count_b)

This outputs 3.

This could still be O(N²) because of the copy from the previous index when building the prefix_count array. If someone knows a better way to do this, please let me know.
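One possible way to avoid the copy, sketched here under the assumption that A and B have the same length, is to keep a single running counter and walk both lists in lockstep. The function name prefix_count_sum is hypothetical. It is meant to compute the same sum as the snippet above in O(N) time.

```python
from collections import Counter


def prefix_count_sum(a, b):
    """Sum, over each index i, how often b[i] occurs in a[:i + 1]."""
    running = Counter()  # occurrences of each value seen in a so far
    total = 0
    for value_a, value_b in zip(a, b):
        running[value_a] += 1
        # Reading a missing key from a Counter yields 0 without inserting it.
        total += running[value_b]
    return total


print(prefix_count_sum([2, 3, 4, 1], [1, 2, 3, 4]))  # prints 3
```

Like the original snippet, this counts the element at the current index as well, so a value that sits at the same index in both lists contributes to the sum.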

This only works if the values in your lists are immutable (more precisely, hashable, since they are used as dictionary keys).


Your method is slow because it has a time complexity of O(N²): checking whether an element exists in a list of length N is O(N), and you do this N times. We can do better by using some more memory instead of time.

First, iterate over b and create a dictionary mapping each value to the first index at which that value occurs:

b_map = {}
for index, value in enumerate(b):
    if value not in b_map:
        b_map[value] = index

b_map is now {1: 0, 2: 1, 3: 2, 4: 3}.

Next, iterate over a, counting how many elements have an index less than the index stored for that element in the dictionary we just created:

result = 0
for index, value in enumerate(a):
    if index < b_map.get(value, -1):
        result += 1

This gives the expected result of 3.

b_map.get(value, -1) protects against the case where a value in a doesn't occur in b and you don't want to count it towards the total: .get returns the default value of -1, which is guaranteed to be less than any index. If you do want to count it, you can replace the -1 with len(a).

The second snippet can be replaced by a single call to sum:

result = sum(index < b_map.get(value, -1) 
             for index, value in enumerate(a))
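Putting the pieces together, a minimal sketch of the whole approach might look like this. The function names and the count_missing parameter are hypothetical, introduced only to illustrate the choice between the -1 and len(a) defaults. A slow brute-force version is included purely as a cross-check.

```python
def count_earlier(a, b, count_missing=False):
    """Count elements of a whose index is less than their first index in b."""
    first_index_in_b = {}
    for index, value in enumerate(b):
        # setdefault keeps the first index if a value repeats in b.
        first_index_in_b.setdefault(value, index)
    # -1 never counts values missing from b; len(a) always counts them.
    default = len(a) if count_missing else -1
    return sum(index < first_index_in_b.get(value, default)
               for index, value in enumerate(a))


def count_earlier_brute_force(a, b):
    """O(N²) reference version; values missing from b are not counted."""
    return sum(1 for index, value in enumerate(a)
               if value in b and index < b.index(value))


print(count_earlier([2, 3, 4, 1], [1, 2, 3, 4]))         # prints 3
print(count_earlier([5, 1], [1, 2]))                      # prints 0
print(count_earlier([5, 1], [1, 2], count_missing=True))  # prints 1
```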
