
Counting how many elements of list A appear earlier than they do in a similar but shuffled list B

A=[2,3,4,1] B=[1,2,3,4] I need to find how many elements of list A appear at an earlier position than the same element does in list B. In this case those are the values 2, 3 and 4, so the expected return value is 3.

def count(a, b):
    muuttuja = 0    
    for i in range(0, len(a)-1):        
        if a[i] != b[i] and a[i] not in  b[:i]:
            muuttuja += 1            
            
    return muuttuja

I have tried this kind of solution, but it is very slow on lists with a large number of values. I would appreciate suggestions for alternative methods that do the same thing more efficiently. Thank you!

If both lists contain only unique elements, you can build a map from each element (key) to its index (value). In Python this can be done with a dictionary. Because a dictionary lookup takes O(1) time on average, this code has an overall time complexity of O(n).

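A = [2, 3, 4, 1]
B = [1, 2, 3, 4]

# Map each element of A to its index in A.
d = {}
for i, ele in enumerate(A):
    d[ele] = i

# Count elements whose index in B is greater than their index in A.
count = 0
for i, ele in enumerate(B):
    if i > d[ele]:
        count += 1

print(count)  # 3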
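
The same idea can be wrapped in a reusable function. This is only a minimal sketch: the name count_earlier_in_a is made up for illustration, and it assumes every value occurs exactly once in both lists (a value missing from A would raise a KeyError).

```python
def count_earlier_in_a(a, b):
    """Count values whose index in a is smaller than their index in b.

    Assumes each value occurs exactly once in both lists.
    """
    # Map each value to its position in a.
    index_in_a = {value: i for i, value in enumerate(a)}
    # A value counts when its position in b is later than its position in a.
    return sum(1 for i, value in enumerate(b) if i > index_in_a[value])


print(count_earlier_in_a([2, 3, 4, 1], [1, 2, 3, 4]))  # prints 3
```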

Use a set of already seen B-values.

def count(A, B):
    result = 0
    seen = set()
    for a, b in zip(A, B):
        seen.add(b)
        if a not in seen:
            result += 1
    return result
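
One way to gain some confidence in the set-based version is to compare it against a slow but obvious reference on random shuffles. The sketch below assumes both lists hold the same unique values; count_with_set and count_by_index are names chosen here for illustration.

```python
import random


def count_with_set(a, b):
    # Same logic as the set-based function above.
    result = 0
    seen = set()
    for x, y in zip(a, b):
        seen.add(y)
        if x not in seen:
            result += 1
    return result


def count_by_index(a, b):
    # O(N^2) reference: compare the position of each value in both lists.
    return sum(1 for x in a if a.index(x) < b.index(x))


random.seed(0)  # fixed seed so the check is repeatable
for _ in range(200):
    a = random.sample(range(20), 10)  # 10 unique values
    b = random.sample(a, len(a))      # the same values, shuffled
    assert count_with_set(a, b) == count_by_index(a, b)

print(count_with_set([2, 3, 4, 1], [1, 2, 3, 4]))  # prints 3
```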

You can make a prefix-count of A, which is an array where, for each index, you keep track of the number of occurrences of each element up to and including that index.

You can use this to efficiently look up the prefix-counts when looping over B:

import collections

A=[2,3,4,1]
B=[1,2,3,4]

prefix_count = [collections.defaultdict(int) for _ in range(len(A))]
prefix_count[0][A[0]] += 1
for i, n in enumerate(A[1:], start=1):
    prefix_count[i] = collections.defaultdict(int, prefix_count[i-1])
    prefix_count[i][n] += 1

prefix_count_b = sum(prefix_count[i][n] for i, n in enumerate(B))
print(prefix_count_b)

This outputs 3.

This could still be O(N²) because each entry of the prefix_count array is initialized by copying the dictionary from the previous index. If someone knows a better way to do this, please let me know.

This only works if the values in your lists are hashable (as immutable values such as numbers and strings are), because they are used as dictionary keys.
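
One possible way to avoid the per-index copy, sketched here as a suggestion rather than as part of the original answer, is to keep a single running counter and read it at each step instead of storing every prefix. The function name prefix_count_total is hypothetical.

```python
from collections import Counter


def prefix_count_total(a, b):
    """Sum, over each index i, how often b[i] occurs in a[:i + 1]."""
    running = Counter()  # counts of the values of a seen so far
    total = 0
    for x, y in zip(a, b):
        running[x] += 1      # include a[i] itself, as prefix_count[i] does
        total += running[y]  # a Counter returns 0 for keys it has not seen
    return total


print(prefix_count_total([2, 3, 4, 1], [1, 2, 3, 4]))  # prints 3
```

This runs in O(N) time with a single dictionary. Like the snippet above, it also counts a value that sits at the same index in both lists; swapping the two statements inside the loop would count only strictly earlier positions.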


Your method is slow because it has a time complexity of O(N²): checking if an element exists in a list of length N is O(N), and you do this N times. We can do better by using up some more memory instead of time.

First, iterate over b and create a dictionary mapping the values to the first index that value occurs at:

b_map = {}
for index, value in enumerate(b):
    if value not in b_map:
        b_map[value] = index

b_map is now {1: 0, 2: 1, 3: 2, 4: 3}

Next, iterate over a, counting how many elements have an index less than the index stored for that value in the dictionary we just created:

result = 0
for index, value in enumerate(a):
    if index < b_map.get(value, -1):
        result += 1

Which gives the expected result of 3.

b_map.get(value, -1) is used to protect against the situation when a value in a doesn't occur in b, and you don't want to count it towards the total: .get returns the default value of -1, which is guaranteed to be less than any index. If you do want to count it, you can replace the -1 with len(a).

The second snippet can be replaced by a single call to sum:

result = sum(index < b_map.get(value, -1) 
             for index, value in enumerate(a))
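
Put together, the two steps might look like the sketch below. The function name count_earlier and the count_missing flag are assumptions made for this example; the flag simply switches the default between -1 and len(a) as described above.

```python
def count_earlier(a, b, count_missing=False):
    """Count elements of a that occur before their first position in b.

    count_missing is a hypothetical flag: when True, values of a that never
    occur in b are counted as well.
    """
    b_map = {}
    for index, value in enumerate(b):
        b_map.setdefault(value, index)  # keep only the first index
    default = len(a) if count_missing else -1
    return sum(index < b_map.get(value, default)
               for index, value in enumerate(a))


print(count_earlier([2, 3, 4, 1], [1, 2, 3, 4]))                 # prints 3
print(count_earlier([2, 9, 3], [1, 2, 3]))                       # prints 1
print(count_earlier([2, 9, 3], [1, 2, 3], count_missing=True))   # prints 2
```

In the last two calls the value 9 never occurs in b, so it is counted only when count_missing is True.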


 