简体   繁体   中英

Efficient method to find what elements of Array A are in Array B

In my particle case I need a JavaScript code that find what strings of Array A are in Array B.

But it's more interesting to ask the general question:

Given Array A with length of N and array B , return a Boolean[] array of length N such that Boolean[i] will be true iff A[i] item is in B .

What is the most efficient solution to that problem? Both of the array are unsorted and array B can be empty and it can be larger or smaller then array A .

将数组B转换为哈希集(或其他快速搜索集合),然后对数组A的每个元素进行循环,并检查该元素是否存在于集合中。

The Tosters's answer is perfect for random inputs. However since you asked generic algorithm, I thought of following approach.

You need to search one element in other, so sorting one of them is required for optimal solution. Lets consider both cases. For simplicity lets denote their sizes with same letters. I mentioned time complexity of step in brackets after description of step.

Sorting B:

1. Sort B O(B log B).
2. Iterate over each element of A and check if element exists in B, O(A log B)
Time complexity: O(A log B) + O(B log B) = O((A+B) log B)

Sorting A:

1. Sort array A also maintaining original position in sorted array, O(A log A). 
2. Iterate on each element of B and find all A's containing B. Now iterate over each searched element to get original index and update Boolean array. Also keep a marker in sorted A that the element has been searched to avoid searching if same number appears again. O(B log A)
Time complexity: O(A log A) + O(B log A) = O((A+B) log A)

As a result, the solutions differ by factor of log of size of arrays. So for best solution, sort the array with lesser size and use corresponding algorithm.

There are a few different ways to do this with varying complexities.

First, you could do a nested loop over both lists and print out elements from A when you find a corresponding solution in B . This is O(ab) time and O(1) space.

Second, you could sort both lists and then walk through the lists in lockstep looking for matches (imagine the merge operation in mergesort). This is O(aloga + blogb) time and either O(1) or O(a+b) space, depending on whether you want to sort in-place or not.

Third, you could sort just the second list and then binary search for elements in A one at a time. This is O(blogb + alogb) time and O(1) or O(b) space, again depending on whether you sort in-place or not.

The Tosters proposed a solution whereby you create a hash set and do repeated queries on it. Depending on how the hash set resolves collisions, this is asymptotically equivalent in the worst case to either the first or third solutions (the underlying buckets can be lists or binary search trees).

在JS中这样做非常容易,我甚至不称它为算法:)

A.map(x => B.includes(x));

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM