
How to get the intersection of two arrays

I have two integer arrays:

    int A[] = {2, 4, 3, 5, 6, 7};
    int B[] = {9, 2, 7, 6};

And I have to get the intersection of these arrays,

i.e. the output will be 2, 6, 7.

I am thinking of solving it by saving array A in a data structure and then comparing all the elements up to the size of A or B to get the intersection.

Now I have a problem: I first need to store the elements of array A in a container.

Should I proceed like this?

int size = sizeof(A)/sizeof(int);

That gives me the size, but after that I also want to access all the elements and store them in a container.

Here is the code I am using to find the intersection:

#include"iostream"

using namespace std;


int A[] = {2, 4, 3, 5, 6, 7};
int B[] = {9, 2, 7, 6};

int main()
{
    int sizeA = sizeof(A)/sizeof(int);

    int sizeB = sizeof(B)/sizeof(int);

    int big =  (sizeA > sizeB) ? sizeA : sizeB;

    int small =  (sizeA > sizeB) ? sizeB : sizeA;

    for (int i = 0; i <big ;++i)
    {
        for (int j = 0; j <small ; ++j)
        {
            if(A[i] == B[j])
            {
                cout<<"Element is -->"<<A[i]<<endl;
            }
        }
    }

    return 0;
}

Just use a hash table:

#include <unordered_set>  // needs C++11 or TR1 
// ...
unordered_set<int> setOfA(A, A + sizeA);

Then you can check, for every element in B, whether it is also in A:

for (int i = 0; i < sizeB; ++i) {
    if (setOfA.find(B[i]) != setOfA.end()) {
        cout << B[i] << endl;
    }
}

The expected runtime is O(sizeA + sizeB).
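For readers who want the two fragments above in one place, here is a minimal sketch of the same hash-table idea. The function name intersect_hashed is hypothetical, and reporting each common value only once (by erasing it from the set after the first hit) is an added assumption, not part of the answer above.

```cpp
#include <cstddef>
#include <unordered_set>
#include <vector>

// Hypothetical helper: returns the values of b that also occur in a,
// each reported once, in the order they first appear in b.
std::vector<int> intersect_hashed(const int* a, std::size_t sizeA,
                                  const int* b, std::size_t sizeB) {
    // Build the lookup table from a: expected O(sizeA).
    std::unordered_set<int> setOfA(a, a + sizeA);
    std::vector<int> result;
    for (std::size_t i = 0; i < sizeB; ++i) {
        // erase() returns the number of elements removed (0 or 1), so a
        // value repeated in b is only reported the first time it is seen.
        if (setOfA.erase(b[i]) > 0) {
            result.push_back(b[i]);
        }
    }
    return result;
}
```

Called with the arrays from the question, this yields the common values in the order they occur in B. If duplicates should be reported every time, replace the erase() call with the find() check shown above.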

saving array A in a data structure

Arrays are data structures; there's no need to save A into one.

I want to compare all the elements up to the size of A or B, and then I will get the intersection

This is extremely vague, but it isn't likely to yield the intersection; notice that you must examine every element in both A and B, whereas "up to the size of A or B" will ignore elements.

What approach should I follow to get the size of an array of unknown size and store it in a container?

It isn't possible to deal with arrays of unknown size in C unless they have some end-of-array sentinel that allows counting the number of elements (as is the case with NUL-terminated character arrays, commonly referred to in C as "strings"). However, the sizes of your arrays are known because their compile-time sizes are known. You can calculate the number of elements in such arrays with a macro:

#define ARRAY_ELEMENT_COUNT(a) (sizeof(a)/sizeof *(a))
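One caveat with the macro is that it only works on a real array; if `a` has decayed to a pointer (for example, inside a function that received the array as a parameter), it silently computes the wrong number. In C++, a template can reject that case at compile time. The sketch below is an illustration only; the name array_element_count is hypothetical and it assumes C++11 or later.

```cpp
#include <cstddef>

// Hypothetical helper: N is deduced from the array type at compile time.
// Passing a pointer instead of an array fails to compile, unlike the macro.
template <typename T, std::size_t N>
constexpr std::size_t array_element_count(const T (&)[N]) {
    return N;
}
```

With the arrays from the question, array_element_count(A) gives the same value as ARRAY_ELEMENT_COUNT(A).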

...

int *ptr = new sizeof(A);

[Your question was originally tagged [C], and my comments below refer to that]

This isn't valid C -- new is a C++ keyword.

If you wanted to make copies of your arrays, you could simply do it with, e.g.,

int Acopy[ARRAY_ELEMENT_COUNT(A)];
memcpy(Acopy, A, sizeof A);

or, if for some reason you want to put the copy on the heap,

int *pa = malloc(sizeof A);
if (!pa) {
    /* handle out-of-memory; do not fall through to memcpy */
    abort();
}
memcpy(pa, A, sizeof A);
/* After you're done using pa: */
free(pa);

[In C++ you would use new and delete]

However, there's no need to make copies of your arrays in order to find the intersection, unless you need to sort them (see below) but also need to preserve the original order.

There are a few ways to find the intersection of two arrays. If the values fall within the range 0-63, you can use two 64-bit unsigned integers (unsigned long is 64 bits on many platforms but not all; unsigned long long is at least 64 bits everywhere), set the bits corresponding to the values in each array, then use & (bitwise "and") to find the intersection. If the values aren't in that range but the difference between the largest and smallest is < 64, you can use the same method but subtract the smallest value from each value to get the bit number. If the range is not that small but the number of distinct values is <= 64, you can maintain a lookup table (array, binary tree, hash table, etc.) that maps the values to bit numbers and a 64-element array that maps bit numbers back to values.
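The simplest of these cases, where every value lies in 0-63, can be sketched as follows. The helper names to_mask and intersect_bits are hypothetical, and silently skipping out-of-range values is an assumption made only to keep the sketch safe.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: builds a 64-bit mask with bit v set for each value v.
// Assumes every value is in the range 0..63; anything else is skipped.
std::uint64_t to_mask(const int* values, std::size_t count) {
    std::uint64_t mask = 0;
    for (std::size_t i = 0; i < count; ++i) {
        if (values[i] >= 0 && values[i] < 64) {
            mask |= std::uint64_t{1} << values[i];
        }
    }
    return mask;
}

// Returns the common values in ascending order.
std::vector<int> intersect_bits(const int* a, std::size_t sizeA,
                                const int* b, std::size_t sizeB) {
    // A bit survives the "and" only if it was set for both arrays.
    std::uint64_t common = to_mask(a, sizeA) & to_mask(b, sizeB);
    std::vector<int> result;
    for (int bit = 0; bit < 64; ++bit) {
        if (common & (std::uint64_t{1} << bit)) {
            result.push_back(bit);
        }
    }
    return result;
}
```

Because the result is read back from the mask, it comes out sorted and without duplicates regardless of the order of the inputs.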

If your arrays may contain more than 64 distinct values, there are two effective approaches:

1) Sort each array and then compare them element by element to find the common values -- this algorithm resembles the merge step of a merge sort.

2) Insert the elements of one array into a fast lookup table (hash table, balanced binary tree, etc.), and then look up each element of the other array in the lookup table.

Sort both arrays (e.g., with qsort()) and then walk through both arrays one element at a time.

Where there is a match, add it to a third array, which is sized to match the larger of the two input arrays. The intersection can be no larger than the smaller input, so this is always enough room for the values; add one extra slot if you need the terminator even when every element matches. Use a negative or other "dummy" value that cannot occur in your data as the terminator.

When walking through the input arrays, if the current value in the first array is larger than the current value in the second, advance the index of the second array, and vice versa.

When you're done walking through both arrays, your third array holds the answer, up to the terminator value.

You can sort the two arrays

sort(A, A+sizeA);
sort(B, B+sizeB);

and use a merge-like algorithm to find their intersection:

#include <vector>

...

std::vector<int> intersection;
int idA=0, idB=0;

while(idA < sizeA && idB < sizeB) {
    if (A[idA] < B[idB]) idA ++;
    else if (B[idB] < A[idA]) idB ++;
    else { // => A[idA] = B[idB], we have a common element
        intersection.push_back(A[idA]);
        idA ++;
        idB ++;
    }
}

The time complexity of this part of the code is linear. However, due to the sorting of the arrays, the overall complexity becomes O(n * log n), where n = max(sizeA, sizeB).

The additional memory required for this algorithm is optimal (equal to the size of the intersection).
