[英]Subtracting Arrays
What's a fastest way to implement array subtraction? 什么是实现数组减法的最快方法? For example: 例如:
array a1 = [1, 3, 4, 5, 8];
array a2 = [2, 4, 5];
array a3 = a1 - a2; /* [1, 3, 8] */
Here array
would be the type my program uses to represent a struct which is used as a container. 这里的array
是我的程序用来表示用作容器的结构的类型。 The rest of it is pseudo code, of course I'm not creating the arrays like that nor subtracting. 其余部分是伪代码,当然我不会创建像这样的数组也不会减去。
The simplest solution I can think of involves nested loops: 我能想到的最简单的解决方案涉及嵌套循环:
/* a1 - a2 */
for (i = 0; i < a1.size; ++i) {
int is_the_same = 0;
for (j = 0; i < a2.size; ++j)
if (a1[i] == a2[j]) {
is_the_same = 1;
break;
}
}
if (!is_the_same)
a3.push a1[i];
}
But this does not look very efficient. 但这看起来效率不高。 What would be another approach? 另一种方法是什么?
If your arrays aren't sorted, the worst case time complexity for an array exclusion using a intuitive solution is O(n 2 ) (although you can boost this if you sort the arrays first), since you need to check the whole array whether an element is existent or not. 如果您的数组没有排序,使用直观解决方案排除数组的最坏情况时间复杂度为O(n 2 )(尽管如果您先对数组进行排序可以提高此值),因为您需要检查整个数组是否元素是否存在。
Example of worst case scenario: 最坏情况的例子:
array a1 = [1, 3, 4, 5, 8];
array a2 = [8, 5, 4, 3, 1];
If your arrays are ordered, then the worst case time complexity is O(n+m) (pseudo-code): 如果你的数组是有序的,那么最坏的情况时间复杂度是O(n + m)(伪代码):
int i = 0;
for(int j = 0; i < a1.size && j < a2.size;){
if(a1[i] == a2[j])
++i, ++j; // exclude this element
if(a1[i] < a2[j]){
a3.push(a1[i]); // include this element
++i;
}
if(a1[i] > a2[j])
++j; // ignore lesser elements
}
while(i < a1.size)
a3.push(a1[i]);
UPDATE -Wall -Wextra -pedantic
C code: UPDATE -Wall -Wextra -pedantic
C代码:
#include <stdio.h>
#include <malloc.h>
/**
* The following function excludes values from an array using another arrays values.
* Note that this version won't exclude multiple values, for this you have to drop
* '++j' in line 25.
*
* \param[in] from Original sorted array
* \param[in] from_length Original array length
* \param[in] what Sorted array including the excluding values
* \param[in] what_length self describing
* \param[out] result_length the lenght of the new array - a value lesser 0 indicates an error.
*/
int* exclude(int* from, int from_length, int* what, int what_length, int* result_length){
int i,j,k;
int* result = (int*) malloc(sizeof(int)*from_length);
if(result == NULL){
*result_length = -1;
return NULL;
}
for(i = j = k = 0; i < from_length && j < what_length;){
if(from[i] == what[j])
++i, ++j; /* exclude this element - to enable multiple exclusion drop '++j'
4,4,5,6 /4 --> 5,6 */
if(from[i] < what[j])
result[k++] = from[i++];
if(from[i] > what[j])
++j; /* ignore lesser elements */
}
while(i < from_length)
result[k++] = from[i++];
if( k < from_length){
int* tmp = (int*) realloc(result,sizeof(int)*k);
if(tmp == NULL){
/* either error handling or returning result */
}else{
result = tmp;
}
}
*result_length = k;
return result;
}
int main(){
int a[6] = {1,2,3,4,5,6};
int b[3] = {2,4,5};
int result_length;
int i;
int *c = exclude(a,6,b,3,&result_length);
for(i = 0; i < result_length; ++i)
printf("%i ",c[i]);
free(c);
return 0;
}
This will result in a worst time complexity of O(n+m)
for sorted arrays and O(n log n + m log m)
for non-sorted arrays (sort both, use the function provided above). 这将导致排序数组的O(n+m)
的最差时间复杂度和非排序数组的O(n log n + m log m)
(使用上面提供的函数进行排序)。
it can be done in O(nlogm + m)
where m
is the array you are subtracting from, using binary search (*)If the array is not sorted, a sort will be needed first which will result in O(mlogm + nlogm + m)
它可以在O(nlogm + m)
中完成,其中m
是你要减去的数组,使用二进制搜索 (*)如果数组没有排序,首先需要排序,这将导致O(mlogm + nlogm + m)
Pseudo code: 伪代码:
remove(a1,a2): //a1-a2
for each element x in a2:
i <- binarySearch(a1,x)
if x is in a1:
a1[x] <- NOT_VALID
remove all elements in a1 marked NOT_VALID
(*) You will have to give NOT_VALID
a special value for binary search to keep working, or even simpler: maintain a new array of elements marked as NOT_VALID
instead of actually marking elements. (*)您必须为NOT_VALID
提供二进制搜索的特殊值以保持工作,甚至更简单:维护一个标记为NOT_VALID
的新元素数组,而不是实际标记元素。
Because, you asked for the fastest and simplest I'm going to introduce some assumptions: 因为,你要求最快 最简单的我要介绍一些假设:
eg you have no more than 10 numbers. 例如,你的号码不超过10个。 Then let's treat them as sets for an O(n) solution (where n represents the maximum finite size of the set): 然后让我们将它们视为O(n)解的集合(其中n表示集合的最大有限大小):
// Initialize array1 to [1, 3, 4, 5, 8].
unsigned char array1[10];
memset(array1, 0, 10);
array1[1] = 1;
array1[3] = 1;
array1[4] = 1;
array1[5] = 1;
array1[8] = 1;
// Initialize array2 to [2,4,5].
unsigned char array2[10];
memset(array2, 0, 10);
array2[2] = 1;
array2[4] = 1;
array2[5] = 1;
// Implement array3 = array1 - array2.
unsigned char array3[10];
memset(array3, 0, 10);
for (int i = 0; i < 10; i++)
array3[i] = array1[i] & ~array2[i];
For an even more cheekier answer, if the numbers in your array do not exceed 0-31, you can just simplify the above using unsigned int : 为了更加诙谐的答案,如果数组中的数字不超过0-31,您可以使用unsigned int简化上述操作:
// array1 = 1, 3, 4, 5, 8
unsigned int array1 = (1 << 1) | (1 << 3) | (1 << 4) | (1 << 5) | (1 << 8);
// array2 = 2, 4, 5
unsigned int array2 = (1 << 2) | (1 << 4) | (1 << 5);
// array3 = array1 - array2;
unsigned int array3 = array1 &~ array2;
If a1 does not contain duplicates then you could use a hash set data structure, eg from pblSet . 如果a1不包含重复项,那么您可以使用哈希集数据结构,例如来自pblSet 。 Something like this: 像这样的东西:
PblSet* pSet = pblSetNewHashSet();
pblSetAddAll(pSet, a1);
pblSetRemoveAll(pSet, a2);
int** ppResult = (int**) pblSetToArray(pSet);
// use *ppResult
...
free(ppResult);
pblSetFree(pSet);
The performance should be O(n + m) and the arrays don't need to be sorted. 性能应为O(n + m),并且不需要对数组进行排序。
声明:本站的技术帖子网页,遵循CC BY-SA 4.0协议,如果您需要转载,请注明本站网址或者原文地址。任何问题请咨询:yoyou2525@163.com.