C - Remove Duplicates from an Array

I'm quite new to programming. I wrote some code to remove duplicates from an array. Logically it should work, but it doesn't. I traced through the logic several times and it made sense.

Here's the code:

#include <stdio.h>

int rmDuplicates(int arr[], int n)
{
    int i, j;
    for (i = 0; i < n; i++) {
        if (arr[i] == arr[i + 1]) {
            for (j = i + 1; j < n - 1; j++) {
                arr[j] = arr[j + 1];
            }
            n--;
        }
        return n;
    }
}

int main()
{
    int n, i;
    scanf("%d", &n);
    int arr[n];
    for (i = 0; i < n; i++) {
        scanf("%d", &arr[i]);
    }

    n = rmDuplicates(arr, n);
    for (i = 0; i < n; i++) {
        printf("%d", arr[i]);
    }
    printf("\n%d", n);
    return 0;
}

Your "return n" is in the wrong place: it sits inside the loop body, so the function returns after the first iteration.

for(i=0;i<n;i++) {
    if(arr[i] == arr[i+1]) {
        for(j=i+1;j<n-1;j++) {
            arr[j] = arr[j+1];                              
        }
        n--;
    }
    return n; // <---- this
}
// <-- should be here.

As confirmation, if I move the return n; outside the loop, the code works. But it only removes consecutive duplicates, because you only check arr[i] against its immediate successor, arr[i+1].

(Also, the loop ought to stop at n-1, i.e. use i < n-1, because otherwise on the last iteration arr[n-1+1] is arr[n], which is outside the array.)
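
As a rough sketch of these two fixes taken together (the return moved after the loop, and the bound tightened to n - 1), the function could look like the following. The name rmAdjacentDuplicates is only illustrative and does not come from the question. It still compares each element only with its immediate neighbour, so it is not a complete solution.

```c
/* Illustrative name; not part of the original question. */
int rmAdjacentDuplicates(int arr[], int n)
{
    int i, j;
    /* Stop at n - 1 so that arr[i + 1] is always inside the array. */
    for (i = 0; i < n - 1; i++) {
        if (arr[i] == arr[i + 1]) {
            /* Shift the tail one slot left over the duplicate. */
            for (j = i + 1; j < n - 1; j++) {
                arr[j] = arr[j + 1];
            }
            n--;
        }
    }
    return n; /* returned once, after the whole array has been scanned */
}
```

With the input 1 1 2 3 3 this leaves 1 2 3 and returns 3, while 5 5 5 is only reduced to 5 5, and 1 2 1 is left untouched.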

A final issue is that if you have, say,

                   n
 ...5,..., 5, 5, 6
    i      j

and you check the first 5 against the second, find it a duplicate, and then shift everything that follows by one step, the j-th position will hold a 5 again. But j is then incremented, so you will test the first 5 against the 6 instead of the third 5, and miss the duplicate:

                n
 ...5,..., 5, 6
    i         j

For this reason, when you find a match, you need to rewind j by one and repeat the test:

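int rmDuplicates(int arr[], int n) {
    int i, j, k;
    for (i = 0; i < n - 1; i++) {
        for (j = i + 1; j < n; j++) {
            if (arr[i] == arr[j]) {
                /* arr[j] duplicates arr[i]: shrink the array and
                   shift the tail one slot left over it */
                n--;
                for (k = j; k < n; k++) {
                    arr[k] = arr[k + 1];
                }
                /* re-test position j, which now holds a new value */
                j--;
            }
        }
    }
    return n;
}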

From a performance point of view, the above algorithm is O(n^2): if the array size doubles, the algorithm takes about four times as long; if it triples, about nine times as long.
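
To make the quadratic growth concrete, the sketch below counts how many element comparisons the two nested loops perform when the array holds no duplicates, so nothing is removed. The helper name countComparisons is hypothetical and exists only for this illustration.

```c
/* Hypothetical helper: counts the arr[i] == arr[j] tests that the
   double loop performs on n distinct values (nothing is removed). */
unsigned long countComparisons(unsigned long n)
{
    unsigned long i, j, count = 0;
    for (i = 0; i + 1 < n; i++) {
        for (j = i + 1; j < n; j++) {
            count++;
        }
    }
    return count; /* equals n * (n - 1) / 2 */
}
```

For n = 1000 this gives 499500 comparisons and for n = 2000 it gives 1999000, roughly four times as many.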

A better algorithm would therefore be to first sort the array in place, so that 1 3 2 7 2 3 5 becomes 1 2 2 3 3 5 7 (this costs O(n log n), which grows more slowly, although the original order of the elements is lost). Then you just "compress" the array, skipping duplicates, which is O(n) and gets you 1 2 3 5 7:

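int i, j;
/* arr must already be sorted; i is the last kept slot, j scans ahead */
for (i = 0, j = 1; j < n;) {
    if (arr[i] == arr[j]) {
        j++;
        continue;
    }
    i++;
    if (j != i) {
        /* copy only when the value actually has to move */
        arr[i] = arr[j];
    }
    j++;
}
if (n > 0) {
    n = i+1;
}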
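
Combining the two steps, a minimal sketch of the sort-then-compress approach could use the standard library's qsort for the sorting step. The names compareInts and rmDuplicatesSorted are illustrative, not from the answer, and the sketch assumes it is acceptable to reorder the array.

```c
#include <stdlib.h>

/* Comparison callback for qsort; avoids the overflow that a plain
   subtraction (x - y) could cause for large values. */
static int compareInts(const void *a, const void *b)
{
    int x = *(const int *)a;
    int y = *(const int *)b;
    return (x > y) - (x < y);
}

/* Illustrative name: sorts arr in place, then keeps one copy of each
   value. Returns the new logical length. */
int rmDuplicatesSorted(int arr[], int n)
{
    int i, j;
    if (n < 2) {
        return n; /* nothing to do for empty or single-element arrays */
    }
    qsort(arr, (size_t)n, sizeof arr[0], compareInts);
    for (i = 0, j = 1; j < n; j++) {
        if (arr[i] != arr[j]) {
            arr[++i] = arr[j]; /* first occurrence of a new value */
        }
    }
    return i + 1;
}
```

Called on 1 3 2 7 2 3 5 with n = 7, it returns 5 and leaves 1 2 3 5 7 in the first five slots.
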

#include <stdio.h>

size_t removeDups(int *arr, size_t size)
{
    if(arr && size > 1)
    {
        for(size_t current = 0; current < size - 1; current++)
        {
            size_t original_size = size;
            size_t copypos = current + 1;
            for(size_t cpos = current + 1; cpos < original_size; cpos++)
            {
                if(arr[current] == arr[cpos])
                {
                    if(cpos < original_size -1)
                    {
                        if(arr[current] != arr[cpos + 1])
                        {
                            arr[copypos++] = arr[cpos + 1];
                            cpos++;
                        }
                    }
                    size--;
                }
                else
                {
                    arr[copypos++] = arr[cpos];
                }
            }
        }
    }
    return size;
}

int main(void)
{
    int arr[] = {1,1,1,2,2,3,3,4,5,6,7,1,8,8,2,2,2,2};
    size_t size = sizeof(arr) / sizeof(arr[0]);

    size = removeDups(arr, size);
    for(size_t index = 0; index < size; index++)
    {
        printf("%d\n", arr[index]);
    }
}

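Notice: the technical posts on this site follow the CC BY-SA 4.0 license. If you need to repost, please credit this site's URL or the original address. For any questions, contact: yoyou2525@163.com.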

 