Remove duplicates from an array in linear time, without extra arrays

Question

We have an unsorted array, and we know its values are in the range [0,n].

We want to remove duplicates, but we cannot use extra arrays and it must run in linear time.

Any ideas? Just to clarify, this is not for homework!

Answer 1

If the integers are limited to the range 0 to n, you can move through the array, placing numbers by their indices. Every time you replace a number, take the value that used to be there and move it to where it should be. For instance, let's say we have an array of size 8:

-----------------
|3|6|3|4|5|1|7|7|
-----------------
 S

Where S is our starting point, and we'll use C to keep track of our "current" index below. We start at index 0 and move 3 to index 3, where 4 is. Save 4 in a temp var. The vacated slot is marked X.

-----------------
|X|6|3|3|5|1|7|7|   Saved 4 
-----------------  
 S     C

We then put 4 at index 4, saving what used to be there, 5.

-----------------
|X|6|3|3|4|1|7|7|   Saved 5
-----------------
 S       C

Keep going:

-----------------
|X|6|3|3|4|5|7|7|   Saved 1
-----------------
 S         C

-----------------
|X|1|3|3|4|5|7|7|   Saved 6
-----------------
 S C

-----------------
|X|1|3|3|4|5|6|7|   Saved 7    
-----------------
 S           C

When we try to place 7, we see a conflict (index 7 already holds 7), so we simply don't place it. We then continue from the starting index S, incrementing it by 1:

-----------------
|X|1|3|3|4|5|6|7| 
-----------------  
   S

1 is fine here; 3 needs to move:

-----------------
|X|1|X|3|4|5|6|7|
-----------------
     S

But 3 is a duplicate, so we throw it away and keep iterating through the rest of the array.

So basically, we move each entry at most once and iterate through the entire array. That's O(2n) = O(n).
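The walkthrough above can be sketched in C roughly as follows. This is only one possible reading of the answer, under a few assumptions that are not in the original: the values are in [0, n-1] so every value has a home index, the "X" slots are represented by a hypothetical EMPTY marker of -1, and a final compaction pass (also not in the answer) moves the survivors to the front. The function name is made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

#define EMPTY (-1) /* hypothetical stand-in for the "X" slots in the diagrams */

/* Assumes every a[i] is in [0, n-1]. Returns the number of distinct values,
   which end up in ascending order in a[0 .. result-1]. */
size_t dedup_by_placement(int *a, size_t n)
{
    for (size_t i = 0; i < n; i++)
        assert(a[i] >= 0 && (size_t)a[i] < n);

    for (size_t s = 0; s < n; s++) {
        if (a[s] == EMPTY || (size_t)a[s] == s)
            continue;              /* slot is vacated or already correct */

        int v = a[s];              /* value that needs to go home */
        a[s] = EMPTY;              /* mark the starting slot as X */

        /* Follow the chain: drop v at index v, pick up whatever was there.
           Stop on a conflict (a[v] == v, so v is a duplicate) or when the
           displaced "value" is an empty slot. */
        while (v != EMPTY && a[v] != v) {
            int displaced = a[v];
            a[v] = v;
            v = displaced;
        }
    }

    /* Compaction: keep only the slots that hold a value. */
    size_t count = 0;
    for (size_t i = 0; i < n; i++)
        if (a[i] != EMPTY)
            a[count++] = a[i];
    return count;
}
```

Each pass through the inner loop puts one value permanently at its home index, so the total number of moves is at most n, which matches the O(2n) = O(n) estimate.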

Answer 2

Assume int a[n] is an array of integers in the range [0,n-1]. Note that this differs slightly from the stated problem, but I make this assumption to make clear how the algorithm works. The algorithm can be patched up to work for integers in the range [0,n].

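int j, k;

for (int i = 0; i < n; i++)
{
    // Keep swapping until a[i] holds i, or until a[i]'s home slot
    // already holds a copy of it (a[i] is then a duplicate).
    while (a[i] != i && a[a[i]] != a[i])
    {
        j = a[i];
        k = a[j];
        a[j] = j;  // Swap a[j] and a[i]; j is now in its home slot
        a[i] = k;
    }
}

// Every value v present in the input now sits at a[v].
for (int i = 0; i < n; i++)
{
    if (a[i] == i)
    {
        printf("%d\n", i);
    }
}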
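The answer does not say how to patch the algorithm for the range [0,n]. Here is one possible sketch, purely as an assumption on my part: remember whether n occurs, overwrite every n with some other value that is already in the array (which only creates an extra duplicate), run the [0,n-1] placement, and append n at the end if it was seen. The function name and the compaction step are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Assumes a has n entries, each in [0, n]. Returns the number of distinct
   values, which end up in ascending order in a[0 .. result-1]. */
size_t dedup_range_0_to_n(int *a, size_t n)
{
    int top = (int)n;
    int has_top = 0;   /* did the value n occur? */
    int filler = -1;   /* some value other than n that is in the array */

    for (size_t i = 0; i < n; i++) {
        assert(a[i] >= 0 && a[i] <= top);
        if (a[i] == top)
            has_top = 1;
        else
            filler = a[i];
    }
    if (filler < 0) {              /* array is empty or contains only n */
        if (n == 0)
            return 0;
        a[0] = top;
        return 1;
    }

    /* Replace every n by the filler; now all values are in [0, n-1]. */
    for (size_t i = 0; i < n; i++)
        if (a[i] == top)
            a[i] = filler;

    /* Swap values toward their home index; each swap fixes one slot. */
    for (size_t i = 0; i < n; i++) {
        while ((size_t)a[i] != i && a[a[i]] != a[i]) {
            int j = a[i];
            a[i] = a[j];
            a[j] = j;
        }
    }

    /* Keep the values sitting at their own index, then re-append n. */
    size_t count = 0;
    for (size_t i = 0; i < n; i++)
        if ((size_t)a[i] == i)
            a[count++] = a[i];
    if (has_top)
        a[count++] = top;          /* safe: the filler left a free slot */
    return count;
}
```

The extra state is two scalars rather than an array, so this stays within the "no extra arrays" constraint.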

Answer 3

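#include <stdio.h>
#include <stdlib.h>

/* Prints values that occur more than once, using the sign of arr[v] as a
   "seen v" flag. Assumes every value is in [1, size-1]: a slot holding 0
   cannot be flagged because -0 == 0. A value that occurs k times is printed
   k-1 times, and the array is left with flipped signs. */
void printRepeating(int arr[], int size)
{
  int i;
  printf("The repeating elements are: \n");
  for(i = 0; i < size; i++)
  {
    if(arr[abs(arr[i])] >= 0)
      arr[abs(arr[i])] = -arr[abs(arr[i])];
    else
      printf(" %d ", abs(arr[i]));
  }
}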

Answer 4

Walk through the array: if array[array[i]] is not negative, assign array[array[i]] = -array[array[i]]; if it is already negative, then array[i] is a duplicate. Use the absolute value of array[i] as the index, since that entry may itself have been negated earlier. This works because all values are between 0 and n.
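A minimal sketch of this sign-marking idea in C, with two deviations that are my own assumptions rather than part of the answer: values are taken to be in [0, n-1] so they are valid indices, and a slot is marked by storing -x - 1 instead of -x, because plain negation cannot mark a slot that holds 0. The function name is hypothetical, and the second pass that collects the result is an addition.

```c
#include <assert.h>
#include <stddef.h>

/* Assumes every a[i] is in [0, n-1]. Returns the number of distinct values,
   which end up in ascending order in a[0 .. result-1]. */
size_t dedup_by_sign_marking(int *a, size_t n)
{
    for (size_t i = 0; i < n; i++)
        assert(a[i] >= 0 && (size_t)a[i] < n);

    /* Pass 1: for each value v, make a[v] negative to record "v was seen".
       A marked slot stores -x - 1, so the original x is still recoverable. */
    for (size_t i = 0; i < n; i++) {
        int v = a[i] < 0 ? -a[i] - 1 : a[i];  /* decode if already marked */
        if (a[v] >= 0)
            a[v] = -a[v] - 1;                 /* first occurrence of v */
        /* else: v was seen before, so this occurrence is a duplicate */
    }

    /* Pass 2: index v is negative exactly when v occurred. Writing to
       a[count] is safe because count never runs ahead of v. */
    size_t count = 0;
    for (size_t v = 0; v < n; v++)
        if (a[v] < 0)
            a[count++] = (int)v;
    return count;
}
```

As with the original suggestion, the input order is lost; only the set of distinct values survives.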

Answer 5

Extending @Joel Lee's code for completion.

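#include <iostream>

// Repeatedly swap each value toward its home index (value v belongs at a[v])
// until a full pass makes no swap. Assumes every value is in [0, size-1].
void remove_duplicates(int *a, int size)
{
    int i, j, k;
    bool swap = true;

    while (swap) {
        swap = false;
        for (i = 0; i < size; i++) {
            // Skip values already at home and duplicates whose home is taken.
            if (a[i] != i && a[i] != a[a[i]]) {
                j = a[i];
                k = a[j];
                a[i] = k;
                a[j] = j;
                swap = true;
            }
        }
    }
}

int main()
{
    //int array[8] = {3,6,3,4,5,1,7,7};
    int array[8] = {7,4,6,3,5,4,6,2};

    remove_duplicates(array, sizeof(array)/sizeof(int));

    // The distinct values are exactly the entries sitting at their own index.
    for (int i = 0; i < 8; i++)
        if (array[i] == i)
            std::cout << array[i] << " ";

    return 0;
}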

Answer 6

Can you sort? Sort with radix sort (http://en.wikipedia.org/wiki/Radix_sort), which has complexity O(arraySize) for the given case, and then remove duplicates from the sorted array in O(arraySize).
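The answer does not say which radix sort variant it has in mind. As a sketch only, and as my own assumption, the code below uses a binary most-significant-bit radix sort (radix exchange sort), because it partitions in place and so needs no second array; it then removes duplicates from the sorted result in one pass. The function names are hypothetical, and the values are assumed to be non-negative.

```c
#include <assert.h>
#include <stddef.h>

/* Sorts a[lo .. hi-1] by the bit selected by mask and all lower bits. */
static void radix_exchange(int *a, size_t lo, size_t hi, unsigned mask)
{
    if (hi - lo < 2 || mask == 0)
        return;

    /* Partition: entries with the bit clear go left, bit set go right. */
    size_t i = lo, j = hi;
    while (i < j) {
        if ((unsigned)a[i] & mask) {
            j--;
            int t = a[i];
            a[i] = a[j];
            a[j] = t;
        } else {
            i++;
        }
    }
    radix_exchange(a, lo, i, mask >> 1);
    radix_exchange(a, i, hi, mask >> 1);
}

/* Assumes every a[i] is non-negative. Returns the number of distinct
   values, which end up in ascending order in a[0 .. result-1]. */
size_t sort_and_unique(int *a, size_t n)
{
    if (n == 0)
        return 0;

    unsigned max = 0;
    for (size_t i = 0; i < n; i++) {
        assert(a[i] >= 0);
        if ((unsigned)a[i] > max)
            max = (unsigned)a[i];
    }

    /* Highest bit that can be set in any value. */
    unsigned mask = 1;
    while (mask <= max / 2)
        mask <<= 1;

    radix_exchange(a, 0, n, mask);

    /* Remove duplicates from the sorted array: keep each first occurrence. */
    size_t count = 1;
    for (size_t i = 1; i < n; i++)
        if (a[i] != a[count - 1])
            a[count++] = a[i];
    return count;
}
```

The sort does one partition pass per bit, so its cost is proportional to n times the number of bits in the largest value; that is linear in n only when the integer width is treated as a constant. The recursion depth is bounded by the number of bits.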

Answer 7

With ES6 I think this can be solved in only a few lines by reducing the array into an object and then using Object.keys to get an array without duplicates. This probably takes more memory. I'm not sure.

I did it like this:

var obj = array.reduce(function (acc, elem) {
      acc[elem] = true;
      return acc;
    },{});
var uniqueArray = Object.keys(obj);

This has the added bonus (or disadvantage) of sorting the array. It works with strings too. Note that Object.keys returns the keys as strings, so numbers come back as strings.

Answer 8

Use the array as a container, with a negative sign as the indicator; this will corrupt the input, though.

8 answers

Answer 1
12 (accepted) 2011-03-24 05:23:00

Answer 2
3 2011-03-24 05:27:06

Answer 3
3 2011-10-02 06:16:54

Answer 4
0 2012-07-02 05:11:04

Answer 5
0 2016-09-17 21:16:49

Answer 6
0 2011-03-24 05:24:30

Answer 7
0 2020-08-06 09:18:00

Answer 8
0 2022-02-13 19:41:31
