
How to remove duplicates from a list using an auxiliary array in Java?

I am trying to remove duplicates from a list. My approach is to create a temporary array that stores the indices of the duplicates, and then copy the original array into a second temporary array, skipping every index that is stored in the first one.

public void removeDuplicates()
{
    double tempa [] = new double [items.length];
    int counter = 0;
    for ( int i = 0; i< numItems ; i++)
    {
        for(int j = i + 1; j < numItems; j++)
        {
            if(items[i] ==items[j])
            {
                tempa[counter] = j;
                counter++;

            }
        }
    }

    double tempb [] = new double [ items.length];
    int counter2 = 0;
    int j =0;
    for(int i = 0; i < numItems; i++)
    {
        if(i != tempa[j])
        {
            tempb[counter2] = items[i];
            counter2++;

        }
        else
        {
            j++;

        }
    }

    items = tempb;
    numItems = counter2;
}

While the logic seems right, I get an ArrayIndexOutOfBoundsException at runtime on this line:

tempa[counter] = j;

I don't understand how counter could grow beyond items.length. Where is the flaw in the logic?

You are making things quite difficult for yourself. Let Java do the heavy lifting for you. For example, LinkedHashSet gives you uniqueness and retains insertion order. It will also be more efficient than comparing every value with every other value.

double [] input = {1,2,3,3,4,4};
Set<Double> tmp = new LinkedHashSet<Double>();
for (Double each : input) {
    tmp.add(each);
}
double [] output = new double[tmp.size()];
int i = 0;
for (Double each : tmp) {
    output[i++] = each;
}
System.out.println(Arrays.toString(output));

These are written for int arrays, but could easily be converted to double.

1) If you do not care about the original order of the array elements:

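private static int[] withoutDuplicates(int[] a) {
    if (a.length == 0) {
        return new int[0]; // nothing to do; also avoids reading a[-1] below
    }
    int[] sorted = a.clone(); // sort a copy so the caller's array is untouched
    Arrays.sort(sorted);
    int hi = sorted.length - 1;
    int[] result = new int[sorted.length];
    int j = 0;
    for (int i = 0; i < hi; i++) {
        if (sorted[i] == sorted[i + 1]) {
            continue; // keep only the last element of each run of equal values
        }
        result[j] = sorted[i];
        j++;
    }
    result[j++] = sorted[hi]; // the last element always ends a run
    return Arrays.copyOf(result, j);
}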

2) If you do care about the original order of the array elements:

private static int[] withoutDuplicates2(int[] a) {
    HashSet<Integer> keys = new HashSet<Integer>();
    int[] result = new int[a.length];
    int j = 0;
    for (int i = 0 ; i < a.length; i++) {
        if (keys.add(a[i])) {
            result[j] = a[i];
            j++;
        }
    }
    return Arrays.copyOf(result, j);
}

3) If you do not care about the original order of the array elements:

private static Object[] withoutDuplicates3(int[] a) {
    HashSet<Integer> keys = new HashSet<Integer>();
    for (int value : a) {
        keys.add(value);
    }
    return keys.toArray();
}

Imagine this was your input data:

Index: 0, 1, 2, 3, 4, 5, 6, 7, 8
Value: 1, 2, 3, 3, 3, 3, 3, 3, 3

Then, according to your algorithm, tempa would need to be:

Index: 0, 1, 2, 3, 4, 5, 6, 7, 8, ....Exception!!!
Value: 3, 4, 5, 6, 7, 8, 4, 5, 6, 7, 8, 5, 6, 7, 8, 6, 7, 8, 7, 8, 8

Why do you have this problem? Because the first set of nested for loops does nothing to stop you from recording the same duplicate index more than once. Every pair of equal elements produces a write, so a run of equal values generates far more entries than the array has slots.

What is the best solution?

Use a Set! Sets guarantee that they contain no duplicate entries. If you create a new Set and then add all of your array items to it, the Set will prune the duplicates. Then it is just a matter of going back from the Set to an array.
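To make the overflow concrete, here is a small sketch that runs the same nested loops on the input above but only counts the matches instead of storing them. The class and method names (DuplicateIndexCount, countMatches) are made up for this illustration and are not part of the question's code.

```java
// Hypothetical demo class: counts how many writes the question's first loop attempts.
class DuplicateIndexCount {

    // Mirrors the question's nested loops, but only counts the matches
    // instead of storing them, so nothing can overflow.
    static int countMatches(double[] items, int numItems) {
        int counter = 0;
        for (int i = 0; i < numItems; i++) {
            for (int j = i + 1; j < numItems; j++) {
                if (items[i] == items[j]) {
                    counter++; // the original code writes tempa[counter] = j here
                }
            }
        }
        return counter;
    }

    public static void main(String[] args) {
        double[] items = {1, 2, 3, 3, 3, 3, 3, 3, 3};
        int writes = countMatches(items, items.length);
        System.out.println("array length = " + items.length);   // 9
        System.out.println("attempted writes = " + writes);     // 21
    }
}
```

Seven equal values produce 6 + 5 + 4 + 3 + 2 + 1 = 21 matching pairs, so the tenth write is already past the end of a nine-element array.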

Alternatively, here is a very C-like way of doing the same thing:

//duplicates will be a truth table indicating which indices are duplicates.
//initially all values are set to false
boolean duplicates[] = new boolean[items.length];
for ( int i = 0; i< numItems ; i++) {
    if (!duplicates[i]) { //if i is not a known duplicate
        for(int j = i + 1; j < numItems; j++) {
            if(items[i] ==items[j]) {
                duplicates[j] = true; //mark j as a known duplicate
            }
        }
    }
}

I leave it to you to figure out how to finish.
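For readers who want to check their work, one possible way to finish is sketched below. It assumes the question's items and numItems are passed in as parameters; the class name FlagDedup, the method signature, and the uniqueCount bookkeeping are additions for this sketch, not something the answer above prescribes.

```java
import java.util.Arrays;

// Hypothetical wrapper class; only the marking loop comes from the answer above.
class FlagDedup {

    static double[] removeDuplicates(double[] items, int numItems) {
        // duplicates[k] == true means items[k] repeats an earlier element
        boolean[] duplicates = new boolean[numItems];
        int uniqueCount = numItems;

        for (int i = 0; i < numItems; i++) {
            if (!duplicates[i]) { // i is the first occurrence of its value
                for (int j = i + 1; j < numItems; j++) {
                    if (!duplicates[j] && items[i] == items[j]) {
                        duplicates[j] = true; // mark each index at most once
                        uniqueCount--;
                    }
                }
            }
        }

        // Second pass: copy everything that was not marked, in original order.
        double[] result = new double[uniqueCount];
        int next = 0;
        for (int i = 0; i < numItems; i++) {
            if (!duplicates[i]) {
                result[next++] = items[i];
            }
        }
        return result;
    }

    public static void main(String[] args) {
        double[] items = {1, 2, 3, 3, 4, 4};
        // prints [1.0, 2.0, 3.0, 4.0]
        System.out.println(Arrays.toString(removeDuplicates(items, items.length)));
    }
}
```

Because each index is marked at most once, the number of marks can never exceed numItems, which is exactly the property the original index-list approach was missing.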

import java.util.TreeSet;

public class arrayduplication {
    public static void main(String[] args) {
        int arr[] = {1, 5, 1, 2, 5, 2, 10};
        // TreeSet drops duplicates and keeps the values in sorted order
        TreeSet<Integer> set = new TreeSet<Integer>();
        for (int i = 0; i < arr.length; i++) {
            set.add(Integer.valueOf(arr[i]));
        }
        System.out.println(set);
    }
}

You already use numItems to bound your loops. Use the same variable to set the array size for tempa as well.

double tempa [] = new double [numItems];

Instead of doing it with arrays, you can simply use java.util.Set.

Here is an example:

public static void main(String[] args)
{
    Double[] values = new Double[]{ 1.0, 2.0, 2.0, 2.0, 3.0, 10.0, 10.0 };
    Set<Double> singleValues = new HashSet<Double>();

    for (Double value : values)
    {
        singleValues.add(value);
    }
    System.out.println("singleValues: "+singleValues);
    // now convert it into double array
    Double[] dValues = singleValues.toArray(new Double[]{});
}

Here's another alternative that uses no sets, only primitive types:

public static double [] removeDuplicates(double arr[]) {
    double [] tempa = new double[arr.length];
    int uniqueCount = 0;
    for (int i=0;i<arr.length;i++) {
        boolean unique = true;
        for (int j=0;j<uniqueCount && unique;j++) {
            if (arr[i] == tempa[j]) {
                unique = false;
            }
        }
        if (unique) {
            tempa[uniqueCount++] = arr[i];
        }
    }

    return Arrays.copyOf(tempa,  uniqueCount);
}

It does require a temporary array of doubles on the way to the actual result.

You can use a Set to remove the duplicates.
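A minimal sketch of that idea for a double array follows. It assumes Java 8 or later for the stream call; the class name SetDedup and the method name are made up for illustration.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical helper class showing a Set-based round trip for double[].
class SetDedup {

    static double[] removeDuplicates(double[] input) {
        // LinkedHashSet rejects repeated values and remembers insertion order
        Set<Double> seen = new LinkedHashSet<>();
        for (double value : input) {
            seen.add(value); // autoboxed to Double
        }
        // Unbox the surviving values back into a primitive array
        return seen.stream().mapToDouble(Double::doubleValue).toArray();
    }

    public static void main(String[] args) {
        double[] input = {1, 2, 3, 3, 4, 4};
        // prints [1.0, 2.0, 3.0, 4.0]
        System.out.println(Arrays.toString(removeDuplicates(input)));
    }
}
```

One thing to keep in mind: a Set compares boxed values with Double.equals, which treats 0.0 and -0.0 as different and NaN as equal to itself, whereas the == comparison in the array-based versions does the opposite.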

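Notice: the technical posts on this site are published under the CC BY-SA 4.0 license. If you repost them, please cite this site's URL or the original address. For any questions, contact: yoyou2525@163.com.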

 