
Iterating over a sorted array to remove duplicates

Let me start by saying this is a homework question that I am having trouble with.

I have sorted an array, and what I need to do is use another array to remove duplicates by iterating over the first one, comparing adjacent items, and adding the non-duplicates to the new array. Once I have finished that, I set the old array equal to the new array. I am not used to Java, and I think I am having trouble getting the iteration set up correctly.

public static void main(String[] args) {
    args = new String[] { "data/list1.txt" };
    StdIn.fromFile("data/list2.txt");
    // StdOut.toFile ("finished.txt");
    int[] whitelist = In.readInts(args[0]);

    Arrays.sort(whitelist);
    int newArray[] = new int[whitelist.length];
    for (int i = 0; i < whitelist.length-1; i++) {
        int k = 0;
        if(whitelist[i+1] > whitelist[i])
            newArray[k] = whitelist[i];
            k++;
        StdOut.println(java.util.Arrays.toString(whitelist));
        whitelist = newArray;
        }
    for (int i=0; i<newArray.length;i++){
        StdOut.println(java.util.Arrays.toString(newArray));
    }

This piece of code is part of a larger binary search program, but this is the portion I am having problems with.

Besides not having the duplicates removed, my output also prints out several times.

Any direction would be greatly appreciated.

With limitations such as not using collections, your code can be rewritten this way, and it will work:

    Arrays.sort(whitelist);
    int newArray[] = new int[whitelist.length];
    newArray[0] = whitelist[0];
    int k = 1;
    for (int i = 0; i < whitelist.length - 1; i++) {
        if(whitelist[i+1] > whitelist[i]) {
            newArray[k] = whitelist[i + 1];
            k++;
        }
    }
    newArray = Arrays.copyOf(newArray, k);
    whitelist = newArray;
    System.out.println(Arrays.toString(newArray));
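
The same approach can be wrapped in a method so it is easier to try in isolation. This is only a minimal sketch, not part of the answer above: the class name SortedDedup and the method name removeDuplicates are made up for illustration, and the empty-array guard is an addition, since reading whitelist[0] would throw on an empty input.

```java
import java.util.Arrays;

class SortedDedup {
    // Hypothetical helper: returns a sorted copy of input with duplicates removed.
    static int[] removeDuplicates(int[] input) {
        if (input.length == 0) {
            return new int[0]; // nothing to compare, and input[0] would throw
        }
        int[] sorted = Arrays.copyOf(input, input.length);
        Arrays.sort(sorted);

        int[] unique = new int[sorted.length];
        unique[0] = sorted[0]; // the first element is always kept
        int k = 1;             // next free slot in unique
        for (int i = 0; i < sorted.length - 1; i++) {
            // A strictly larger neighbour marks the start of a new value.
            if (sorted[i + 1] > sorted[i]) {
                unique[k] = sorted[i + 1];
                k++;
            }
        }
        // Trim the unused tail so no padding zeros remain.
        return Arrays.copyOf(unique, k);
    }

    public static void main(String[] args) {
        int[] whitelist = {5, 3, 3, 1, 5, 8};
        System.out.println(Arrays.toString(removeDuplicates(whitelist))); // prints [1, 3, 5, 8]
    }
}
```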

Without braces, the if applies only to the first statement, so k++ runs on every iteration. You should use:

if (whitelist[i+1] > whitelist[i]) {
    newArray[k] = whitelist[i];
    k++;
}

Also, in the first loop you are overwriting whitelist with newArray on every iteration, starting with the very first one. I think you meant to move this outside the for:

StdOut.println(java.util.Arrays.toString(whitelist));
whitelist = newArray;
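
With both fixes applied, the asker's loop might look like the sketch below. Two further changes are assumptions added here and are not part of the answer above: k is declared before the loop (declared inside, it is reset to 0 on every pass), and the last element is copied after the loop, because comparing whitelist[i] with whitelist[i+1] never writes the final value. The class and method names are hypothetical, and the input is assumed to be sorted already.

```java
import java.util.Arrays;

class AdjacentCompare {
    // Hypothetical helper: expects an already sorted array.
    static int[] dedupeSorted(int[] whitelist) {
        int[] newArray = new int[whitelist.length];
        int k = 0; // declared once, outside the loop
        for (int i = 0; i < whitelist.length - 1; i++) {
            // Keep whitelist[i] only when it is the last of its run.
            if (whitelist[i + 1] > whitelist[i]) {
                newArray[k] = whitelist[i];
                k++;
            }
        }
        // The loop never stores the final element, so add it here.
        if (whitelist.length > 0) {
            newArray[k] = whitelist[whitelist.length - 1];
            k++;
        }
        // Reassigning or trimming happens once, after the loop.
        return Arrays.copyOf(newArray, k);
    }
}
```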

It can help to outline the algorithm on paper before you start any work; this also helps when you need to write code in an interview.

Generally speaking, you should correctly define equals() and hashCode() to define what the word "duplicate" means for your objects. But since you are using primitives and wrappers (through boxing/unboxing), you don't have to do that here.

Then you should put your array into a Set collection. All duplicates will be eliminated automatically. After that, convert the Set back to an array.

Java's built-in collections will remove the duplicates for you, so you will not have to do it manually.

    // Arrays.asList(whitelist) on an int[] gives a one-element List<int[]>, so box each value instead.
    Set<Integer> set = new HashSet<>();
    for (int value : whitelist) {
        set.add(value);
    }
    Integer[] whitelistI = set.toArray(new Integer[0]);

If you need an array of primitives, you can copy the values from whitelistI.
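
As an illustration of the Set route ending in a primitive array, here is a small sketch. It swaps in TreeSet instead of HashSet so that the result comes back in ascending order (HashSet makes no ordering guarantee), and it also shows a stream-based variant. The class and method names are hypothetical.

```java
import java.util.Arrays;
import java.util.Set;
import java.util.TreeSet;

class SetDedup {
    // Box each int into a TreeSet, then unbox back into an int[].
    static int[] uniqueSorted(int[] whitelist) {
        Set<Integer> set = new TreeSet<>();
        for (int value : whitelist) {
            set.add(value); // autoboxing; duplicates are ignored by the Set
        }
        int[] result = new int[set.size()];
        int k = 0;
        for (int value : set) {
            result[k++] = value; // auto-unboxing, in ascending order
        }
        return result;
    }

    // Alternative that stays with primitives via IntStream.
    static int[] uniqueViaStream(int[] whitelist) {
        return Arrays.stream(whitelist).distinct().sorted().toArray();
    }
}
```

Either method leaves the input array untouched and returns an array whose length equals the number of distinct values.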

Moreover, this is wrong:

int newArray[] = new int[whitelist.length];

Your new array will be the same length as the original one, but you said you want to delete duplicates. Once the duplicates are gone, the actual number of values is smaller, so the unused slots at the end of your new array will hold default values (for an int array, zeros).
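
One way to avoid the trailing zeros without collections is to count the distinct values first and only then allocate the result. The sketch below assumes the input is already sorted; the class and method names are invented for illustration.

```java
class ExactSize {
    // Counts distinct values in an already sorted array.
    static int countUnique(int[] sorted) {
        if (sorted.length == 0) {
            return 0;
        }
        int count = 1; // the first element always counts
        for (int i = 1; i < sorted.length; i++) {
            if (sorted[i] != sorted[i - 1]) {
                count++;
            }
        }
        return count;
    }

    // Allocates exactly countUnique(sorted) slots and fills them.
    static int[] dedupe(int[] sorted) {
        int[] result = new int[countUnique(sorted)];
        int k = 0;
        for (int i = 0; i < sorted.length; i++) {
            // Copy a value only when it starts a new run.
            if (i == 0 || sorted[i] != sorted[i - 1]) {
                result[k++] = sorted[i];
            }
        }
        return result;
    }
}
```

This costs a second pass over the array; trimming an oversized array with Arrays.copyOf, as in the first answer, is the other common choice.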
