
Remove array duplicates

I'm trying to remove duplicates from an array, but it is not working.

Am I missing something?

Code:

class RemoveStringDuplicates {

    public static char[] removeDups(char[] str) {
        boolean bin_hash[] = new boolean[256];
        int ip_ind = 0, res_ind = 0;
        char temp;

        while (ip_ind < str.length) {
            temp = str[ip_ind];
            if (bin_hash[temp] == false) {
                bin_hash[temp] = true;
                str[res_ind] = str[ip_ind];
                res_ind++;
            }
            ip_ind++;
        }

        return str;
    }

    public static void main(String[] args) {
        char str[] = "test string".toCharArray();
        System.out.println(removeDups(str));
    }
}

Output:

tes ringing   // "ing" should not have been repeated!

Instead of assigning the characters into the same array, you should use a new array. After the duplicates are removed, the trailing elements of the original array are still there, and so they get printed.

If you use a new array, the trailing elements will be null characters instead.

So, just create a new array:

char[] unique = new char[str.length];

And then change the assignment:

str[res_ind] = str[ip_ind];

to:

unique[res_ind] = str[ip_ind];
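One caveat with a new array of the same length: its unused tail still holds null characters ('\u0000'), and those are printed too, even if most terminals show nothing for them. A minimal sketch, assuming you want the returned array to hold only the unique characters, trims it with Arrays.copyOf. The class name TrimmedDedup is made up for illustration, and like the code in the question it assumes every character is below 256:

```java
import java.util.Arrays;

class TrimmedDedup {

    static char[] removeDups(char[] str) {
        boolean[] seen = new boolean[256];     // assumption: all chars are < 256, as in the question
        char[] unique = new char[str.length];  // worst case: no duplicates at all
        int resInd = 0;

        for (char c : str) {
            if (!seen[c]) {
                seen[c] = true;
                unique[resInd++] = c;
            }
        }
        // Keep only the filled prefix so no '\u0000' padding is returned.
        return Arrays.copyOf(unique, resInd);
    }

    public static void main(String[] args) {
        System.out.println(removeDups("test string".toCharArray())); // prints "tes ring"
    }
}
```

The returned array has length 8 for "test string" instead of 11, so callers never see the padding.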

Also, you can consider using an ArrayList instead of an array. That way you won't have to maintain a boolean array with an entry for every possible character, which wastes space you don't need. With an ArrayList, you can use the contains method to check whether a character has already been added. Note that contains scans the list linearly, which is fine for short inputs.
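As a rough sketch of the ArrayList idea (the class name ListDedup is hypothetical), the list itself doubles as the "already seen" record:

```java
import java.util.ArrayList;
import java.util.List;

class ListDedup {

    static List<Character> removeDups(char[] str) {
        List<Character> unique = new ArrayList<>();
        for (char c : str) {
            // c is autoboxed to Character; contains() compares with equals().
            if (!unique.contains(c)) {
                unique.add(c);
            }
        }
        return unique;
    }

    public static void main(String[] args) {
        System.out.println(removeDups("test string".toCharArray())); // prints [t, e, s,  , r, i, n, g]
    }
}
```

This keeps the first occurrence of each character in its original order and has no 256-character limit, at the cost of boxing every character.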

You can also avoid all that manual bookkeeping by using a Set, which automatically removes duplicates for you. However, most implementations (HashSet, for example) do not maintain insertion order. For that you can use LinkedHashSet.

The specific problem has already been solved, but if you are not restricted to using your own method and can use the Java libraries, I would suggest something like this:

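import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class RemoveDuplicates {

// Primitives must be wrapped (char -> Character) to be used with generics.
// Java does not support generic array creation, so a List is returned instead.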

public static <T> List<T> removeDuplicatesFromArray(T[] array) {
    Set<T> set = new LinkedHashSet<>(Arrays.asList(array));
    return new ArrayList<>(set);
}

public static void main(String[] args) {
    String s = "Helloo I am a string with duplicates";
    Character[] c = new Character[s.length()];

    for (int i = 0; i < s.length(); i++) {
        c[i] = s.charAt(i);
    }

    List<Character> noDuplicates = removeDuplicatesFromArray(c);
    Character[] noDuplicatesArray = new Character[noDuplicates.size()];
    noDuplicates.toArray(noDuplicatesArray);

    System.out.println("List:");
    System.out.println(noDuplicates);
    System.out.println("\nArray:");
    System.out.println(Arrays.toString(noDuplicatesArray));
}
}

Output:

List:
[H, e, l, o,  , I, a, m, s, t, r, i, n, g, w, h, d, u, p, c]

Array:
[H, e, l, o,  , I, a, m, s, t, r, i, n, g, w, h, d, u, p, c]

The LinkedHashSet retains insertion order, which can be especially important for things like character arrays.
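If you want a String back rather than a List<Character>, the same LinkedHashSet idea can be sketched without the generic helper. The class name SetDedup is only illustrative:

```java
import java.util.LinkedHashSet;
import java.util.Set;

class SetDedup {

    static String removeDups(String input) {
        // LinkedHashSet drops repeats and remembers first-insertion order.
        Set<Character> seen = new LinkedHashSet<>();
        for (char c : input.toCharArray()) {
            seen.add(c);
        }

        StringBuilder sb = new StringBuilder(seen.size());
        for (char c : seen) {
            sb.append(c);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(removeDups("test string")); // prints "tes ring"
    }
}
```

Calling toCharArray() on the result gives a char[] of exactly the right length.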

Try this:

public static char[] removeDups(char[] str) {
        boolean bin_hash[] = new boolean[256];
        int ip_ind = 0, res_ind = 0;
        char temp;
        char a[] = new char[str.length];

        while (ip_ind < str.length) {
            temp = str[ip_ind];
            if (bin_hash[temp] == false) {
                bin_hash[temp] = true;
                a[res_ind] = str[ip_ind];
                res_ind++;
            }
            ip_ind++;
        }

        return a;
    }

You are basically updating the str array inside the loop while you are still looping over it, so the rest of the loop runs on the updated array.

I believe the problem is caused by the fact that you are iterating over str while you are modifying it (in the line str[res_ind] = str[ip_ind]). If you copy the result to another array, it works:

class RemoveStringDuplicates {

    public static char[] removeDups(char[] str) {
        char result[] = new char[str.length];
        boolean bin_hash[] = new boolean[256];
        int ip_ind = 0, res_ind = 0;
        char temp;

        while (ip_ind < str.length) {
            temp = str[ip_ind];
            if (bin_hash[temp] == false) {
                bin_hash[temp] = true;
                result[res_ind] = str[ip_ind];
                res_ind++;
            }
            ip_ind++;
        }

        return result;
    }

    public static void main(String[] args) {
        char str[] = "test string".toCharArray();
        System.out.println(removeDups(str));
    }
}

All the other answers seem to be correct. The "ing" that you see at the end of the result is actually untouched characters that were already in the array.

As an alternative solution (if you want to conserve memory), you can loop over the last part of the array and clear the characters at the end, because you already know they are duplicates.

// Clear everything after the last unique character
// (res_ind is the number of unique characters written so far).
for (int delChars = res_ind; delChars < str.length; delChars++) {
    str[delChars] = '\0';
}
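Another way to stay in place, sketched here under the same assumption as the question (characters below 256), is to return the number of unique characters and let the caller use only that prefix. The names InPlaceDedup and compact are made up for this example:

```java
class InPlaceDedup {

    // Moves the unique characters to the front of str and returns how many there are.
    // Elements from the returned index onward are leftovers and should be ignored.
    static int compact(char[] str) {
        boolean[] seen = new boolean[256]; // assumption: all chars are < 256
        int resInd = 0;
        for (int ipInd = 0; ipInd < str.length; ipInd++) {
            char c = str[ipInd];
            if (!seen[c]) {
                seen[c] = true;
                str[resInd++] = c; // resInd never overtakes ipInd, so no unread data is lost
            }
        }
        return resInd;
    }

    public static void main(String[] args) {
        char[] str = "test string".toCharArray();
        int n = compact(str);
        System.out.println(new String(str, 0, n)); // prints "tes ring"
    }
}
```

No second array is allocated, and nothing past the returned length is ever printed.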

You are totally abusing the Java language with your code. The data structure classes in the standard libraries are the main point of using Java. Use them.

The correct way to code something that does what you want is here:

import java.util.HashSet;

class RemoveStringDuplicates {

    public static String removeDups(CharSequence str) {

        StringBuilder b = new StringBuilder(str);
        HashSet<Character> s = new HashSet<Character>();

        for (int idx = 0; idx < b.length(); idx++) {
            char ch = b.charAt(idx);
            if (s.contains(ch))
                b.deleteCharAt(idx--); // step back so the character shifted into idx is checked too
            else
                s.add(ch);
        }

        return b.toString();
    }

    public static void main(String[] args) {
        System.out.println(removeDups("test string"));
    }
}

There are probably even better ways of doing it, too. Don't avoid Java's data structures.
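One such alternative, offered only as a sketch and assuming Java 8 or later, uses the stream returned by String.chars() together with distinct(), which keeps the first occurrence of each value for an ordered stream. The class name StreamDedup is hypothetical:

```java
class StreamDedup {

    static String removeDups(String input) {
        return input.chars()          // IntStream of UTF-16 code units
                .distinct()           // drops repeats, keeps encounter order
                .collect(StringBuilder::new,
                         StringBuilder::appendCodePoint,
                         StringBuilder::append)
                .toString();
    }

    public static void main(String[] args) {
        System.out.println(removeDups("test string")); // prints "tes ring"
    }
}
```

Because chars() works on UTF-16 code units, characters outside the Basic Multilingual Plane would need codePoints() instead.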

If you are writing code that is so performance-sensitive that you have to use primitive code like that in your question, you should be using a different language, like C.
