
Removing duplicates from a string

I am trying to remove duplicates from a String in Java. Here is what I have tried:

public void unique(String s)
{
    // put your code here
    char[]newArray = s.toCharArray();

    Set<Character> uniquUsers = new HashSet<Character>();

    for (int i = 0; i < newArray.length; i++) {
        if (!uniquUsers.add(newArray[i]))
            newArray[i] =' '; 
    }
    System.out.println(new String(newArray));
}

The problem is that when I try to remove a duplicate, I end up replacing it with a space. I tried replacing the duplicate with '' but that is not possible, and I can't set the duplicate's position to null either. What is the best way to do this?

If you use regex, you only need one line!

public void unique(String s) {
    System.out.println(s.replaceAll("(.)(?=.*\\1)", ""));
}

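This removes (by replacing with an empty string) every character that is found again later in the input, using a lookahead with a back-reference to the captured character. Note that this keeps the last occurrence of each character rather than the first, so "banana" becomes "bna".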
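To make the behaviour of that lookahead easier to see, here is a minimal sketch. The class and method names (RegexDedup, keepLastOccurrence) are made up for illustration, and the (?s) flag is my own addition so that '.' also matches line breaks; the original one-liner does not have it.

```java
import java.util.regex.Pattern;

final class RegexDedup {
    // (.)       captures one character
    // (?=.*\1)  succeeds only if that same character appears again further right
    // (?s)      DOTALL, an assumption added here so '.' also matches line terminators
    private static final Pattern REPEATED_LATER = Pattern.compile("(?s)(.)(?=.*\\1)");

    /** Drops every character that occurs again later, so the last occurrence survives. */
    static String keepLastOccurrence(String s) {
        return REPEATED_LATER.matcher(s).replaceAll("");
    }

    public static void main(String[] args) {
        System.out.println(keepLastOccurrence("banana")); // bna
        System.out.println(keepLastOccurrence("abcabc")); // abc
    }
}
```

Since the lookahead rescans the rest of the input for each character, this is probably best kept for short strings.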

If I understand your question correctly, perhaps you could try something like:

public static String unique(final String string){
    final StringBuilder builder = new StringBuilder();
    for(final char c : string.toCharArray())
        if(builder.indexOf(Character.toString(c)) == -1)
            builder.append(c);
    return builder.toString();
}

This will work for what you are attempting, although a HashSet does not guarantee that the characters come out in their original order.

// needs java.util.HashSet, java.util.Iterator and java.util.Set
public static void unique(String s) {
    char[] newArray = s.toCharArray();

    // Collect each distinct character once
    Set<Character> uniqueUsers = new HashSet<>();

    for (int i = 0; i < newArray.length; i++) {
        uniqueUsers.add(newArray[i]);
    }

    // Copy the distinct characters into a right-sized array
    newArray = new char[uniqueUsers.size()];
    Iterator<Character> iterator = uniqueUsers.iterator();

    int i = 0;
    while (iterator.hasNext()) {
        newArray[i] = iterator.next();
        i++;
    }

    System.out.println(new String(newArray));
}
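If the original order matters, one option is to swap the HashSet for a LinkedHashSet, which iterates in insertion order. This is a sketch of that substitution, not the answer's own code; the class name OrderedDedup is hypothetical.

```java
import java.util.LinkedHashSet;
import java.util.Set;

final class OrderedDedup {
    /** Returns s with repeated characters dropped, keeping first occurrences in order. */
    static String unique(String s) {
        // LinkedHashSet remembers the order in which elements were first added
        Set<Character> seen = new LinkedHashSet<>();
        for (char c : s.toCharArray()) {
            seen.add(c); // a no-op when c is already present
        }
        StringBuilder out = new StringBuilder(seen.size());
        for (char c : seen) {
            out.append(c);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(unique("hello world")); // helo wrd
    }
}
```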

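You can use a BitSet:

public String removeDuplicateChar(String str){
         // Nothing to deduplicate in a null or empty string
         if(str==null || str.isEmpty()) return str;
         // One bit per character code; assumes every char is below 256
         BitSet b = new BitSet(256);
         for(int i=0;i<str.length();i++){
                  b.set(str.charAt(i));
         }
         // Emit the characters in code order, not in their original order
         StringBuilder s = new StringBuilder();
         for(int i=0;i<256;i++){
                  // isSet belongs to the hand-rolled BitSet; java.util.BitSet calls it get
                  if(b.isSet(i)){
                           s.append((char)i);
                  }
         }
         return s.toString();
}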

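You can roll your own BitSet like below:

 class BitSet {
    int[] numbers;
    BitSet(int k){
        // 32 bits per int, so k bits need (k / 32) + 1 ints
        numbers = new int[(k >> 5) + 1];
    }
    boolean isSet(int k){
        int remainder = k & 0x1F; // bit position inside the int (k % 32)
        int divide = k >> 5;      // index of the int (k / 32)
        // Compare with 0: the masked value is 1 << remainder, which equals 1 only for bit 0
        return (numbers[divide] & (1 << remainder)) != 0;
    }
    void set(int k){
        int remainder = k & 0x1F;
        int divide = k >> 5;
        numbers[divide] = numbers[divide] | (1 << remainder);
    }
}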
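For comparison, the same idea can be sketched with the standard java.util.BitSet, which uses get rather than isSet and grows on demand, so there is no 256 limit. Appending on first sight also keeps the original order. The class name BitSetDedup is made up for this example.

```java
import java.util.BitSet;

final class BitSetDedup {
    /** Keeps the first occurrence of each char, in the order they appear. */
    static String removeDuplicateChars(String str) {
        BitSet seen = new BitSet();          // java.util.BitSet expands as needed
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < str.length(); i++) {
            char c = str.charAt(i);
            if (!seen.get(c)) {              // first time this char is seen
                seen.set(c);
                out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(removeDuplicateChars("mississippi")); // misp
    }
}
```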

Without changing almost anything in your code, replace the line

System.out.println(new String(newArray));

with

System.out.println( new String(newArray).replaceAll(" ", ""));

The added replaceAll call removes the blanks. Be aware that it also removes any spaces that were part of the original input.
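If spaces in the input need to survive, a variation that stays close to the question's code is to compact the array in place instead of blanking duplicates. This is only a sketch; the names CompactDedup, read and write are my own.

```java
import java.util.HashSet;
import java.util.Set;

final class CompactDedup {
    /** Keeps first occurrences by shifting them to the front of the array. */
    static String unique(String s) {
        char[] chars = s.toCharArray();
        Set<Character> seen = new HashSet<>();
        int write = 0; // next free slot for a character we keep
        for (int read = 0; read < chars.length; read++) {
            // add() returns true only the first time a character is seen
            if (seen.add(chars[read])) {
                chars[write++] = chars[read];
            }
        }
        // Only the first 'write' slots hold kept characters
        return new String(chars, 0, write);
    }

    public static void main(String[] args) {
        System.out.println(unique("a b a b")); // a b
    }
}
```

With the input "a b a b" this keeps the first space, whereas blanking duplicates and then calling replaceAll(" ", "") would give "ab".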

import java.util.*;

class StrDup{

    public static void main(String[] args){

        String s = "abcdabacdabbbabbbaaaaaaaaaaaaaaaaaaabbbbbbbbbbdddddddddcccccc";
        String dup = removeDupl(s);
        System.out.println(dup); // abcd

    }

    public static String removeDupl(String s){

        StringBuilder sb = new StringBuilder(s);
        String ch = "";

        for(int i = 0; i < sb.length(); i++){
            // The character whose later duplicates we want to delete
            ch = sb.substring(i,i+1);
            int j = i+1;
            int k = 0;

            while(sb.indexOf(ch,j)!=-1){
                k = sb.indexOf(ch,j);
                sb.deleteCharAt(k);
                // The rest shifted left by one, so search again from k
                j = k;
            }
        }

        return sb.toString();
    }
}

In the code above, I'm doing the following tasks.

  1. I first convert the string to a StringBuilder . Strings in Java are immutable, which means they are like CDs: you can't change them once they are created. The only thing that can happen to them is the end of their life cycle, when the garbage collector reclaims them, but that's a whole different topic. For example:

     String s = "Tanish"; s + " is a good boy";

    This will not change s (as a bare statement it does not even compile, because s + "..." on its own is not a valid statement in Java). String s is still Tanish . To make the second line of code take effect, you have to assign the result to some variable, like this:

    s = s + " is a good boy";

    And, make no mistake! I said strings are immutable, and here I am reassigning s with some new string. But it's a NEW string. The original string Tanish is still there, somewhere in the pool of strings. Think of it like this: the string that you create is immutable. Tanish is immutable, but s is a reference variable. It can refer to anything in the course of its life. So, Tanish and Tanish is a good boy are 2 separate strings, but s now refers to the latter instead of the former.

  2. StringBuilder is another way of building strings in Java, and it is mutable: you can change its contents. So, if Tanish is held in a StringBuilder , it is open to every kind of operation (append, insert, delete, etc.).

  3. Now we have the StringBuilder sb , which holds the same characters as the String s .

  4. I've used a built-in StringBuilder method, i.e. indexOf() . This method finds the index of the character I'm looking for. Once I have the index, I delete the character at that index.

    Remember, StringBuilder is mutable. And that's the reason I can delete the characters.

  5. indexOf is overloaded to accept 2 arguments ( sb.indexOf(substr, index) ). This returns the position of the first occurrence of substr within sb , starting the search at index.

    In the example string, sb.indexOf("a", 1) will give me 4 . All I'm saying to Java is, "Return the index of 'a', but start looking for it from index 1." This way the very first a at index 0, which I don't want to get rid of, is skipped.

  6. Now all I'm doing inside the for loop is extracting the character at the i-th position. j represents the position from where to start looking for the extracted character. This is important, so that we don't lose the one occurrence we need to keep. k represents the result of indexOf("a", j) , i.e. the first occurrence of a at or after index j .

  7. That's pretty much it. indexOf(...) returns -1 if it can't find the specified character (or string, as I mentioned before). So, as long as a duplicate of the character ch is still in the string, we obtain its position ( k ), delete it using deleteCharAt(k) and update j to k , i.e. the next duplicate (if it exists) will appear at or after k , where the last one was found.

  8. DEMONSTRATION :

    In the example I took, let's say we want to get rid of the duplicate c's. The very first c is at index 2, so we start looking for the next c from index 3.

    sb.indexOf("c",3) will give us 7, where a c is lying. So, k = 7 . Delete it, and then set j to k . Now, j = 7 . After deleting the character, the rest of the string shifts left by 1. So, at position 7 we now have d , which was at 8 before. Now, k = indexOf("c",7) and the entire cycle repeats. Also, remember that indexOf("c",j) starts looking right at j , which means that if c is found at j , it returns j . That's why, when we extracted the first character, we started looking one position after the character's own position.
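The index arithmetic from the demonstration can be checked with a short sketch. It uses only the first nine characters of the example string ("abcdabacd"), and the helper deleteFrom is a hypothetical name that wraps the inner while loop of removeDupl.

```java
final class IndexOfTrace {
    /** Deletes every occurrence of ch found at or after index from. */
    static String deleteFrom(String s, String ch, int from) {
        StringBuilder sb = new StringBuilder(s);
        int j = from;
        int k;
        while ((k = sb.indexOf(ch, j)) != -1) {
            sb.deleteCharAt(k); // everything after k shifts left by one
            j = k;              // so the next search starts at the same index
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        StringBuilder sb = new StringBuilder("abcdabacd");
        System.out.println(sb.indexOf("a", 1)); // 4, the a at index 0 is skipped
        System.out.println(sb.indexOf("c", 3)); // 7
        sb.deleteCharAt(7);
        System.out.println(sb.charAt(7));       // d, which used to be at index 8
        System.out.println(deleteFrom("abcdabacd", "c", 3)); // abcdabad
    }
}
```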

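import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class Duplicates {

    public static void main(String[] args) {
        String str = "aabbccddeeff";

        // Split into one-character strings (Java 8 or later)
        String[] str1 = str.split("");
        List<String> list = Arrays.asList(str1);

        // distinct() keeps the first occurrence of each element
        List<String> newStr = list.stream().distinct().collect(Collectors.toList());

        System.out.print(newStr);
    }
}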
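The snippet above prints a List rather than a String. If a String result is wanted, one possible variation is to stream the characters directly and join them back together. This is a sketch assuming Java 8 or later; the class name StreamDedup is made up.

```java
import java.util.stream.Collectors;

final class StreamDedup {
    /** Keeps the first occurrence of each char and joins the result into a String. */
    static String unique(String s) {
        return s.chars()                                 // IntStream of char values, in order
                .distinct()                              // drops repeats, keeps encounter order
                .mapToObj(c -> String.valueOf((char) c)) // back to one-character strings
                .collect(Collectors.joining());
    }

    public static void main(String[] args) {
        System.out.println(unique("aabbccddeeff")); // abcdef
    }
}
```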
