
Removing spaces, numbers and special characters from a string

I am writing a function to remove spaces from a String passed as an argument.

This code works:

public static String removeSpecialChars(String str) {
    String finalstr = "";
    char[] arr = str.toCharArray();
    char ch;
    for (int i = 0; i < arr.length; i++) {
        ch = arr[i];
        if (Character.isLetter(ch))
            finalstr = finalstr.concat(String.valueOf(ch));
        else
            continue;
    }
    return finalstr;
}

And the output for the String 'hello world!' is as follows:

helloworld

But this one doesn't:

public static String removeSpecialChars(String str) {
    char[] arr = str.toCharArray();
    char[] arr2 = new char[str.length()];
    char ch;
    for (int i = 0; i < arr.length; i++) {
        ch = arr[i];
        if (Character.isLetter(ch))
            arr2[i] = ch;
    }
    return String.valueOf(arr2);
}

Output:

hello world

I get almost the same String back as output; only the exclamation mark is removed. What could be the reason for this? Any help would be appreciated.

A char value is just a numeric value in the range 0 to 2¹⁶−1. In hexadecimal (base 16), we write that as 0000 to ffff.

So, knowing that each char array is a sequence of numeric values, let's look at the state of each array as your program proceeds. (I'm showing each value as two hex digits, rather than four, for brevity, since they are all in the range 00–ff.)

char [] arr = str.toCharArray();

// [ 68 65 6c 6c 6f 20 77 6f 72 6c 64 21 ]
// (UTF-16 values for the characters in "hello world!")

char [] arr2 = new char[str.length()];

// [ 00 00 00 00 00 00 00 00 00 00 00 00 ]
// (a newly created char array is always filled with zeroes)

char ch;
for (int i = 0; i < arr.length; i++) {
    ch = arr[i];
    if (Character.isLetter(ch))
        arr2[i] = ch;
}

// arr2 after first loop iteration:
// [ 68 00 00 00 00 00 00 00 00 00 00 00 ]

// arr2 after second loop iteration:
// [ 68 65 00 00 00 00 00 00 00 00 00 00 ]

// arr2 after third loop iteration:
// [ 68 65 6c 00 00 00 00 00 00 00 00 00 ]

// arr2 after fourth loop iteration:
// [ 68 65 6c 6c 00 00 00 00 00 00 00 00 ]

// arr2 after fifth loop iteration:
// [ 68 65 6c 6c 6f 00 00 00 00 00 00 00 ]

// During the sixth loop iteration (i = 5),
// the if-condition is not met, so arr2[5]
// is never changed at all!
// [ 68 65 6c 6c 6f 00 00 00 00 00 00 00 ]

// arr2 after seventh loop iteration:
// [ 68 65 6c 6c 6f 00 77 00 00 00 00 00 ]

// During the twelfth and final loop iteration (i = 11),
// the if-condition is not met, so arr2[11]
// is never changed at all!
// [ 68 65 6c 6c 6f 00 77 6f 72 6c 64 00 ]

I don't know how you're examining the returned string, but here is what's actually in it:

"hello\u0000world\u0000"

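One way to check this yourself is to print each char as a number instead of relying on how a console renders it (some consoles show U+0000 as a blank, or as nothing at all). This is only a minimal sketch: the class name ShowCodeUnits and the helper toHex are hypothetical names introduced for illustration, not part of the original code.

```java
public class ShowCodeUnits {
    // Hypothetical helper: renders every char of s as a four-digit hex value,
    // separated by spaces, so invisible characters become visible.
    public static String toHex(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            if (i > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%04x", (int) s.charAt(i)));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // The string described above: two NUL chars where the space and '!' were.
        String result = "hello\u0000world\u0000";
        System.out.println(result.length()); // 12, not 10
        System.out.println(toHex(result));
    }
}
```

The length alone already shows the problem: a string holding only the ten letters would have length 10.
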
As Johnny Mopp pointed out, since you want to skip some characters, you need to use two index variables, and when you create the String at the end, you need to use that second index variable to limit how many characters you use to create the string.
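
The two-index approach described above might look roughly like this. It is a sketch rather than the only possible fix: the class name TwoIndexDemo and the variable name j are assumptions made for illustration.

```java
public class TwoIndexDemo {
    public static String removeSpecialChars(String str) {
        char[] arr = str.toCharArray();
        char[] arr2 = new char[arr.length];
        int j = 0; // second index: next free position in arr2
        for (int i = 0; i < arr.length; i++) {
            char ch = arr[i];
            if (Character.isLetter(ch)) {
                arr2[j] = ch; // write at j, not at i, so no gaps are left
                j++;
            }
        }
        // Build the String from the first j chars only,
        // ignoring the unused zero-filled tail of arr2.
        return String.valueOf(arr2, 0, j);
    }

    public static void main(String[] args) {
        System.out.println(removeSpecialChars("hello world!")); // helloworld
    }
}
```

Because j only advances when a letter is copied, the kept characters are packed at the front of arr2, and String.valueOf(arr2, 0, j) drops whatever zeroes remain after them.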

Since Java 8 you can use the codePoints method:

public static void main(String[] args) {
    System.out.println(removeSpecialChars("hello world!")); // helloworld
    System.out.println(removeSpecialChars("^&*abc123_+"));  // abc
    System.out.println(removeSpecialChars("STRING"));       // STRING
    System.out.println(removeSpecialChars("Слово_Йй+ёЁ"));  // СловоЙйёЁ
}
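
public static String removeSpecialChars(String str) {
    return str.codePoints()
            // keep only alphabetic code points; staying on the IntStream
            // avoids a narrowing cast to char, which would corrupt
            // characters outside the Basic Multilingual Plane
            .filter(Character::isAlphabetic)
            // append each remaining code point to a StringBuilder
            .collect(StringBuilder::new,
                    StringBuilder::appendCodePoint,
                    StringBuilder::append)
            // convert the accumulated characters into a String
            .toString();
}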

See also: How do I count the parentheses in a string?


 