
Replacing all non-alphanumeric characters with empty strings

I tried using this, but it didn't work:

return value.replaceAll("/[^A-Za-z0-9 ]/", "");

Use [^A-Za-z0-9].

Note: I removed the space, since it is not typically considered alphanumeric.

Try:

return value.replaceAll("[^A-Za-z0-9]", "");

or:

return value.replaceAll("[\\W]|_", "");
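The second variant adds |_ because \W is the complement of \w, and \w includes the underscore, so \W on its own leaves underscores in place. A minimal sketch of the difference (the class and method names are made up for illustration):

```java
// Illustrative sketch; UnderscoreDemo and its method names are hypothetical.
final class UnderscoreDemo {
    // \W matches anything outside [A-Za-z0-9_], so underscores survive.
    static String stripNonWord(String s) {
        return s.replaceAll("\\W", "");
    }

    // Adding |_ to the pattern removes underscores as well.
    static String stripNonAlnum(String s) {
        return s.replaceAll("[\\W]|_", "");
    }

    public static void main(String[] args) {
        System.out.println(stripNonWord("a_b-c 1"));  // a_bc1
        System.out.println(stripNonAlnum("a_b-c 1")); // abc1
    }
}
```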

You should be aware that [^a-zA-Z] replaces every character outside the ranges A-Z and a-z. That means characters such as é and ß, Cyrillic letters, and so on will be removed.

If you don't want these characters replaced, use predefined character classes instead:

 str.replaceAll("[^\\p{IsAlphabetic}\\p{IsDigit}]", "");

PS: \\p{Alnum} does not achieve this effect; it acts the same as [A-Za-z0-9].
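To make the difference concrete, here is a small sketch that runs the three patterns from this thread on the same input. The class name, method names, and sample string are assumptions chosen for illustration; the sample is written with Unicode escapes for ß and é so that it does not depend on the source file encoding.

```java
// Illustrative sketch; UnicodeClassDemo and its method names are hypothetical.
final class UnicodeClassDemo {
    // ASCII-only: removes every character outside A-Z, a-z, 0-9.
    static String asciiOnly(String s) {
        return s.replaceAll("[^A-Za-z0-9]", "");
    }

    // POSIX class: by default it covers only ASCII letters and digits.
    static String posixAlnum(String s) {
        return s.replaceAll("\\P{Alnum}", "");
    }

    // Unicode properties: keeps letters and digits from any script.
    static String unicodeAware(String s) {
        return s.replaceAll("[^\\p{IsAlphabetic}\\p{IsDigit}]", "");
    }

    public static void main(String[] args) {
        // The word "Strasse" spelled with sharp s, then e-acute, then digits.
        String sample = "Stra\u00dfe \u00e9 123!";
        System.out.println(asciiOnly(sample));    // Strae123
        System.out.println(posixAlnum(sample));   // Strae123
        System.out.println(unicodeAware(sample)); // sharp s and e-acute are kept
    }
}
```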

return value.replaceAll("[^A-Za-z0-9 ]", "");

This will leave spaces intact. I assume that's what you want. Otherwise, remove the space from the regex.

You can also try this simpler regex:

 str = str.replaceAll("\\P{Alnum}", "");

Java regular expressions don't require you to put forward slashes (/) or any other delimiter around the regex, unlike some other languages such as Perl.
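This is likely why the pattern in the question appeared to do nothing: with the slashes included, Java treats them as literal characters, so the pattern only matches a slash, one non-alphanumeric character, and another slash in sequence. A minimal sketch (the class and method names are hypothetical):

```java
// Illustrative sketch; DelimiterDemo and its method names are hypothetical.
final class DelimiterDemo {
    // The pattern from the question: the slashes are matched literally.
    static String withSlashes(String s) {
        return s.replaceAll("/[^A-Za-z0-9 ]/", "");
    }

    // The same character class without delimiters.
    static String withoutSlashes(String s) {
        return s.replaceAll("[^A-Za-z0-9 ]", "");
    }

    public static void main(String[] args) {
        System.out.println(withSlashes("a!b/?/c"));    // a!bc  (only "/?/" matched)
        System.out.println(withoutSlashes("a!b/?/c")); // abc
    }
}
```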

Solution:

value.replaceAll("[^A-Za-z0-9]", "")

Explanation:

[^abc]: when a caret ^ appears as the first character inside square brackets, it negates the pattern. This pattern matches any character except a, b, or c.

Think of the bracket syntax as two functions:

  • [(Pattern)] = match(Pattern)
  • [^(Pattern)] = notMatch(Pattern)

Moreover, regarding the pattern:

  • A-Z = all characters from A to Z

  • a-z = all characters from a to z

  • 0-9 = all characters from 0 to 9

Therefore it replaces every character NOT included in the pattern.
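The match/notMatch idea above can be sketched as follows. The class and method names are invented for this example.

```java
// Illustrative sketch; NegationDemo and its method names are hypothetical.
final class NegationDemo {
    // [abc] matches a, b, or c, so those characters are removed.
    static String removeListed(String s) {
        return s.replaceAll("[abc]", "");
    }

    // [^abc] matches everything except a, b, or c, so only those remain.
    static String removeUnlisted(String s) {
        return s.replaceAll("[^abc]", "");
    }

    // Negated ranges: everything outside A-Z, a-z, 0-9 is removed.
    static String keepAlphanumeric(String s) {
        return s.replaceAll("[^A-Za-z0-9]", "");
    }

    public static void main(String[] args) {
        System.out.println(removeListed("abcdef"));      // def
        System.out.println(removeUnlisted("abcdef"));    // abc
        System.out.println(keepAlphanumeric("A-z 0=9")); // Az09
    }
}
```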

I made this method for creating filenames:

public static String safeChar(String input)
{
    // Characters permitted in the resulting file name
    String allowed = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ-_";
    StringBuilder result = new StringBuilder();
    for (char c : input.toCharArray())
    {
        // Keep the character only if it is in the allowed set
        if (allowed.indexOf(c) >= 0) result.append(c);
    }
    return result.toString();
}
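For comparison, the same allow-list could probably be expressed with a precompiled regex. This is only a sketch, not the answer author's method; the class name and the DISALLOWED constant are assumptions.

```java
import java.util.regex.Pattern;

// Illustrative sketch; SafeCharRegex and DISALLOWED are hypothetical names.
final class SafeCharRegex {
    // Anything that is not a digit, an ASCII letter, '_' or '-'.
    // A trailing '-' inside a character class is taken literally.
    private static final Pattern DISALLOWED = Pattern.compile("[^0-9A-Za-z_-]");

    static String safeChar(String input) {
        return DISALLOWED.matcher(input).replaceAll("");
    }

    public static void main(String[] args) {
        System.out.println(safeChar("my file (1).txt")); // myfile1txt
        System.out.println(safeChar("a-b_c"));           // a-b_c
    }
}
```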

If you also want to allow alphanumeric characters that don't belong to the ASCII character set, such as German umlauts, you can consider using the following solution:

 import java.util.regex.Pattern;

 String value = "your value";

 // This could be a static final constant, so the pattern is compiled only once.
 // Note that \w also matches the underscore, so underscores are kept.
 Pattern pattern = Pattern.compile("[^\\w]", Pattern.UNICODE_CHARACTER_CLASS);

 value = pattern.matcher(value).replaceAll("");

Please note that using the UNICODE_CHARACTER_CLASS flag may impose a performance penalty (see the Javadoc for this flag).

Using Guava, you can easily combine different types of criteria. For your specific case you can use:
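A small sketch of what the flag changes, assuming a sample string with ö and ß (written as Unicode escapes). The class, constant, and method names are hypothetical. In both cases the underscore is kept, because \w includes it.

```java
import java.util.regex.Pattern;

// Illustrative sketch; UnicodeWordDemo and its members are hypothetical names.
final class UnicodeWordDemo {
    // Default behaviour: \w is [A-Za-z0-9_].
    private static final Pattern ASCII_NON_WORD = Pattern.compile("[^\\w]");

    // With the flag, \w also covers non-ASCII letters and digits.
    private static final Pattern UNICODE_NON_WORD =
            Pattern.compile("[^\\w]", Pattern.UNICODE_CHARACTER_CLASS);

    static String stripAscii(String s) {
        return ASCII_NON_WORD.matcher(s).replaceAll("");
    }

    static String stripUnicode(String s) {
        return UNICODE_NON_WORD.matcher(s).replaceAll("");
    }

    public static void main(String[] args) {
        String sample = "Gr\u00f6\u00dfe_1!";
        System.out.println(stripAscii(sample));   // Gre_1
        System.out.println(stripUnicode(sample)); // umlaut and sharp s are kept
    }
}
```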

value = CharMatcher.inRange('0', '9')
        .or(CharMatcher.inRange('a', 'z'))
        .or(CharMatcher.inRange('A', 'Z'))
        .retainFrom(value);

Simple method:

public boolean isBlank(String value) {
    return (value == null || value.equals("") || value.equals("null") || value.trim().equals(""));
}

public String normalizeOnlyLettersNumbers(String str) {
    if (!isBlank(str)) {
        return str.replaceAll("[^\\p{L}\\p{Nd}]+", "");
    } else {
        return "";
    }
}
public static void main(String[] args) {
    String value = " Chlamydia_spp. IgG, IgM & IgA Abs (8006) ";

    System.out.println(value.replaceAll("[^A-Za-z0-9]", ""));

}

Output: ChlamydiasppIgGIgMIgAAbs8006

GitHub: https://github.com/AlbinViju/Learning/blob/master/StripNonAlphaNumericFromString.java

Guava's CharMatcher provides a concise solution:

output = CharMatcher.javaLetterOrDigit().retainFrom(input);
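If adding Guava is not an option, a similar result can probably be had with the JDK alone via Character.isLetterOrDigit. This is a swapped-in technique rather than the Guava API, and it works on code points rather than chars; the class and method names are assumptions.

```java
// Illustrative sketch using only the JDK; LetterOrDigitFilter is a hypothetical name.
final class LetterOrDigitFilter {
    static String retainLettersAndDigits(String input) {
        StringBuilder out = new StringBuilder(input.length());
        input.codePoints()
             .filter(Character::isLetterOrDigit) // Unicode-aware letter/digit test
             .forEach(out::appendCodePoint);
        return out.toString();
    }

    public static void main(String[] args) {
        // "Cafe" with e-acute, followed by punctuation and digits.
        System.out.println(retainLettersAndDigits("Caf\u00e9 #42!"));
    }
}
```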

Dart

If you tried this and it didn't work:

value.replaceAll("[^A-Za-z0-9]", "");

Just use RegExp like this:

value.replaceAll(RegExp("[^A-Za-z0-9]"), "");

