
Split string by array of characters

I want to split a string by an array of characters, so I have this code:

String target = "hello,any|body here?";
char[] delim = {'|',',',' '};
String regex = "(" + new String(delim).replaceAll("(.)", "\\\\$1|").replaceAll("\\|$", ")");
String[] result = target.split(regex);

Everything works fine, except that when I add a character like 'Q' to the delim[] array, it throws an exception:

java.util.regex.PatternSyntaxException: Illegal/unsupported escape sequence near index 11
(\ |\,|\||\Q)

So how can I fix this so that it works with non-special characters as well?

Thanks in advance.

how can I fix that to work with non-special characters as well

Put square brackets around your characters instead of escaping them. If ^ is in your list of characters, make sure it is not the first character, or escape it separately if it is the only character in the list.

Dashes also need special treatment: they need to go at the beginning or at the end of the character class.

String delimStr = new String(delim);
String regex;
if (delimStr.equals("^")) {
    // A lone ^ cannot sit unescaped inside brackets, so escape it directly.
    regex = "\\^";
} else if (delimStr.charAt(0) == '^') {
    // Repeat the second character in front so that ^ is no longer first.
    // This assumes that all characters are distinct.
    // You may need a stricter check to make this work in the general case.
    regex = "[" + delimStr.charAt(1) + delimStr + "]";
} else {
    regex = "[" + delimStr + "]";
}
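A more general variant of the bracket idea can be sketched as below. It is only an illustration, not the answerer's code: the class and method names (CharClassSplit, toCharClass) are made up, the delimiter array is assumed to be non-empty, and the set of characters treated as special inside brackets (\ [ ] ^ - &) is an assumption about what Java character classes need escaped. Letters such as Q are never escaped, which avoids the original exception.

```java
import java.util.Arrays;

class CharClassSplit {
    // Builds a character class such as [|, Q] from the delimiters.
    // Assumes delim is non-empty; "[]" alone is not a valid regex.
    static String toCharClass(char[] delim) {
        StringBuilder sb = new StringBuilder("[");
        for (char c : delim) {
            // Escape only characters that can be special inside brackets.
            if ("\\[]^-&".indexOf(c) >= 0) {
                sb.append('\\');
            }
            sb.append(c);
        }
        return sb.append(']').toString();
    }

    public static void main(String[] args) {
        char[] delim = {'|', ',', ' ', 'Q', '^', '-'};
        String regex = toCharClass(delim);
        System.out.println(regex);
        System.out.println(Arrays.toString("hello,any|body here?".split(regex)));
    }
}
```

Because every risky character is escaped individually, the position of ^ or - in the array no longer matters.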

Using Pattern.quote and putting the result in square brackets seems to work:

String regex = "[" + Pattern.quote(new String(delim)) + "]";
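As a rough, self-contained sketch of that one-liner in use (the class name QuotedClassSplit and the helper splitByChars are hypothetical, chosen only for this example):

```java
import java.util.Arrays;
import java.util.regex.Pattern;

class QuotedClassSplit {
    static String[] splitByChars(String target, char[] delim) {
        // Pattern.quote wraps the characters in \Q...\E so that none of them
        // is read as regex syntax; the brackets turn them into a character class.
        String regex = "[" + Pattern.quote(new String(delim)) + "]";
        return target.split(regex);
    }

    public static void main(String[] args) {
        char[] delim = {'|', ',', ' ', 'Q'};
        System.out.println(Arrays.toString(splitByChars("hello,any|body here?", delim)));
    }
}
```

This keeps the delimiters as a char[] and needs no per-character checks, at the cost of relying on quoting being honoured inside a character class.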

Tested with possible problem characters.

Q is not a control character in a regex, so you do not have to put the \\\\ (a backslash) before it. The backslash only marks that the following character must be interpreted as a literal rather than as a control character; in front of a letter it does not have that meaning, because a backslash followed by a letter is either an escape construct of its own or an error.

Example

`\\.` in a regex means "a dot"

`.` in a regex means "any character"
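The difference between the two forms can be seen with String.split. This is a small sketch with a made-up class name (DotEscapeDemo); note that splitting on an unescaped dot matches every character, so only empty strings remain and split drops them all.

```java
import java.util.Arrays;

class DotEscapeDemo {
    // "\\." in Java source is the regex \. which matches a literal dot.
    static String[] splitOnLiteralDot(String s) {
        return s.split("\\.");
    }

    // "." matches any character, so every character acts as a delimiter.
    static String[] splitOnAnyChar(String s) {
        return s.split(".");
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(splitOnLiteralDot("a.b.c"))); // [a, b, c]
        System.out.println(splitOnAnyChar("a.b.c").length);              // 0
    }
}
```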

\\\\Q fails because Q is not a special character in a regex, so it does not need to be quoted.

I would make delim a String array and add the escapes only to the values that need them.

 delim = {"\\|", ..... "Q"};
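One way this could look in full is sketched below. The delimiter values and the names StringDelimSplit and joinFragments are assumptions for illustration; each array element is taken to be a ready-made regex fragment, and the fragments are joined with | into an alternation.

```java
import java.util.Arrays;

class StringDelimSplit {
    // Joins regex fragments that are already escaped where needed.
    static String joinFragments(String[] fragments) {
        return String.join("|", fragments);
    }

    public static void main(String[] args) {
        // Hypothetical delimiter list: only "|" needs an escape here.
        String[] delim = {"\\|", ",", " ", "Q"};
        String regex = joinFragments(delim);
        System.out.println(Arrays.toString("hello,any|body here?".split(regex)));
    }
}
```

The trade-off is that whoever fills the array has to know which characters are regex metacharacters; calling Pattern.quote on each element would remove that burden.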
