Java split with special characters

I have the code below, which splits a string on <div>\\$\\$PZ\\$\\$</div>, but the split does not work with the special characters in the delimiter.

public class HelloWorld {

    public static void main(String[] args) {
        String str = "test<div>\\$\\$PZ\\$\\$</div>test";
        String[] arrOfStr = str.split("<div>\\$\\$PZ\\$\\$</div>", 2);
        for (String a : arrOfStr)
            System.out.println(a);
    }
}

The output is test<div>\$\$PZ\$\$</div>test

It works when I remove the special characters.

Can you please help?

As you already know, the parameter to split(...) is a regular expression, so some characters have special meaning. If you want the parameter to be treated literally, i.e. not as a regex, call the Pattern.quote(String s) method.
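To see why the original call returns the string unsplit, it can help to check what the regex actually matches. In a regex, \$ means a literal $, so the delimiter from the question looks for <div>$$PZ$$</div> without any backslashes, and that text never occurs in the input. The following is only a small sketch; the class and method names (WhySplitFails, foundAsRegex, foundAsLiteral) are made up for illustration.

```java
import java.util.regex.Pattern;

class WhySplitFails {
    // The delimiter exactly as written in the question's source code.
    static final String DELIM = "<div>\\$\\$PZ\\$\\$</div>";

    // True if DELIM, interpreted as a regex, occurs somewhere in the text.
    static boolean foundAsRegex(String text) {
        return Pattern.compile(DELIM).matcher(text).find();
    }

    // True if DELIM, taken literally, occurs somewhere in the text.
    static boolean foundAsLiteral(String text) {
        return Pattern.compile(Pattern.quote(DELIM)).matcher(text).find();
    }

    public static void main(String[] args) {
        String withBackslashes = "test<div>\\$\\$PZ\\$\\$</div>test";
        String withoutBackslashes = "test<div>$$PZ$$</div>test";

        System.out.println(foundAsRegex(withBackslashes));    // false
        System.out.println(foundAsRegex(withoutBackslashes)); // true
        System.out.println(foundAsLiteral(withBackslashes));  // true
    }
}
```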

Example

String str = "test<div>\\$\\$PZ\\$\\$</div>test";
String[] arrOfStr = str.split(Pattern.quote("<div>\\$\\$PZ\\$\\$</div>"), 2);
for (String a : arrOfStr)
    System.out.println(a);

Output

test
test

The quote() method simply surrounds the literal text with the regex \Q...\E quotation pattern [1], e.g. your <div>\$\$PZ\$\$</div> text becomes:

\Q<div>\$\$PZ\$\$</div>\E

For fixed text you could just do that yourself, i.e. the following 3 versions all create equivalent regexes to split on:

str.split(Pattern.quote("<div>\\$\\$PZ\\$\\$</div>"), 2)

str.split("\\Q<div>\\$\\$PZ\\$\\$</div>\\E", 2)

str.split("<div>\\\\\\$\\\\\\$PZ\\\\\\$\\\\\\$</div>", 2)

To me, the 3rd one, using \ to escape, is the least readable/desirable version.
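If you want to convince yourself that the three versions behave the same, a quick check along these lines should do. This is just a sketch; the class name ThreeEquivalentSplits and its method names are invented for the example.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

class ThreeEquivalentSplits {
    static final String INPUT = "test<div>\\$\\$PZ\\$\\$</div>test";

    // Version 1: let Pattern.quote() build the quoted regex.
    static String[] viaQuote() {
        return INPUT.split(Pattern.quote("<div>\\$\\$PZ\\$\\$</div>"), 2);
    }

    // Version 2: write the \Q...\E wrapper by hand.
    static String[] viaQE() {
        return INPUT.split("\\Q<div>\\$\\$PZ\\$\\$</div>\\E", 2);
    }

    // Version 3: escape every special character with a backslash.
    static String[] viaBackslashes() {
        return INPUT.split("<div>\\\\\\$\\\\\\$PZ\\\\\\$\\\\\\$</div>", 2);
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(viaQuote()));       // [test, test]
        System.out.println(Arrays.toString(viaQE()));          // [test, test]
        System.out.println(Arrays.toString(viaBackslashes())); // [test, test]
    }
}
```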

If there are a lot of special characters to escape, using \Q...\E is easier than \-escaping all the special characters separately, but very few people use it, so it's fairly unknown to most.

The quote() method is especially useful when you need to treat dynamic text literally, e.g. when the text to split on is configurable by the user.
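For the dynamic case, a helper along these lines is one way to do it. This is a minimal sketch, assuming the delimiter arrives as a non-empty String from somewhere like a config file; LiteralSplitter and splitOnLiteral are hypothetical names, not part of the JDK.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

class LiteralSplitter {
    // Hypothetical helper: splits text on the delimiter, treating the
    // delimiter as plain text rather than as a regex.
    static String[] splitOnLiteral(String text, String delimiter) {
        return text.split(Pattern.quote(delimiter));
    }

    public static void main(String[] args) {
        // Pretend these delimiters were supplied by the user.
        System.out.println(Arrays.toString(splitOnLiteral("a$$b$$c", "$$"))); // [a, b, c]
        System.out.println(Arrays.toString(splitOnLiteral("a.b.c", ".")));    // [a, b, c]
        System.out.println(Arrays.toString(splitOnLiteral("x|y", "|")));      // [x, y]
    }
}
```

Without the quote() call, each of these delimiters would be read as regex syntax and the splits would go wrong.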

1) quote() will correctly handle literal text containing \E.
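The \E case can be checked with a delimiter that itself contains \E. With hand-written \Q...\E wrapping, the embedded \E would end the quoted region early, whereas quote() is documented to return a literal pattern for any input. A small sketch, with QuoteWithSlashE and splitOnLiteral as made-up names:

```java
import java.util.Arrays;
import java.util.regex.Pattern;

class QuoteWithSlashE {
    // Hypothetical helper: split on the delimiter taken literally.
    static String[] splitOnLiteral(String text, String delimiter) {
        return text.split(Pattern.quote(delimiter));
    }

    public static void main(String[] args) {
        // The delimiter is the four characters: x, backslash, E, y.
        String delimiter = "x\\Ey";
        String text = "1x\\Ey2";

        System.out.println(Arrays.toString(splitOnLiteral(text, delimiter))); // [1, 2]
    }
}
```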

This:

String str = "test<div>\\$\\$PZ\\$\\$</div>test";
String[] arrOfStr = str.split("<div>\\\\\\$\\\\\\$PZ\\\\\\$\\\\\\$</div>", 2);
for (String a : arrOfStr) {
    System.out.println(a);
}

prints:

test
test

EDIT: Why do we need all those backslashes? It's because of how we need to handle String literals that represent regular expressions. This page describes the reason with examples. The essence is this:

For a backslash \ ...

...the pattern to match that would be \\ ... (to escape the escape)

... but the string literal to create that pattern would have to have one backslash to escape each of the two backslashes: \\\\ .

Add to that the original need to also escape the $ (written \\$ in the string literal), and that gives us our 6 backslashes in the string representation.
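The layers are easier to see when you look at the lengths of the strings involved. Here is a short sketch for a single backslash-dollar pair (BackslashLayers is just an illustrative name):

```java
class BackslashLayers {
    // Java source: 6 backslashes and a $. After the compiler processes the
    // string escapes, the regex is the 4 characters: \ \ \ $
    static final String REGEX = "\\\\\\$";

    // The text to match: one backslash followed by a $ (2 characters).
    static final String TEXT = "\\$";

    public static void main(String[] args) {
        System.out.println(REGEX.length());      // 4
        System.out.println(TEXT.length());       // 2
        // Regex \\ matches the backslash, regex \$ matches the dollar sign.
        System.out.println(TEXT.matches(REGEX)); // true
    }
}
```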
