
How to split a string with special characters `}`, `/`, `-` and `{` in Java

I had been following the thread How to split a string in Java and it worked for me.

But in my current use case, the String I am dealing with contains special characters.

I have a String such as https://{domain name}/{type of data}/4583236-{name-of-perpetrators} and I want to extract 4583236 from it.

The Q&A How to split the string using '^' this special character in java? is more or less related to the question I mentioned above, but it doesn't help in my use case.

My program throws PatternSyntaxException: Illegal repetition, seemingly at random, on either of the special characters.

Code block:

    String current_url = "https://{domain name}/{type of data}/4583236-{name-of-perpetrators}";
    String[] urlParts = current_url.split("type of data}/");
    String mySuburl = urlParts[1];
    String[] suburl = mySuburl.split("-{name-of-perpetrators");
    String mytext = suburl[0];
    System.out.println(mytext);

Error stack trace:

Exception in thread "main" java.util.regex.PatternSyntaxException: Illegal repetition
{name-of-perpetrators
    at java.util.regex.Pattern.error(Unknown Source)
    at java.util.regex.Pattern.closure(Unknown Source)
    at java.util.regex.Pattern.sequence(Unknown Source)
    at java.util.regex.Pattern.expr(Unknown Source)
    at java.util.regex.Pattern.compile(Unknown Source)
    at java.util.regex.Pattern.<init>(Unknown Source)
    at java.util.regex.Pattern.compile(Unknown Source)
    at java.lang.String.split(Unknown Source)
    at java.lang.String.split(Unknown Source)
    at demo.TextSplit.main(TextSplit.java:18)

Try using Pattern.quote so you don't have to escape the characters one by one; it does that for you for free:

    String[] suburl = mySuburl.split(Pattern.quote("-{name-of-perpetrators"));

The argument to split is a regex, so you need to escape characters that have a special meaning in regex, such as {. Braces {} denote repetition in regex, hence the error Illegal repetition.

    String[] suburl = mySuburl.split("-\\{name-of-perpetrators");
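
To make the cause of the error concrete, here is a minimal sketch. The class name BraceEscapeDemo and the helper compiles are made up for illustration; only Pattern.compile, String.matches and String.split come from the JDK. It shows that { is parsed as the start of a repetition count, and that escaping it makes the split from the question work.

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

class BraceEscapeDemo {
    // Hypothetical helper: returns true if the string is a valid Java regex.
    static boolean compiles(String regex) {
        try {
            Pattern.compile(regex);
            return true;
        } catch (PatternSyntaxException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // "a{2}" is a valid repetition: exactly two 'a' characters.
        System.out.println("aa".matches("a{2}"));

        // An opening brace that does not start a repetition count is rejected.
        System.out.println(compiles("-{name-of-perpetrators"));

        // Escaping the brace turns it into a literal character.
        System.out.println(compiles("-\\{name-of-perpetrators"));

        // The second split from the question, with the brace escaped.
        String mySuburl = "4583236-{name-of-perpetrators}";
        String[] suburl = mySuburl.split("-\\{name-of-perpetrators");
        System.out.println(suburl[0]);
    }
}
```

The double backslash is needed because the Java string literal "\\{" produces the two characters \{, which is what the regex engine has to see.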

If you don't want the argument to split to be treated as a regex, use Pattern.quote to avoid manual escaping, as @YCF_L suggested.

    String[] suburl = mySuburl.split(Pattern.quote("-{name-of-perpetrators"));
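
For completeness, here is a rough end-to-end sketch of the Pattern.quote approach applied to both splits from the question. The class name QuoteSplitDemo and the method between are hypothetical, and the sketch assumes both markers are present in the input.

```java
import java.util.regex.Pattern;

class QuoteSplitDemo {
    // Hypothetical helper: returns the text between two literal markers.
    // Assumes both markers occur in the text; otherwise parts[1] is out of bounds.
    static String between(String text, String before, String after) {
        // Pattern.quote makes split treat each marker as a literal, not a regex.
        String[] parts = text.split(Pattern.quote(before));
        String[] sub = parts[1].split(Pattern.quote(after));
        return sub[0];
    }

    public static void main(String[] args) {
        String currentUrl = "https://{domain name}/{type of data}/4583236-{name-of-perpetrators}";
        System.out.println(between(currentUrl, "{type of data}/", "-{name-of-perpetrators}"));
    }
}
```

Because the markers are quoted, they can contain {, } or any other regex metacharacter without further escaping.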

There is no reason to use something as complex as regular expressions for something as simple as finding a literal string inside another string.

Using indexOf and substring is sufficient:

    String text = "https://{domain name}/{type of data}/4583236-{name-of-perpetrators}";
    String searchStart = "{type of data}/";
    String searchEnd = "-{name-of-perpetrators}";
    // Position just past the start marker.
    int start = text.indexOf(searchStart) + searchStart.length();
    // Position of the end marker, searching from start onwards.
    int end = text.indexOf(searchEnd, start);

    String expected = "4583236";
    // assertEquals is JUnit's assertion; any equality check works here.
    assertEquals(expected, text.substring(start, end));

Obviously, if the input text might not have exactly this format, this approach can fail: indexOf returns -1 when a marker is not found, which makes end negative (and leaves start pointing at the wrong position). If that can happen, you should check for it and handle it appropriately.
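
One way to add that check is sketched below. The class name SubstringBetween, the method between and the choice of returning an Optional are assumptions for illustration, not part of the answer above. Note that the raw indexOf result is tested before the marker length is added, since -1 plus the length would no longer be negative.

```java
import java.util.Optional;

class SubstringBetween {
    // Hypothetical helper: returns the text between two literal markers,
    // or Optional.empty() if either marker is missing.
    static Optional<String> between(String text, String startMarker, String endMarker) {
        int markerPos = text.indexOf(startMarker);
        if (markerPos < 0) {
            return Optional.empty(); // start marker not found
        }
        int start = markerPos + startMarker.length();
        int end = text.indexOf(endMarker, start);
        if (end < 0) {
            return Optional.empty(); // end marker not found after the start marker
        }
        return Optional.of(text.substring(start, end));
    }

    public static void main(String[] args) {
        String text = "https://{domain name}/{type of data}/4583236-{name-of-perpetrators}";
        System.out.println(between(text, "{type of data}/", "-{name-of-perpetrators}").orElse("not found"));
        System.out.println(between("https://example.invalid/", "{type of data}/", "-{name-of-perpetrators}").orElse("not found"));
    }
}
```

Returning an empty Optional is only one option; throwing an exception or returning a default value would work just as well, depending on how the caller wants to treat malformed input.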


 