
Escape ( in regular expression

I'm searching for the regular expression ".*(conflicted copy.*". I wrote the following code for this:

String str = "12B - (conflicted copy 2013-11-16-11-07-12)";
boolean matches = str.matches(".*(conflicted.*");
System.out.println(matches);

But I get this exception:

Exception in thread "main" java.util.regex.PatternSyntaxException: Unclosed group near index 15
.*(conflicted.*

I understand that the regex compiler treats ( as the beginning of a pattern group. I tried to escape ( by adding \( but that doesn't work.

Can someone tell me how to escape ( here?

Escaping is done with \ . In Java source code, \ is written as \\ (see note 1), so the escaped ( becomes \\( .

Side note: it's worth having a look at Pattern#quote, which returns a literal pattern String. In your case it's not that helpful, since you don't want to escape all special characters.


1. This is because a character preceded by a backslash ( \ ) is an escape sequence and has special meaning to the compiler.
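To make the two levels of escaping concrete, here is a minimal sketch. The class name EscapeLevels and its helper methods are made up for illustration and are not part of the original answer. It shows that the Java literal "\\(" compiles to just two characters, a backslash followed by a parenthesis, which is what the regex engine needs to see.

```java
class EscapeLevels {
    // The Java literal "\\(" is the two-character string \( at runtime.
    static String escapedParen() {
        return "\\(";
    }

    // Same check as in the question, but with the parenthesis escaped.
    static boolean containsConflicted(String s) {
        return s.matches(".*\\(conflicted.*");
    }

    public static void main(String[] args) {
        String str = "12B - (conflicted copy 2013-11-16-11-07-12)";
        System.out.println(escapedParen().length());  // 2
        System.out.println(escapedParen());           // \(
        System.out.println(containsConflicted(str));  // true
    }
}
```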

( is a regex metacharacter that means "start of group", and the group needs to be closed with ) . If you want the regex engine to treat it as a plain literal, you need to escape it. You can do that by adding \ before it, but since \ is also a special character in Java string literals (used, for example, to write characters like "\n" or "\t"), you need to escape the backslash as well, which looks like "\\". So try:

str.matches(".*\\(conflicted.*"); 

Another option is to use a character class to escape ( like this:

str.matches(".*[(]conflicted.*"); 

You can also use Pattern.quote() on the part that needs to be escaped:

str.matches(".*"+Pattern.quote("(")+"conflicted.*"); 

Or simply surround the part in which all characters should be treated as literals with "\\Q" and "\\E", which mark the start and end of a quotation:

str.matches(".*\\Q(\\Econflicted.*"); 
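For comparison, the four options above can be run side by side. This is only a sketch: the class name LiteralParenOptions and the helper failsToCompile are hypothetical names chosen for this example, while Pattern.quote, Pattern.compile and PatternSyntaxException are the standard java.util.regex API.

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

class LiteralParenOptions {
    // Four ways to match a literal "(" directly before "conflicted".
    static String[] patterns() {
        return new String[] {
            ".*\\(conflicted.*",                         // backslash escape
            ".*[(]conflicted.*",                         // character class
            ".*" + Pattern.quote("(") + "conflicted.*",  // Pattern.quote
            ".*\\Q(\\Econflicted.*"                      // \Q...\E quotation
        };
    }

    // Returns true if the regex cannot be compiled.
    static boolean failsToCompile(String regex) {
        try {
            Pattern.compile(regex);
            return false;
        } catch (PatternSyntaxException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        String str = "12B - (conflicted copy 2013-11-16-11-07-12)";
        for (String p : patterns()) {
            System.out.println(p + " -> " + str.matches(p));
        }
        // The pattern from the question, with the parenthesis left unescaped.
        System.out.println("unescaped fails: " + failsToCompile(".*(conflicted.*"));
    }
}
```

All four patterns should accept the string from the question, while the unescaped pattern is rejected at compile time with the "Unclosed group" error. Which variant to pick is mostly a matter of readability; Pattern.quote tends to pay off when the literal part comes from user input.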

In regular expressions, any non-alphanumeric character can be safely escaped by adding a backslash in front of it. (Before a letter or digit, a backslash may instead start a construct such as \d or a backreference.)

Keep in mind that in most languages, including C#, PHP and Java, the backslash is also the escape character of the language's own string literals, so it needs to be escaped itself in ordinary (non-verbatim) strings, which requires you to write "myText \\(".

Using a literal backslash inside a regular expression may require you to escape it at both the language level and the regex level ( "\\\\" ): this passes \\ to the regex engine, which parses it as a single literal \ .
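A short Java sketch of that double escaping follows. The class name BackslashLevels, its methods, and the sample path are invented for illustration only.

```java
class BackslashLevels {
    // Java literal "\\\\" -> runtime string \\ (2 chars) -> regex for one literal backslash.
    static final String ONE_BACKSLASH_REGEX = "\\\\";

    // True only if s consists of exactly one backslash character.
    static boolean isSingleBackslash(String s) {
        return s.matches(ONE_BACKSLASH_REGEX);
    }

    // Replaces every backslash in a (hypothetical) Windows-style path with "/".
    static String toForwardSlashes(String path) {
        return path.replaceAll(ONE_BACKSLASH_REGEX, "/");
    }

    public static void main(String[] args) {
        System.out.println(ONE_BACKSLASH_REGEX.length());        // 2
        System.out.println(isSingleBackslash("\\"));             // true
        System.out.println(toForwardSlashes("C:\\tmp\\a.txt"));  // C:/tmp/a.txt
    }
}
```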
