
Java Regex - escape all special regex characters

I have a lot of strings that contain special regex characters. An example:

- Test1 + Test2 -> the plus should be treated as a normal character, not as a regex special character

Is there a Java regex method that escapes all regex special characters?

The advice in the comments to escape the string manually is generally right, but you can't do that if your input string is "unknown", for instance a string the user can enter. So, assuming your string is a variable, you are most likely looking for Pattern.quote: https://docs.oracle.com/javase/7/docs/api/java/util/regex/Pattern.html#quote(java.lang.String)

String variableFromSomewhere="- Test1 + Test2";
String escapedString = Pattern.quote(variableFromSomewhere);

(This does nothing more than wrap the string in \\Q and \\E, but it takes less typing and avoids problems when \\Q or \\E is part of the input string itself.)
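A minimal sketch of how the quoted string might be used once it has been built. The class name QuoteDemo and the helper containsLiteral are made up for illustration; only Pattern.quote, Pattern.compile and Matcher.find come from the standard library.

```java
import java.util.regex.Pattern;

class QuoteDemo {
    // Hypothetical helper: true if `text` contains `needle` as a literal
    // substring, searching with a regex built from Pattern.quote(needle).
    static boolean containsLiteral(String text, String needle) {
        return Pattern.compile(Pattern.quote(needle)).matcher(text).find();
    }

    public static void main(String[] args) {
        String variableFromSomewhere = "- Test1 + Test2";

        // The quoted form is the input wrapped in \Q ... \E
        System.out.println(Pattern.quote(variableFromSomewhere));

        // The plus is matched as a plain character, not as a quantifier
        System.out.println(containsLiteral("x - Test1 + Test2 y", variableFromSomewhere));

        // An input that itself contains \E is still treated as a literal
        System.out.println(containsLiteral("a\\Eb+c", "\\Eb+"));
    }
}
```

Both containsLiteral calls should print true, because every character of the needle is matched literally.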

This is how to escape all regex metacharacters.

Background:

  1. If you need to convert some input or dynamic source into a regex, you must
    make sure that it is treated 100% as a literal.

  2. \\Q .. \\E is used when you have a mix of regex constructs and literals that themselves contain regex metacharacters.

    Example: (?:\\Q(?:dogs|cats)*\\E)+
    This will match one or more occurrences of the literal text (?:dogs|cats)*
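The example in point 2 can be checked with a short sketch like the one below. The class name MixedQuoteDemo and the helper isRepeatedLiteral are illustrative only; the pattern string is the one from the example, written as a Java string literal.

```java
import java.util.regex.Pattern;

class MixedQuoteDemo {
    // A quoted literal (\Q ... \E) inside a repeated non-capturing group
    static final Pattern REPEATED_LITERAL =
            Pattern.compile("(?:\\Q(?:dogs|cats)*\\E)+");

    // Hypothetical helper: true if the whole string is one or more copies
    // of the literal text (?:dogs|cats)*
    static boolean isRepeatedLiteral(String s) {
        return REPEATED_LITERAL.matcher(s).matches();
    }

    public static void main(String[] args) {
        System.out.println(isRepeatedLiteral("(?:dogs|cats)*"));               // one copy
        System.out.println(isRepeatedLiteral("(?:dogs|cats)*(?:dogs|cats)*")); // two copies
        System.out.println(isRepeatedLiteral("dogs"));                         // not the literal text
    }
}
```

The first two calls should print true and the last one false, since the quoted section is no longer interpreted as an alternation.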

There are other issues with \\Q .. \\E, such as nesting and interpreting the
final escaped regex. It becomes very difficult to debug.

So the easiest and safest thing is to just use String.replaceAll().

Java sample:

 String src = "he,<>!!llo \\ + * ? [ ] ( ) { } | . ^ $ wo-r@l#d";
 System.out.println( src );
 src = src.replaceAll("([\\\\+*?\\[\\](){}|.^$])", "\\\\$1");
 System.out.println( src );

Output:

he,<>!!llo \ + * ? [ ] ( ) { } | . ^ $ wo-r@l#d
he,<>!!llo \\ \+ \* \? \[ \] \( \) \{ \} \| \. \^ \$ wo-r@l#d
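As a rough sketch, the replaceAll call from the sample can be wrapped in a small helper and checked by using the escaped text as a pattern against the original text. The names EscapeDemo and escapeMeta are hypothetical; the regex and the replacement string are copied unchanged from the sample.

```java
import java.util.regex.Pattern;

class EscapeDemo {
    // Hypothetical helper: puts a backslash in front of each metacharacter
    // listed in the character class (same regex as the sample above).
    static String escapeMeta(String s) {
        return s.replaceAll("([\\\\+*?\\[\\](){}|.^$])", "\\\\$1");
    }

    public static void main(String[] args) {
        String src = "- Test1 + Test2 (a|b)* [x] {2} \\d ^$.";
        String escaped = escapeMeta(src);
        System.out.println(escaped);

        // Used as a pattern, the escaped text should match the original text
        System.out.println(Pattern.matches(escaped, src));

        // ...and the plus should no longer act as a quantifier
        System.out.println(Pattern.matches(escapeMeta("a+b"), "aab"));
    }
}
```

The second line printed should be true and the third false. This only covers the characters in that class; input meant for use inside a character class or under special flags may need more care.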

To escape a single special character, you can put \\ in front of it (a double backslash in a Java string literal):

boolean b = Pattern.matches("\\- .* \\+ .*",  "- Test + Test"); // true
