How to check if the string is a regular expression or not

I have a string. How can I check whether the string is a regular expression, contains a regular expression, or is a normal string?

The only reliable check you can do is whether the String is a syntactically correct regular expression:

boolean isRegex;
try {
  Pattern.compile(input);
  isRegex = true;
} catch (PatternSyntaxException e) {
  isRegex = false;
}

Note, however, that this will result in true even for strings like "Hello World" and "I'm not a regex", because technically they are valid regular expressions.

The only cases where this will return false are strings that are not valid regular expressions, such as "[unclosed character class", "(unclosed group" or "+".
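To make the point above concrete, here is a minimal sketch that runs the same compile check over the example strings mentioned in this answer. The class and method names (RegexSyntaxCheck, isValidRegex) are made up for illustration; only Pattern.compile and PatternSyntaxException come from java.util.regex.

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

// Hypothetical helper class, for illustration only.
final class RegexSyntaxCheck {

    /** Returns true if the input compiles as a java.util.regex pattern. */
    static boolean isValidRegex(String input) {
        try {
            Pattern.compile(input);
            return true;
        } catch (PatternSyntaxException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String[] samples = {
            "Hello World",               // plain text, but still a valid pattern
            "I'm not a regex",           // same: valid pattern made of literals
            "[unclosed character class", // invalid: '[' is never closed
            "(unclosed group",           // invalid: '(' is never closed
            "+"                          // invalid: quantifier with nothing to repeat
        };
        for (String s : samples) {
            System.out.println(s + " -> " + isValidRegex(s));
        }
    }
}
```

The first two samples pass the check even though nobody would call them regular expressions, which is exactly the limitation described above.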

This is ugly but will detect simple regular expressions (with the caveat that they must be written for Java, i.e. have the relevant backslash escaping).

public boolean isRegex(final String str) {
    try {
        java.util.regex.Pattern.compile(str);
        return true;
    } catch (java.util.regex.PatternSyntaxException e) {
        return false;
    }
}

There is no difference between a 'normal' string and a regular expression. A regular expression is just a normal string that is used as a pattern to match occurrences of that pattern in another string.

As others have pointed out, it is possible that the string is not a valid regular expression, but I think that is the only check you can do. If it is valid, there is no way to know whether it is a regular expression or just a normal string, because it will be a regular expression.

It is just a normal string which is interpreted in a specific way by the regex engine.

For example, "blah" is a regular expression which will only match the string "blah", wherever it occurs in another string.

Looked at this way, you can see that a regular expression does not need to contain any of the 'special characters' that do more advanced pattern matching; without them, it will only match the literal string in the pattern.
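As a small illustration of the "blah" example, the sketch below counts how often a pattern occurs in a text. The class and method names (LiteralPatternDemo, countMatches) and the sample texts are assumptions made for this example; the pattern "bl.h" is added only to contrast a literal pattern with one that uses a special character.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class, for illustration only.
final class LiteralPatternDemo {

    /** Counts non-overlapping occurrences of the pattern in the text. */
    static int countMatches(String regex, String text) {
        Matcher m = Pattern.compile(regex).matcher(text);
        int count = 0;
        while (m.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        // A pattern made only of literals matches exactly that text.
        System.out.println(countMatches("blah", "blah, blah and more blah"));
        // The same literal pattern does not match "bluh" ...
        System.out.println(countMatches("blah", "blah bluh"));
        // ... while '.' (any character) matches both words.
        System.out.println(countMatches("bl.h", "blah bluh"));
    }
}
```

Both "blah" and "bl.h" are ordinary Java strings; the only difference is how the regex engine interprets the characters in them.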

Maybe you could try to compile the string as a regular expression using the regexp package from Apache ( http://jakarta.apache.org/regexp/ ); if you get an exception, it is not a valid regexp, so you would treat it as a normal string.

boolean validRE = true;
try {
    RE re = new RE(stringToCheck);
} catch (RESyntaxException e) {
    validRE = false;
}

Obviously, in that case the user has typed an invalid regexp and you would be handling it as a normal string.
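The same fallback idea can be sketched with the standard java.util.regex package instead of the Apache regexp library used above (a deliberate swap, not what this answer uses). The class and method names (PatternOrLiteral, compileOrLiteral) are hypothetical; Pattern.LITERAL is the standard flag that makes every character of the input match literally.

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

// Hypothetical helper class, for illustration only.
final class PatternOrLiteral {

    /**
     * Compiles the input as a regular expression if it is syntactically
     * valid; otherwise falls back to a pattern that matches the input
     * as plain text.
     */
    static Pattern compileOrLiteral(String input) {
        try {
            return Pattern.compile(input);
        } catch (PatternSyntaxException e) {
            // Invalid regexp: treat it as a normal string.
            return Pattern.compile(input, Pattern.LITERAL);
        }
    }

    public static void main(String[] args) {
        // Valid regexp: "a+" matches one or more 'a'.
        System.out.println(compileOrLiteral("a+").matcher("aaa").matches());
        // Invalid regexp: "+" is searched for as a literal plus sign.
        System.out.println(compileOrLiteral("+").matcher("1+1").find());
    }
}
```

Whether silently falling back is the right behaviour depends on the application; reporting the syntax error to the user may be preferable.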

If anyone just wants to distinguish plain text strings from regular expressions:

static boolean hasSpecialRegexCharacters(String s){
    // Character class listing the regex metacharacters to look for.
    Pattern regexSpecialCharacters = Pattern
            .compile("[\\\\\\.\\[\\]\\{\\}\\(\\)\\<\\>\\*\\+\\-\\=\\!\\?"
                    + "\\^\\$\\|]");
    return regexSpecialCharacters.matcher(s).find();
}
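A quick sketch of how such a metacharacter scan behaves on a few inputs. The class name MetacharScan and the sample strings are assumptions; the character class covers the same characters as the method above, written with only the escapes that are needed. Note that ordinary punctuation such as '-' or '?' also triggers it, so plain text can be reported as a regular expression.

```java
import java.util.regex.Pattern;

// Hypothetical demo class, for illustration only.
final class MetacharScan {

    // Same character set as above: \ . [ ] { } ( ) < > * + - = ! ? ^ $ |
    private static final Pattern SPECIAL =
            Pattern.compile("[\\\\.\\[\\]{}()<>*+\\-=!?^$|]");

    /** Returns true if the string contains at least one listed character. */
    static boolean hasSpecialRegexCharacters(String s) {
        return SPECIAL.matcher(s).find();
    }

    public static void main(String[] args) {
        String[] samples = {"Hello World", "a.c", "\\d+", "well-known", "What?"};
        for (String s : samples) {
            System.out.println(s + " -> " + hasSpecialRegexCharacters(s));
        }
    }
}
```

This is a heuristic about how a string looks, not about whether it is a valid pattern, so it answers a different question than the compile check.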
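/**
 * Heuristic: a pattern that contains metacharacters usually does not match
 * itself, so matches returns false for it. This is not reliable: ".*"
 * matches itself, and an invalid pattern throws PatternSyntaxException.
 */
public boolean isRegex(final String str) {
    return str != null && !str.matches(str);
}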
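The self-match idea can be probed with a few inputs. The sketch below is an assumption-laden variant, not the method above: the names (SelfMatchHeuristic, looksLikeRegex) are hypothetical, and it additionally catches PatternSyntaxException, returning false for strings that are not valid patterns at all.

```java
import java.util.regex.PatternSyntaxException;

// Hypothetical demo class, for illustration only.
final class SelfMatchHeuristic {

    /**
     * Returns true if the string, used as a pattern, does not match itself.
     * Returns false for null and for syntactically invalid patterns
     * (an assumption of this sketch).
     */
    static boolean looksLikeRegex(String str) {
        if (str == null) {
            return false;
        }
        try {
            return !str.matches(str);
        } catch (PatternSyntaxException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String[] samples = {"blah", "a+", "\\d", ".*", "[abc"};
        for (String s : samples) {
            System.out.println(s + " -> " + looksLikeRegex(s));
        }
    }
}
```

The ".*" sample shows the weakness: it is clearly meant as a regular expression, but because it matches any string, including itself, the heuristic classifies it as plain text.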
