
How to find if a Java String contains X or Y and contains Z

I'm pretty sure regular expressions are the way to go, but my head hurts whenever I try to work out the specific regular expression.

What regular expression do I need to find if a Java String (contains the text "ERROR" or the text "WARNING") AND (contains the text "parsing"), where all matches are case-insensitive?

EDIT: I've presented a specific case, but my problem is more general. There may be other clauses, but they all involve matching a specific word, ignoring case. There may be 1, 2, 3 or more clauses.

If you're not 100% comfortable with regular expressions, don't try to use them for something like this. Just do this instead:

String s = test_string.toLowerCase();
if (s.contains("parsing") && (s.contains("error") || s.contains("warning"))) {
    ....

because when you come back to your code in six months' time, you'll understand it at a glance.

Edit: Here's a regular expression to do it:

(?i)(?=.*parsing)(.*(error|warning).*)

but it's rather inefficient. For cases where you have an OR condition, a hybrid approach where you search for several simple regular expressions and combine the results programmatically with Java is usually best, both in terms of readability and efficiency.
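
A minimal sketch of that hybrid approach, assuming Java 8 or later. The class and method names (`HybridMatch`, `isParsingProblem`) are made up for illustration. Each clause is a simple precompiled pattern searched with `find()`, and the AND is done in plain Java:

```java
import java.util.regex.Pattern;

final class HybridMatch {
    // One simple pattern per clause; the OR lives inside the first pattern.
    private static final Pattern SEVERITY =
            Pattern.compile("error|warning", Pattern.CASE_INSENSITIVE);
    private static final Pattern PARSING =
            Pattern.compile("parsing", Pattern.CASE_INSENSITIVE);

    // find() searches anywhere in the input, so no leading or trailing .* is needed.
    static boolean isParsingProblem(String s) {
        return SEVERITY.matcher(s).find() && PARSING.matcher(s).find();
    }

    public static void main(String[] args) {
        System.out.println(isParsingProblem("WARNING at line X doing parsing of Y"));   // true
        System.out.println(isParsingProblem("A problem at line X doing parsing of Y")); // false
    }
}
```

Adding another clause means adding one more pattern and one more `&&`, which keeps each regular expression trivial to read.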

If you really want to use regular expressions, you can use the positive lookahead operator:

(?i)(?=.*?(?:ERROR|WARNING))(?=.*?parsing).*

Examples:

Pattern p = Pattern.compile("(?=.*?(?:ERROR|WARNING))(?=.*?parsing).*", Pattern.CASE_INSENSITIVE); // you can also use (?i) at the beginning
System.out.println(p.matcher("WARNING at line X doing parsing of Y").matches()); // true
System.out.println(p.matcher("An error at line X doing parsing of Y").matches()); // true
System.out.println(p.matcher("ERROR Hello parsing world").matches()); // true       
System.out.println(p.matcher("A problem at line X doing parsing of Y").matches()); // false
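
For the more general case in the question (any number of clauses), the same lookahead pattern could be assembled from a list of word groups. This is only a sketch: `LookaheadBuilder` and `build` are hypothetical names, and it assumes each inner list is an OR group while the outer list ANDs the groups together. `Pattern.quote` makes each word match literally.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

final class LookaheadBuilder {
    // Emits one (?=.*?(?:a|b)) lookahead per group, followed by .* as in the pattern above.
    static Pattern build(List<List<String>> clauses) {
        StringBuilder sb = new StringBuilder();
        for (List<String> alternatives : clauses) {
            String alternation = alternatives.stream()
                    .map(Pattern::quote) // treat each word as a literal, not as regex syntax
                    .collect(Collectors.joining("|"));
            sb.append("(?=.*?(?:").append(alternation).append("))");
        }
        sb.append(".*");
        return Pattern.compile(sb.toString(), Pattern.CASE_INSENSITIVE);
    }

    public static void main(String[] args) {
        Pattern p = build(Arrays.asList(
                Arrays.asList("ERROR", "WARNING"),
                Arrays.asList("parsing")));
        System.out.println(p.matcher("ERROR Hello parsing world").matches());              // true
        System.out.println(p.matcher("A problem at line X doing parsing of Y").matches()); // false
    }
}
```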

With multiple .* constructs, the regex engine can make a large number of "back off and retry" (backtracking) trial matches.

Never use .* at the beginning or in the middle of a RegEx pattern.

Try:

 if ((str.indexOf("WARNING") > -1 || str.indexOf("ERROR") > -1) && str.indexOf("parsing") > -1)

Regular Expressions are not needed here. Try this:

if((string1.toUpperCase().indexOf("ERROR",0) >= 0 ||  
  string1.toUpperCase().indexOf("WARNING",0) >= 0 ) &&
  string1.toUpperCase().indexOf("PARSING",0) >= 0 )

This also takes care of the case-insensitive requirement.
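
The snippet above upper-cases the same string three times, and `toUpperCase()` without an explicit `Locale` depends on the default locale. One possible alternative, sketched here with hypothetical helper names (`IgnoreCase`, `containsIgnoreCase`), uses `String.regionMatches` with its `ignoreCase` flag so that no converted copy of the string is needed:

```java
final class IgnoreCase {
    // Case-insensitive substring test. regionMatches compares in place,
    // so no upper-cased copy of the input is created.
    static boolean containsIgnoreCase(String text, String word) {
        int last = text.length() - word.length();
        for (int i = 0; i <= last; i++) {
            if (text.regionMatches(true, i, word, 0, word.length())) {
                return true;
            }
        }
        return false;
    }

    // Same condition as above: (ERROR or WARNING) and PARSING.
    static boolean isParsingProblem(String s) {
        return (containsIgnoreCase(s, "ERROR") || containsIgnoreCase(s, "WARNING"))
                && containsIgnoreCase(s, "PARSING");
    }

    public static void main(String[] args) {
        System.out.println(isParsingProblem("An error at line X doing parsing of Y")); // true
        System.out.println(isParsingProblem("Parsing finished without issues"));       // false
    }
}
```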

I usually use this applet to experiment with regular expressions. The expression might look like this:

if (str.matches("(?i)^.*?(WARNING|ERROR).*?parsing.*$")) {
...

Note that this pattern only matches when WARNING or ERROR appears before parsing. As the answers above say, though, it's better not to use a regular expression here.

I think this regular expression would do the job (but there must be a better way):

(.*(ERROR|WARNING).*parsing)|(.*parsing.*(ERROR|WARNING))

If you have a variable number of words that you want to match, I would do something like this:

String mystring = "Text I want to match";
String[] matchings = {"warning", "error", "parse" /* , ... */};
int matches = 0;
// Count how many of the words occur in the string.
for (int i = 0; i < matchings.length; i++) {   // length is a field on arrays, not a method
  if (mystring.contains(matchings[i])) {
    matches++;
  }
}

if (matches == matchings.length) {
   System.out.println("All Matches found");
} else {
   System.out.println("Some word is not matching :(");
}

Note: I haven't compiled this code, so it could contain typos.
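
The loop above requires every word to be present, whereas the question combines OR and AND. A sketch of one way to generalise it, assuming Java 8 streams and hypothetical names (`ClauseMatcher`, `matchesAll`): the words are arranged as groups, where at least one word from each group must occur. The words are assumed to be supplied in lower case.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

final class ClauseMatcher {
    // True when every group has at least one word present in the text.
    // The text is lower-cased once; the words are assumed to be lower case already.
    static boolean matchesAll(String text, List<List<String>> groups) {
        String lower = text.toLowerCase(Locale.ROOT);
        return groups.stream()
                .allMatch(group -> group.stream().anyMatch(lower::contains));
    }

    public static void main(String[] args) {
        List<List<String>> groups = Arrays.asList(
                Arrays.asList("error", "warning"), // clause 1: error OR warning
                Arrays.asList("parsing"));         // clause 2: parsing
        System.out.println(matchesAll("An error at line X doing parsing of Y", groups)); // true
        System.out.println(matchesAll("ERROR with no other keyword", groups));           // false
    }
}
```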

The technical posts on this site are licensed under CC BY-SA 4.0. If you reprint this post, please cite the site URL or the original address.
