Java regex: empty string is matched

I wrote a regex in Java to match sentences that contain a given string:

String regex = "((^|([.!?:] ))" + "[^!.?:]*?" + queryStr + ".*?" + "([!.?])|$)+?";

Then I use the regex to match against my string:

Pattern pattern = Pattern.compile(regex);
String content = "Hello World!!!";
Matcher match = pattern.matcher(content);
int index = 0;
while(match.find(index))
{
   index = match.end() -1;
   System.out.println(match.group());
}

But the loop never ends. I suspect this is because the regex matches the empty string, which confuses me, since the regex clearly includes the string queryStr. Can anyone help me solve this?

With queryStr set to Hello, your regex pattern looks like

((^|([.!?:] ))[^!.?:]*?Hello.*?([!.?])|$)+?

It contains 2 alternatives:

  1. (^|([.!?:] ))[^!.?:]*?Hello.*?([!.?])
  2. $

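So the problem is that, once the first alternative no longer matches, the second alternative ($) matches the empty string at the end of the input. Because the loop restarts the search at match.end() - 1, find keeps returning that same empty match and the loop never terminates.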
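To make the empty match visible, here is a minimal sketch that runs the loop from the question but stops after a fixed number of iterations. The class name EmptyMatchDemo, the method firstMatches, and the maxMatches cap are hypothetical names introduced only for this illustration, and queryStr is assumed to be Hello.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class EmptyMatchDemo {
    // Runs the loop from the question, but gives up after maxMatches
    // iterations (a hypothetical cap) so the demonstration terminates.
    static List<String> firstMatches(String queryStr, String content, int maxMatches) {
        String regex = "((^|([.!?:] ))" + "[^!.?:]*?" + queryStr + ".*?" + "([!.?])|$)+?";
        Matcher match = Pattern.compile(regex).matcher(content);
        List<String> found = new ArrayList<>();
        int index = 0;
        while (found.size() < maxMatches && match.find(index)) {
            // Record start, end and matched text of each hit.
            found.add(match.start() + "-" + match.end() + ":" + match.group());
            index = match.end() - 1;
        }
        return found;
    }

    public static void main(String[] args) {
        for (String m : firstMatches("Hello", "Hello World!!!", 3)) {
            System.out.println(m);
        }
        // prints:
        // 0-12:Hello World!
        // 14-14:
        // 14-14:
    }
}
```

After the first real match, every further call to find returns the zero-length match at offset 14, so index stays at 13 and the loop cannot make progress.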

Make this change:

String regex = "(^|[.!?:] )" + "[^!.?:]*?" + queryStr + ".*?" + "([!.?]+?|$)";

Now, it will look like

(^|[.!?:] )[^!.?:]*?Hello.*?([!.?]+?|$)

Now $ is an alternative to [!.?]+? only, so every match must contain the query string and cannot be empty.
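As a rough check, the corrected pattern can be dropped into the same loop. This is a sketch under a few assumptions: the class name FixedRegexDemo and the method sentencesContaining are hypothetical, and queryStr is assumed to be a non-empty literal with no regex metacharacters (otherwise wrapping it in Pattern.quote would be one option).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class FixedRegexDemo {
    // Collects every match of the corrected pattern.
    // Assumes queryStr is non-empty and contains no regex metacharacters.
    static List<String> sentencesContaining(String queryStr, String content) {
        String regex = "(^|[.!?:] )" + "[^!.?:]*?" + queryStr + ".*?" + "([!.?]+?|$)";
        Matcher match = Pattern.compile(regex).matcher(content);
        List<String> found = new ArrayList<>();
        int index = 0;
        while (match.find(index)) {
            found.add(match.group());
            // Step back one character, as in the question, so the closing
            // punctuation can start the next match when a space follows it.
            index = match.end() - 1;
        }
        return found;
    }

    public static void main(String[] args) {
        System.out.println(sentencesContaining("Hello", "Hello World!!!"));
        // prints: [Hello World!]
    }
}
```

The loop now ends because no later position can produce a match without containing the query string.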

See the demo on regex101.com.

Your regex can match without consuming any characters, because the $ alternative makes everything else optional.

To prevent matching blank input, add this to the front of your regex:

(?!$)

This is a negative lookahead that asserts the current position is not immediately followed by the end of input (i.e. "something" follows).
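A minimal sketch of this approach, keeping the regex from the question unchanged and only prefixing the lookahead. The class name LookaheadDemo and the method matchesWithGuard are hypothetical, and queryStr is again assumed to be a plain literal such as Hello.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LookaheadDemo {
    // Prefixes the question's regex with (?!$) so that a match can no
    // longer begin at the end of the input.
    static List<String> matchesWithGuard(String queryStr, String content) {
        String original = "((^|([.!?:] ))" + "[^!.?:]*?" + queryStr + ".*?" + "([!.?])|$)+?";
        Matcher match = Pattern.compile("(?!$)" + original).matcher(content);
        List<String> found = new ArrayList<>();
        int index = 0;
        while (match.find(index)) {
            found.add(match.group());
            index = match.end() - 1;
        }
        return found;
    }

    public static void main(String[] args) {
        System.out.println(matchesWithGuard("Hello", "Hello World!!!"));
        // prints: [Hello World!]
        System.out.println(matchesWithGuard("Hello", ""));
        // prints: []
    }
}
```

With the guard in place, the $ alternative can never succeed as the first step of a match, so only the branch containing the query string can match.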
