
Java non-greedy (?) regex to match string

String poolId = "something/something-else/pools[name='test'][scope='lan1']";
String statId = "something/something-else/pools[name='test'][scope='lan1']/stats[base-string='10.10.10.10']";

Pattern pattern = Pattern.compile(".+pools\\[name='.+'\\]\\[scope='.+'\\]$");

What regular expression should be used such that

pattern.matcher(poolId).matches()

returns true whereas

pattern.matcher(statId).matches()

returns false ?

Note that

  1. something/something-else is irrelevant and can be of any length
  2. Both name and scope can have ANY character including any of \\, /, [, ] etc
  3. stats[base-string='10.10.10.10'] is an example and there can be anything else after /

I tried the non-greedy ? like so: .+pools\\[name='.+'\\]\\[scope='.+?'\\]$ but both matches still return true.

You can use

.+pools\[name='[^']*'\]\[scope='[^']*'\]$

See the regex demo. Details:

  • .+ - one or more chars other than line break chars, as many as possible
  • pools\[name=' - a literal pools[name=' string
  • [^']* - zero or more chars other than a '
  • '\]\[scope=' - a literal '][scope=' string
  • [^']* - zero or more chars other than a '
  • '\] - a literal '] substring
  • $ - end of string.

In Java:

Pattern pattern = Pattern.compile(".+pools\\[name='[^']*']\\[scope='[^']*']$");

See the Java demo:

//String s = "something/something-else/pools[name='test'][scope='lan1']"; // => Matched!
String s = "something/something-else/pools[name='test'][scope='lan1']/stats[base-string='10.10.10.10']";
Pattern pattern = Pattern.compile(".+pools\\[name='[^']*']\\[scope='[^']*']$");
Matcher matcher = pattern.matcher(s);
if (matcher.find()){
    System.out.println("Matched!"); 
} else {
    System.out.println("Not Matched!"); 
}
// => Not Matched!
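
To see why the lazy quantifier from the question does not help, it may be useful to run both patterns side by side with matches(), as the question does. A lazy .+? only prefers the shortest match; it still expands as far as needed for the overall pattern to succeed, so it can run past the first '] up to the last one. The negated class [^']* cannot cross a quote at all. The following is a minimal sketch; the class name PoolPatternCheck and the constant names are made up for illustration.

```java
import java.util.regex.Pattern;

class PoolPatternCheck {
    static final String POOL_ID =
        "something/something-else/pools[name='test'][scope='lan1']";
    static final String STAT_ID =
        "something/something-else/pools[name='test'][scope='lan1']/stats[base-string='10.10.10.10']";

    // Lazy attempt from the question: .+? can still grow past the first "']"
    // until the trailing "']$" matches at the very end of the string.
    static final Pattern LAZY =
        Pattern.compile(".+pools\\[name='.+'\\]\\[scope='.+?'\\]$");

    // Pattern from the answer: [^']* can never consume a single quote,
    // so the scope value has to end at the first closing quote.
    static final Pattern NEGATED =
        Pattern.compile(".+pools\\[name='[^']*']\\[scope='[^']*']$");

    public static void main(String[] args) {
        System.out.println(LAZY.matcher(POOL_ID).matches());    // true
        System.out.println(LAZY.matcher(STAT_ID).matches());    // true (the problem)
        System.out.println(NEGATED.matcher(POOL_ID).matches()); // true
        System.out.println(NEGATED.matcher(STAT_ID).matches()); // false
    }
}
```

With matches() the whole input must match, so the leading .+ and trailing $ are not strictly needed there; they matter when find() is used, as in the demo above.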

Wiktor assumed that your values for name and scope cannot contain single quotes. Thus the following:

.../pools[name='tes't']

would not match. This is really the only valid assumption to make: if unescaped single quotes were allowed, nothing would stop the value of scope from being (for example) the literal value lan1']/stats[base-string='10.10.10.10, in which case the stats string would itself be a valid pool ID. The regex in your question has exactly this issue. If you must allow single quotes in these values, you need to escape them somehow, for example with a backslash. Try the following (an edit of Wiktor's regex):

.+pools\[name='([^']|\\')*'\]\[scope='([^']|\\')*'\]$ 
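
In a Java string literal each regex backslash has to be doubled, so \\' in the regex above becomes \\\\' in source code. The sketch below checks that form against an escaped quote, an unescaped quote, and the stats string. It also shows a variant, which is an assumption of this example and not part of the answer, that excludes the backslash from the character class so each character can be matched in only one way. The class name EscapedQuoteCheck and the sample strings are hypothetical.

```java
import java.util.regex.Pattern;

class EscapedQuoteCheck {
    // The edited regex as a Java string: a value is any run of non-quote
    // characters or backslash-escaped quotes.
    static final Pattern ESCAPED = Pattern.compile(
        ".+pools\\[name='([^']|\\\\')*'\\]\\[scope='([^']|\\\\')*'\\]$");

    // Assumed variant: a backslash always consumes the character after it,
    // so the regex does not rely on backtracking to find an escaped quote.
    static final Pattern UNROLLED = Pattern.compile(
        ".+pools\\[name='(?:[^'\\\\]|\\\\.)*'\\]\\[scope='(?:[^'\\\\]|\\\\.)*'\\]$");

    // Sample inputs; "\\'" in source is the two characters \' at runtime.
    static final String ESCAPED_NAME = "a/pools[name='tes\\'t'][scope='lan1']";
    static final String RAW_QUOTE = "a/pools[name='tes't'][scope='lan1']";
    static final String STAT_ID =
        "a/pools[name='test'][scope='lan1']/stats[base-string='10.10.10.10']";

    public static void main(String[] args) {
        for (Pattern p : new Pattern[] {ESCAPED, UNROLLED}) {
            System.out.println(p.matcher(ESCAPED_NAME).matches()); // true
            System.out.println(p.matcher(RAW_QUOTE).matches());    // false
            System.out.println(p.matcher(STAT_ID).matches());      // false
        }
    }
}
```

Either form only works if whatever produces the IDs escapes quotes the same way; the backslash convention here is an assumption, not something the question specifies.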


 