
regular expressions: quantifying a non-capturing group

See here for some background on what I'm trying to do. In short, I want to match any path under /path/foo/, unless the leaf directory (not the leaf file) is script.

There are some answers in that question that seem to work, but I'm trying to figure out why a certain solution I attempted did NOT work. The regex is this:

^/path/foo(?:/[^/]+)*(?!/script)/[^/]*$

My admittedly limited understanding of this is the following:

  1. the literal string /path/foo
  2. any number of occurrences of the submatch /[^/]+. Basically, 0 or more repetitions of / followed by some directory name (I'm aware of the issues with spaces or special characters in file paths; I'm ignoring that for now)
  3. NOT the literal string /script. So if, after however many repeated folders from (2), the next thing is /script, it fails, assuming it is then followed by...
  4. a literal /
  5. 0 or more non-/ characters (the file name), followed by the end of the string.

However, this doesn't work. It seems to match everything that starts with /path/foo.

What's wrong with this regex?

Consider input:

/path/foo/a/b/script/file

The regex matches it as follows:

^                 Ok: No text before here
/path/foo         "/path/foo"
(?:/[^/]+)*       "/a/b/script"
(?!/script)       Ok: Text after here is "/file"
/                 "/"
[^/]*             "file"
$                 Ok: No text after here
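
To check this trace yourself, here is a small sketch in Python (my choice of language; the question does not name a regex engine). The only change to your regex is an extra capturing group around the repeated part, added purely so the consumed text can be inspected. The helper name consumed_by_group is made up for this example.

```python
import re

# The regex from the question. The outer capturing group is an addition
# for inspection only; it does not change what the pattern matches.
LOOKAHEAD = re.compile(r"^/path/foo((?:/[^/]+)*)(?!/script)/[^/]*$")


def consumed_by_group(path):
    """Return the text consumed by the repeated group, or None if no match."""
    m = LOOKAHEAD.match(path)
    return m.group(1) if m else None


# The repeated group swallows "/script" itself, so the lookahead is only
# ever tested in front of the final "/file" segment.
print(consumed_by_group("/path/foo/a/b/script/file"))
print(consumed_by_group("/path/foo/script/file"))
```

In both cases the path matches, and the captured text ends in /script, which is why the lookahead never gets a chance to reject it.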

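The lookahead is tested only after the repeated group has already consumed /script, so all it ever sees is the final /file segment. What you wanted is a negative lookbehind, which checks the text just before that position, not a negative lookahead: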

^                 Ok: No text before here
/path/foo         "/path/foo"
(?:/[^/]+)*       "/a/b/script"
(?<!/script)      Fail: Text before here is "/script"
/                 "/"
[^/]*             "file"
$                 Ok: No text after here
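
A quick way to confirm the lookbehind version, again sketched in Python as an assumption about your environment. Python's re module requires lookbehinds to be fixed-width, which /script is; other engines have their own lookbehind restrictions, so check yours. The helper name is_allowed and the sample paths are invented for illustration.

```python
import re

# Same regex as in the question, with the lookahead swapped for a
# fixed-width negative lookbehind.
LOOKBEHIND = re.compile(r"^/path/foo(?:/[^/]+)*(?<!/script)/[^/]*$")


def is_allowed(path):
    """True if the path is under /path/foo and its leaf directory is not 'script'."""
    return LOOKBEHIND.match(path) is not None


# Hypothetical sample paths, not taken from the question.
for path in [
    "/path/foo/a/b/file",         # ordinary leaf directory
    "/path/foo/a/b/script/file",  # leaf directory is "script"
    "/path/foo/script/a/file",    # "script" is not the leaf directory
    "/path/foo/notscript/file",   # merely ends in "script"
    "/path/foo/script",           # leaf *file* is named "script"
]:
    print(path, is_allowed(path))
```

Only the second path is rejected. The final / in the string must be matched by the literal / in the pattern, so the lookbehind is always evaluated right after the leaf directory, and backtracking cannot move it anywhere else.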

