
Java regex issue (negative lookahead & lookbehind)

I need your help, guys. This is a tricky Java regex issue, and I have been searching for a solution for a couple of hours. Here it is:

In the following text, I want to match the "boat" word...

  1. and include "bunch of " if it is placed just before it.
  2. and include " propeller" if it is placed just after it.
  3. or don't match if preceded by "for a ", even with "bunch of " in between.
  4. or don't match if followed by " trailer", even with " propeller" in between.

I have a boat to sell. It comes with extra boat propellers but does not come with a boat trailer (the boat is pretty big so you might need a boat propeller trailer too). I used to have a bunch of boats but my passion for a boat faded with time. I did not think people would have interest for a bunch of boats but this is my last one, so Yeéé: :)

The following parts should match:

  • boat ("boat")
  • bunch of boats ("boat" preceded by "bunch of ")
  • boat propeller ("boat" followed by " propeller")

The following parts should NOT match (not even partially):

  • for a boat ("boat" preceded by "for a ")
  • boat trailer ("boat" followed by " trailer")
  • for a bunch of boats ("boat" preceded by "bunch of ", which is preceded by "for a ")
  • boat propeller trailer ("boat" followed by " propeller", which is followed by " trailer")

I have this example set up on regex101 ( https://regex101.com/r/o6S4SP/22 ), but it's not working properly :-(

PS: I'm using Regex101 for the example but "(SKIP)(FAIL)" is not supported in Java's regex syntax.

I hope someone can help :-)

You may use the following regex in Java that features a constrained-width lookbehind pattern (supporting limiting quantifiers):

(?<!\bfor\sa\s(?:bunch\sof\s){0,1})(?:\bbunch\s+of\s+)?\bboats?\b(?:\s+propellers?)?+(?!\s+trailers?\b)

See the Java regex demo online (proof).

In Java,

s = s.replaceAll("(?<!\\bfor\\sa\\s(?:bunch\\sof\\s){0,1})(?:\\bbunch\\s+of\\s+)?\\bboats?\\b(?:\\s+propellers?)?+(?!\\s+trailers?\\b)", "<b>$0</b>");
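To check the pattern against the question's expectations, the matches can also be listed one by one. The following is only a minimal sketch: the class name BoatRegexDemo and the helper findMatches are made up for illustration, and the sample text is the one from the question with its trailing "so Ye..." part dropped.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class BoatRegexDemo {
    // The pattern from the answer, split over three lines for readability only.
    static final Pattern BOAT = Pattern.compile(
        "(?<!\\bfor\\sa\\s(?:bunch\\sof\\s){0,1})"
        + "(?:\\bbunch\\s+of\\s+)?\\bboats?\\b"
        + "(?:\\s+propellers?)?+(?!\\s+trailers?\\b)");

    // Sample text from the question (trailing part dropped, see above).
    static final String SAMPLE =
        "I have a boat to sell. It comes with extra boat propellers but does not "
        + "come with a boat trailer (the boat is pretty big so you might need a "
        + "boat propeller trailer too). I used to have a bunch of boats but my "
        + "passion for a boat faded with time. I did not think people would have "
        + "interest for a bunch of boats but this is my last one.";

    // Hypothetical helper: collects every match of BOAT in the given text.
    static List<String> findMatches(String text) {
        List<String> matches = new ArrayList<>();
        Matcher m = BOAT.matcher(text);
        while (m.find()) {
            matches.add(m.group());
        }
        return matches;
    }

    public static void main(String[] args) {
        // Prints: boat, boat propellers, boat, bunch of boats (one per line)
        for (String match : findMatches(SAMPLE)) {
            System.out.println(match);
        }
    }
}
```

The four "boat" occurrences that are preceded by "for a " or followed by " trailer" (directly or with "bunch of " / " propeller" in between) are skipped entirely.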

Regex details

  • (?<!\bfor\sa\s(?:bunch\sof\s){0,1}) - a negative lookbehind that fails the match if, immediately to the left of the current location, there is
    • \bfor\sa\s - for, whitespace, a, whitespace
    • (?:bunch\sof\s){0,1} - 0 or 1 occurrences (i.e. an optional occurrence) of bunch, whitespace, of, whitespace
  • (?:\bbunch\s+of\s+)? - an optional occurrence of bunch, 1+ whitespaces, of, 1+ whitespaces
  • \bboats?\b - a whole word boat or boats
  • (?:\s+propellers?)?+ - an optional occurrence of 1+ whitespaces followed by propeller or propellers. NOTE: the ?+ possessive quantifier is key here. It prevents backtracking into this optional group, so the next lookahead is only checked after the group has been tried, and the engine cannot give up " propeller" to get around a following " trailer".
  • (?!\s+trailers?\b) - a negative lookahead that fails the match if, immediately to the right of the current location, there is 1+ whitespaces, and then trailer or trailers as a whole word.
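The effect of the possessive ?+ can be seen by comparing it with a plain greedy ? on the "boat propeller trailer" case. This is a sketch under assumptions: both patterns are simplified tails of the pattern above (the lookbehind and the "bunch of" part are left out), and the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class PossessiveVsGreedyDemo {
    // Simplified tails of the answer's pattern; only the quantifier
    // on the optional "propeller" group differs (?+ versus ?).
    static final Pattern POSSESSIVE = Pattern.compile(
        "\\bboats?\\b(?:\\s+propellers?)?+(?!\\s+trailers?\\b)");
    static final Pattern GREEDY = Pattern.compile(
        "\\bboats?\\b(?:\\s+propellers?)?(?!\\s+trailers?\\b)");

    // Hypothetical helper: collects every match of the pattern in the text.
    static List<String> findAll(Pattern pattern, String text) {
        List<String> matches = new ArrayList<>();
        Matcher m = pattern.matcher(text);
        while (m.find()) {
            matches.add(m.group());
        }
        return matches;
    }

    public static void main(String[] args) {
        String text = "you might need a boat propeller trailer too";
        // Possessive: " propeller" is kept, the lookahead sees " trailer", no match.
        System.out.println(findAll(POSSESSIVE, text)); // prints []
        // Greedy: the engine backtracks, drops " propeller", and matches "boat".
        System.out.println(findAll(GREEDY, text));     // prints [boat]
    }
}
```

With the greedy version, the unwanted partial match "boat" comes back, which is exactly what the question rules out ("not even partially").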
