
Regular Expressions - Match Character Not Between Two Strings

I've read many questions that ask about finding a regular expression to match characters between two strings, but my problem is the inverse. I'm attempting to create an expression that will match characters NOT between two strings.

Consider the following string.

This is short & [tag]fun & interesting[/tag].

I want to replace any ampersand character that is NOT inside the tag elements with the symbol @. The result should be as shown below.

This is short @ [tag]fun & interesting[/tag].

I tried the following regular expression, but unfortunately, it matches the ampersand inside the tag elements.

/(?<!\[tag\])&(?!\[\/tag\])/g

I understand that it matches that ampersand because it's surrounded by other characters on either side in the string. But I can't add an arbitrary number of characters to check because the lookbehind and lookahead must be fixed length.
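To make the failure concrete, here is a minimal sketch that runs the attempted pattern. It assumes Python's built-in `re` module rather than the engine used in the question; the variable names are illustrative only. The lookarounds inspect only the text directly next to each ampersand, so neither ampersand is excluded.

```python
import re

text = "This is short & [tag]fun & interesting[/tag]."

# The attempted pattern: the lookbehind and lookahead only test the
# characters immediately before and after each "&".
attempt = re.compile(r"(?<!\[tag\])&(?!\[/tag\])")

result = attempt.sub("@", text)
print(result)  # This is short @ [tag]fun @ interesting[/tag].
```

Both ampersands are replaced, because neither one sits directly after `[tag]` or directly before `[/tag]`.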

Is there a regular expression that will accomplish what I want here?

This does the job, even with nested tags:

  • Find: \[(\w+)\].+?\[/\1\](*SKIP)(*FAIL)|&
  • Replace: @

Demo & explanation

How it works:

  • \[(\w+)\].+?\[/\1\] tries to match an opening tag and its matching closing tag, with some data in between
  • (*SKIP)(*FAIL) if a tag is found, discard that match and resume searching after it
  • | otherwise
  • & match an ampersand. At this point, we are sure it is not inside a tag.

Unfortunately, this doesn't work in Java, whose regex engine does not support the (*SKIP)(*FAIL) verbs; that requirement was only added to the question after I answered.
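For engines without these verbs, one possible workaround is to keep the same alternation and decide in a replacement callback which branch matched. The sketch below uses Python's `re` module, which also lacks (*SKIP)(*FAIL). The names `TAG_OR_AMP` and `replace_outside_tags` are hypothetical, and it assumes, as the pattern above does, that each tag pair sits on a single line.

```python
import re

# Same alternation as above, minus the verbs: a complete tag pair is
# tried first, otherwise a bare ampersand.
TAG_OR_AMP = re.compile(r"\[(\w+)\].+?\[/\1\]|&")

def replace_outside_tags(text: str, replacement: str = "@") -> str:
    """Replace ampersands that are not inside a [name]...[/name] pair."""
    def swap(match: re.Match) -> str:
        # Group 1 is set only when the tag branch matched; in that case
        # the tagged span is returned unchanged.
        if match.group(1) is not None:
            return match.group(0)
        return replacement
    return TAG_OR_AMP.sub(swap, text)

print(replace_outside_tags("This is short & [tag]fun & interesting[/tag]."))
# This is short @ [tag]fun & interesting[/tag].
```

Because the tag branch consumes the whole tagged span, the ampersands inside it are never offered to the second branch. The same idea should carry over to other engines that accept a function as the replacement.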


 