
RegEx to match tokens and keep delimiter

I am using this regex in VB.NET to match all tokens of a string and keep the delimiters, with tokens and delimiters in separate capturing groups:

([^~\\+\\:]*)([~\\+\\:])

Text1+Text2::Text4::Text6~Text1+Text2:Text3+Text4~

Output:

Text1 + Text2 : : Text4 : : Text6 ~ Text1 + Text2 : Text3 + Text4 ~

How can I achieve the same with ? as an escape character, so that a delimiter preceded by an odd number of ? is not treated as a delimiter?

Text1+Text2?:Text3~

should result in

Text1 + Text2?:Text3 ~

Thanks for your help

Try this:

((?:\?.|[^~+:])*)([~+:])
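
To see what the two capture groups return, here is a minimal sketch in Python. The question targets VB.NET; Python's re module is used only for illustration, since the pattern relies on syntax both engines share. The helper name tokenize is hypothetical.

```python
import re

# Pattern from the answer: group 1 is the token text, group 2 is the delimiter.
TOKEN_RE = re.compile(r"((?:\?.|[^~+:])*)([~+:])")

def tokenize(text):
    """Return a flat list alternating token text and delimiter."""
    parts = []
    for match in TOKEN_RE.finditer(text):
        parts.extend(match.groups())
    return parts

print(tokenize("Text1+Text2?:Text3~"))
# ['Text1', '+', 'Text2?:Text3', '~']
print(tokenize("Text1+Text2::Text4~"))
# ['Text1', '+', 'Text2', ':', '', ':', 'Text4', '~']
```

Empty tokens between adjacent delimiters are preserved, as in the original output. One caveat: an escaped delimiter is only protected when an unescaped delimiter follows later in the input. For an input that ends in an escaped delimiter, such as a?:, backtracking lets the ? match as ordinary text and the : match as a delimiter.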



This regex does not unescape anything: ?a stays ?a rather than becoming a, and ?? stays ?? rather than becoming ?, so you will need to do some post-processing without regex. However, a ? effectively escapes the next character. So ?: is not a delimiter, ??: is a delimiter (an escaped ? followed by :), and ???: is not a delimiter.
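
The odd/even rule and the post-processing step can be sketched as follows, again in Python for illustration only. The unescape and split_tokens helpers are hypothetical, and keeping a trailing lone ? unchanged is an assumption, since the answer does not say how that case should be handled.

```python
import re

TOKEN_RE = re.compile(r"((?:\?.|[^~+:])*)([~+:])")

def unescape(token):
    """Drop each escaping ? and keep the character that follows it (no regex)."""
    out = []
    chars = iter(token)
    for ch in chars:
        if ch == "?":
            # Assumption: a trailing lone ? has nothing to escape, so keep it.
            out.append(next(chars, "?"))
        else:
            out.append(ch)
    return "".join(out)

def split_tokens(text):
    """Return (unescaped token, delimiter) pairs."""
    return [(unescape(m.group(1)), m.group(2)) for m in TOKEN_RE.finditer(text)]

print(split_tokens("a?:b~"))    # [('a:b', '~')]
print(split_tokens("a??:b~"))   # [('a?', ':'), ('b', '~')]
print(split_tokens("a???:b~"))  # [('a?:b', '~')]
```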


Explanation:

(           (?# start capture group for text)
  (?:       (?# start non-capture group for repeating alternation)
    \?.     (?# match ? literally followed by any character -- escaping)
   |        (?# OR)
    [^~+:]  (?# match any character that is not a delimiter: ~, +, or :)
  )*        (?# repeat non-capture group 0 or more times)
)           (?# end capture group)
(           (?# start capture group for delimiter)
  [~+:]     (?# match any delimiter character ~, +, and :)
)           (?# end capture group)
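
The same layout can be kept in real code by compiling the pattern in a whitespace-insensitive mode. This is a sketch using Python's re.VERBOSE flag; .NET offers a comparable option, RegexOptions.IgnorePatternWhitespace. The names COMPACT and VERBOSE are chosen here for illustration.

```python
import re

COMPACT = re.compile(r"((?:\?.|[^~+:])*)([~+:])")

VERBOSE = re.compile(
    r"""
    (               # capture group 1: token text
      (?:           # non-capturing group, repeated
        \?.         # a ? followed by any character (escape pair)
       |            # or
        [^~+:]      # any single non-delimiter character
      )*            # zero or more times
    )
    ( [~+:] )       # capture group 2: the delimiter
    """,
    re.VERBOSE,
)

sample = "Text1+Text2?:Text3~"
print([m.groups() for m in VERBOSE.finditer(sample)])
# [('Text1', '+'), ('Text2?:Text3', '~')]
```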

