Regex: a number vs. a backreference to a capture group

I've been studying regular expressions, and I'm scratching my head on this one. On this page (https://www.regular-expressions.info/conditional.html) I see that, in a conditional regex, a reference to a numbered capture group is written as just a number. For example,

(a)?b(?(1)c|d)

How does the regex engine know that we aren't trying to match the literal character "1" rather than refer to the 1st capture group? Earlier in the lessons I learned that a backreference is escaped, such as \1, \2, etc.

As per the regex tutorial you're following:

A special construct (?ifthen|else) allows you to create conditional regular expressions. If the if part evaluates to true, then the regex engine will attempt to match the then part. Otherwise, the else part is attempted instead. The syntax consists of a pair of parentheses. The opening bracket must be followed by a question mark, immediately followed by the if part, immediately followed by the then part. This part can be followed by a vertical bar and the else part. You may omit the else part, and the vertical bar with it.

Alternatively, you can check in the if part whether a capturing group has taken part in the match thus far. Place the number of the capturing group inside parentheses, and use that as the if part.
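In other words, the 1 is never read as text here: a parenthesis that follows (? directly opens a condition, and the number inside it names the group to test. As a rough illustration, here is a small sketch using Python's re module, which accepts this conditional syntax. The variable names and test strings are only assumptions for the example; the patterns themselves come from this thread.

```python
import re

# The pattern from the question. Inside (?(1)...), the 1 is a group number
# used as a condition, not a character to match.
conditional = re.compile(r"(a)?b(?(1)c|d)")

# fullmatch requires the entire string to match the pattern.
results = {text: bool(conditional.fullmatch(text))
           for text in ["abc", "bd", "abd", "bc"]}
print(results)

# For contrast: \1 is a backreference that re-matches whatever group 1
# captured, and a bare 1 outside a condition is just a literal digit.
backreference = re.compile(r"(a)b\1")
literal_digit = re.compile(r"ab1")
print(bool(backreference.fullmatch("aba")))
print(bool(literal_digit.fullmatch("ab1")))

# When searching inside a longer string, the unanchored pattern can match
# the "bd" inside "abd"; with \b on both sides it cannot.
bounded = re.compile(r"\b(a)?b(?(1)c|d)\b")
print(conditional.search("abd"))
print(bounded.search("abd"))
```

Only abc and bd pass the full-string check: when the a is captured the engine requires c, and when it is not captured the engine requires d.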

RegEx Demo of \b(a)?b(?(1)c|d)\b

Note that I have added word boundaries to avoid partially matching a string like abd.

Your second question is this:

What if someone actually wanted to match the literal 1 this way?

valid input: 1c or d
invalid input: 1d

That would be:

\b(1)?(?(1)c|d)\b
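To check this against the inputs listed above, a minimal Python sketch could look like the following. The variable name and the extra test string c are assumptions added for illustration. Note that the first (1) is an ordinary capturing group containing the literal digit, while the second (1), directly after (?, is the condition on that group.

```python
import re

# Group 1 captures a literal "1" if present; the condition (?(1)c|d)
# then requires "c" when it was captured and "d" when it was not.
literal_one = re.compile(r"\b(1)?(?(1)c|d)\b")

for text in ["1c", "d", "1d", "c"]:
    match = literal_one.search(text)
    print(text, match.group(0) if match else None)
```

The word boundaries matter here as well: without them, the d inside 1d would match on its own, because group 1 can simply be skipped.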
