Normally in a regex you can write [regex]{n} to say that the preceding sub-expression must repeat exactly n times, or {n,m} to mean anywhere from n through m times.
What about a set of individual counts? For example, what if I wanted {4 or 8 or 12}?
Alternation will do the job:
A{4}|A{8}|A{12}
But if A is a big regex you end up duplicating a lot of it, which is not good. Don't some regex engines let you define a sub-regex and reuse it later? I'm interested in whether this exists, but I use .NET, which does not support it inside the regex.
Of course, nothing stops you from embedding a string variable from the host language several times in the regex.
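A minimal sketch of that host-language approach, written in Python rather than .NET purely for illustration. The sub-pattern `A` and the helper `repeat_any_of` are hypothetical names, not part of any library; the same string-building idea should carry over to other host languages.

```python
import re

# Hypothetical sub-pattern standing in for a "big" regex A.
A = r"[0-9a-f]{2}"

def repeat_any_of(sub, counts):
    """Build an alternation that repeats `sub` exactly n times for each n in counts."""
    # Longest count first, so the longest alternative is tried first.
    parts = [f"(?:{sub}){{{n}}}" for n in sorted(counts, reverse=True)]
    return "(?:" + "|".join(parts) + ")"

# The sub-pattern is written once and embedded three times by the helper.
pattern = re.compile(repeat_any_of(A, [4, 8, 12]))

print(bool(pattern.fullmatch("ab" * 8)))  # True
print(bool(pattern.fullmatch("ab" * 6)))  # False
```

Each alternative wraps the sub-pattern in a non-capturing group so that the quantifier applies to the whole sub-pattern and not just to its last element.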
Update 1
A{12}|A{8}|A{4}
can match something different than
A{4}|A{8}|A{12}
The former can be labeled greedy and the latter lazy.
The latter will match only the first 4 A's in AAAAAAAA, while the former will match all 8 A's.
The default behavior of a quantifier is greedy, but since you can't make this hand-made construct lazy by appending a ?, which order you choose simply depends on what you need. When you embed it in a larger regex you sometimes want the lazy behavior. When it is not embedded, the former is more than likely what you intended.
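The effect of the ordering can be checked with a short script. This is a sketch in Python, whose `re` module tries alternatives left to right like other backtracking engines; the variable names are illustrative only.

```python
import re

text = "A" * 8

longest_first = re.compile(r"A{12}|A{8}|A{4}")   # behaves like a greedy quantifier
shortest_first = re.compile(r"A{4}|A{8}|A{12}")  # behaves like a lazy quantifier

print(len(longest_first.match(text).group()))   # 8
print(len(shortest_first.match(text).group()))  # 4

# When the rest of the pattern forces a longer match (here an end anchor),
# the engine backtracks into the later alternatives, so the order no longer
# changes what is matched.
anchored = re.compile(r"^(?:A{4}|A{8}|A{12})$")
print(bool(anchored.match(text)))  # True
```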
{m,n} is just shorthand for repeated alternation. That is, A{4,5} is just short for AAAA|AAAAA. As Kevin points out in a comment, you may be able to represent some sets of lengths as a continuous range of concatenations, but in general that's not possible, and you have to fall back on explicit alternation. For example, any finite set of prime numbers (in unary notation) can be matched by a regular expression:
11|111|11111|1111111|11111111111 # Your hypothetical 1{2 or 3 or 5 or 7 or 11}
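Both cases can be sketched in Python. The first part assumes one reading of the range idea: because 4, 8, and 12 are evenly spaced, they can be written as one to three repetitions of A{4}. The second part builds the prime-length alternation programmatically. The names `stepped`, `explicit`, and `primes` are illustrative only.

```python
import re

# Evenly spaced counts: 4, 8 or 12 A's is 1 to 3 repetitions of A{4}.
stepped = re.compile(r"(?:A{4}){1,3}")
explicit = re.compile(r"A{4}|A{8}|A{12}")

# Both patterns accept exactly the same whole strings.
for n in range(15):
    s = "A" * n
    assert bool(stepped.fullmatch(s)) == bool(explicit.fullmatch(s))

# Irregular counts (a finite set of primes) need one alternative per length.
primes = re.compile("|".join("1" * p for p in [2, 3, 5, 7, 11]))
accepted = [n for n in range(14) if primes.fullmatch("1" * n)]
print(accepted)  # [2, 3, 5, 7, 11]
```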