
Optional groups in regular expressions

My current regex:

Products\\/([0-9-]+)\\.aspx

This matches product URLs like products/1488382.aspx, products/1239499-2881839.aspx, etc. There is separate checking to confirm that the retrieved product actually exists (i.e., someone entering -4-9381--2 would pass the regex, but no such product exists).

There are other URLs that have the following format: products/some-meta-description-1488382.aspx

How can I match this some-meta-description- part? I want to match the entire products/... URL and then remove everything but the 1488382.aspx.

products\\/(?:some-meta-description-)?([0-9-]+)\\.aspx

Explanation on Regex 101

Thanks to Yann Moisan for the inspiration, but I think he slightly underestimated the requirements.

If you want to skip an optional arbitrary alphanumeric prefix and then match one numeric-only final group as the real ID, I think it would look like:

Products\\/(?:[0-9a-zA-Z-]*-)?([0-9-]+)\\.aspx (or drop the A-Z if the match is case-insensitive)
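To see what this first pattern captures, here is a minimal C# sketch. The class name ProductUrlDemo and the helper ExtractLastId are made up for illustration, and RegexOptions.IgnoreCase is an assumption so that "Products" in the pattern matches the lowercase sample URLs. The pattern is written as a verbatim string, so the backslashes are not doubled.

```csharp
using System;
using System.Text.RegularExpressions;

static class ProductUrlDemo
{
    // Same pattern as above, as a verbatim string (single backslashes).
    // IgnoreCase is an assumption, not something the question states.
    static readonly Regex LastIdPattern = new Regex(
        @"Products\/(?:[0-9a-zA-Z-]*-)?([0-9-]+)\.aspx",
        RegexOptions.IgnoreCase);

    // Hypothetical helper: returns the captured numeric group,
    // or null if the URL does not match at all.
    public static string ExtractLastId(string url)
    {
        Match m = LastIdPattern.Match(url);
        return m.Success ? m.Groups[1].Value : null;
    }

    static void Main()
    {
        string[] urls =
        {
            "products/1488382.aspx",
            "products/some-meta-description-1488382.aspx",
            "products/1239499-2881839.aspx",
        };
        foreach (string url in urls)
        {
            Console.WriteLine(url + " -> " + ExtractLastId(url) + ".aspx");
        }
        // products/1488382.aspx -> 1488382.aspx
        // products/some-meta-description-1488382.aspx -> 1488382.aspx
        // products/1239499-2881839.aspx -> 2881839.aspx
    }
}
```

Note the last line of output: because the optional prefix is greedy and also accepts digits, it swallows 1239499- and only the final numeric group is captured.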

But if you want to match multiple numeric-only groups after an optional arbitrary alphanumeric prefix, it gets a bit more complicated:

Products\\/(?:[0-9a-zA-Z-]*[a-zA-Z][0-9a-zA-Z]*-+)?([0-9-]+)\\.aspx

The idea here is that it looks for an arbitrary alphanumeric prefix up through the last group containing a letter, and then expects one or more dashes (all of that is optional). Those dashes separate the prefix from the capture, which is any number of numeric groups separated by dashes. If you need the numeric groups individually, you can take the captured string of all the numeric groups and do a Split on '-' in C#.
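The capture-then-Split idea could be sketched in C# roughly as follows. The names ProductIdsDemo and ExtractIds are hypothetical, RegexOptions.IgnoreCase is again an assumption, and RemoveEmptyEntries is my own choice so that doubled dashes do not produce empty strings.

```csharp
using System;
using System.Text.RegularExpressions;

static class ProductIdsDemo
{
    // Second pattern from above, as a verbatim string. The optional prefix
    // must contain a letter, so purely numeric groups are left for the capture.
    static readonly Regex AllIdsPattern = new Regex(
        @"Products\/(?:[0-9a-zA-Z-]*[a-zA-Z][0-9a-zA-Z]*-+)?([0-9-]+)\.aspx",
        RegexOptions.IgnoreCase);

    // Hypothetical helper: returns each numeric group separately,
    // or an empty array if the URL does not match.
    public static string[] ExtractIds(string url)
    {
        Match m = AllIdsPattern.Match(url);
        if (!m.Success)
        {
            return new string[0];
        }
        // Split the captured "1239499-2881839" on dashes, dropping empty pieces.
        return m.Groups[1].Value.Split(
            new[] { '-' }, StringSplitOptions.RemoveEmptyEntries);
    }

    static void Main()
    {
        Console.WriteLine(string.Join(",",
            ExtractIds("products/some-meta-description-1239499-2881839.aspx")));
        // prints 1239499,2881839
        Console.WriteLine(string.Join(",",
            ExtractIds("products/1239499-2881839.aspx")));
        // prints 1239499,2881839
        Console.WriteLine(string.Join(",",
            ExtractIds("products/1488382.aspx")));
        // prints 1488382
    }
}
```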

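This does not account for a meta-description that itself ends in a numeric group (separated by a dash from the last letter-containing group); that group would be captured as part of the ID. For example, top-10-1488382 would be captured as 10-1488382.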

This also relies on the (?: ... ) syntax for a non-capturing group, which C#/.NET regular expressions do support. Regex 101 did not seem to have an option for the .NET flavor specifically. You can always remove the ?: and just ignore the meta-description match (or perhaps you actually might sometimes want it as well); the numeric match then becomes group 2 instead of group 1. The ? after the ) is the "optional" quantifier.
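As a quick check of how ?: affects group numbering in .NET, here is a small sketch using the first pattern with and without it. GroupIndexDemo and GroupValue are illustrative names, and RegexOptions.IgnoreCase is an assumption.

```csharp
using System;
using System.Text.RegularExpressions;

static class GroupIndexDemo
{
    // Hypothetical helper: value of the given group for the first match of
    // pattern in url (an empty string if that group did not participate).
    public static string GroupValue(string pattern, string url, int index)
    {
        return Regex.Match(url, pattern, RegexOptions.IgnoreCase)
                    .Groups[index].Value;
    }

    static void Main()
    {
        string url = "products/some-meta-description-1488382.aspx";

        // With (?: ... ) the prefix is matched but not captured: the ID is group 1.
        string nonCapturing = @"Products\/(?:[0-9a-zA-Z-]*-)?([0-9-]+)\.aspx";
        Console.WriteLine(GroupValue(nonCapturing, url, 1)); // 1488382

        // Without ?: the prefix becomes group 1 and the ID moves to group 2.
        string capturing = @"Products\/([0-9a-zA-Z-]*-)?([0-9-]+)\.aspx";
        Console.WriteLine(GroupValue(capturing, url, 1)); // some-meta-description-
        Console.WriteLine(GroupValue(capturing, url, 2)); // 1488382
    }
}
```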
