
Python regular expression to match print pages and their ranges

I have this regular expression that matches print page specifications (e.g. 6, 1-6, 6:4, 10-20/3):

^([1-9]\d*)((?<=\d)[-]|[:]?)((?<=-|:)?[1-9]\d*)?(?:(?<=)([/]?))([1-9]\d*)?$

It currently matches inputs such as 2048-4096/100 and 15:10/3, as intended.

However, it also matches 5/3. A / should only be allowed after a colon or dash followed by some digits, as in 2048-4096/100.

In place of the empty positive lookbehind in the expression above, I've tried (?:(?<=[:|-]\d)([/]?)), but that makes all my tests fail, with no matches at all. I've also tried (?:(?<=[:|-]\d*)([/]?)), but quantifiers are not allowed inside a lookbehind.

What can I put in the empty positive lookbehind so that it checks that a : or - followed by digits comes before the / ?
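As a rough illustration of why the two lookbehind attempts in the question behave that way, here is a small sketch using Python's standard `re` module. It isolates just the lookbehind plus the slash (an assumption made for brevity; the question embeds them in a longer pattern). The fixed-width version only looks back exactly two characters, so it cannot see past a multi-digit number, and the version with `\d*` is rejected when the pattern is compiled.

```python
import re

# Fixed-width attempt: requires ':', '|' or '-' exactly two characters
# before the slash, i.e. a single-digit number in between.
fixed = re.compile(r"(?<=[:|-]\d)/")
single_digit_ok = fixed.search("1-2/3") is not None
multi_digit_ok = fixed.search("10-20/3") is not None

# Variable-width attempt: Python's re refuses to compile a lookbehind
# whose length is not fixed.
try:
    re.compile(r"(?<=[:|-]\d*)/")
    compile_error = None
except re.error as exc:
    compile_error = str(exc)

print(single_digit_ok, multi_digit_ok, compile_error is not None)
```

This suggests a lookbehind is the wrong tool here: the digits before the slash can have any length, which a fixed-width lookbehind cannot express.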

You can use

^([1-9]\d*)(?:([-:])([1-9]\d*)(?:(/)([1-9]\d*))?)?$

Details:

  • ^ - start of string
  • ([1-9]\d*) - Group 1: a non-zero digit and then zero or more digits
  • (?:([-:])([1-9]\d*)(?:(/)([1-9]\d*))?)? - an optional occurrence of
    • ([-:]) - Group 2: - or :
    • ([1-9]\d*) - Group 3: a non-zero digit and then zero or more digits
    • (?:(/)([1-9]\d*))? - an optional occurrence of
      • (/) - Group 4: /
      • ([1-9]\d*) - Group 5: a non-zero digit and then zero or more digits
  • $ - end of string.
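A quick way to check the pattern is to run it against the inputs mentioned in the question. The sketch below uses Python's `re` module; the names `PAGE_SPEC`, `samples` and `results` are illustrative, and the last two samples ("0-5" and "1-") are extra cases added here as assumptions about what should be rejected.

```python
import re

# The pattern from the answer, unchanged.
PAGE_SPEC = re.compile(r"^([1-9]\d*)(?:([-:])([1-9]\d*)(?:(/)([1-9]\d*))?)?$")

samples = [
    "6", "1-6", "6:4", "10-20/3", "2048-4096/100", "15:10/3",  # should match
    "5/3",   # slash without a preceding range: should not match
    "0-5",   # leading zero: should not match
    "1-",    # dangling separator: should not match
]

# Map each sample to whether the whole string is accepted.
results = {s: PAGE_SPEC.match(s) is not None for s in samples}

for text, ok in results.items():
    print(f"{text!r}: {'match' if ok else 'no match'}")
```

Because the slash part is nested inside the optional separator part, the / can only appear once a - or : and a second number have already been consumed, so no lookbehind is needed.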

I kept all the groups intact, but at least the (/) group is redundant, since it can only ever capture a literal / .
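If the group numbering does not need to be preserved, one possible variant drops the capture around the slash and names the remaining groups. This is only a sketch: the group names (`first`, `sep`, `second`, `step`) and the helper `parse_page_spec` are hypothetical and not part of the original pattern.

```python
import re
from typing import Optional

# Same structure as the answer's pattern, but the slash is matched
# without capturing and the other groups are given (hypothetical) names.
PAGE_SPEC = re.compile(
    r"^(?P<first>[1-9]\d*)"
    r"(?:(?P<sep>[-:])(?P<second>[1-9]\d*)"
    r"(?:/(?P<step>[1-9]\d*))?)?$"
)

def parse_page_spec(text: str) -> Optional[dict]:
    """Return the parsed parts of a page spec, or None if it is invalid."""
    m = PAGE_SPEC.match(text)
    if m is None:
        return None
    # Convert numeric parts to int; leave the separator and missing parts as-is.
    return {
        name: int(value) if value is not None and value.isdigit() else value
        for name, value in m.groupdict().items()
    }

print(parse_page_spec("10-20/3"))
print(parse_page_spec("6"))
print(parse_page_spec("5/3"))
```

Parts that are absent from the input come back as None, so a caller can tell a bare page number from a range with a step.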
