
Match specific pattern with regular expression

I have to write a regex that matches exactly this kind of pattern. Here is an example:

JK+6.00,PP*2,ZZ,GROUPO

with one match for every block, like this:

Match 1

  • JK
  • +
  • 6.00

Match 2

  • PP
  • *
  • 2

Match 3

  • ZZ

Match 4

  • GROUPO

So: comma-separated blocks, each made of 2 to 12 capital letters, optionally followed by an operator (+ or *) and a positive number of the form 0[.0[0]] (that is, at most two decimals).

This pattern successfully parses a single block:

(?P<block>(?P<subject>[A-Z]{2,12})(?:(?P<operation>\*|\+)(?P<value>\d+(?:.?\d{1,2})?))?)

It contains the subject group:

(?P<subject>[A-Z]{2,12})

The value

(?P<value>\d+(?:.?\d{1,2})?)

And the whole optional operation section (with the value inside):

(?:(?P<operation>\*|\+)(?P<value>\d+(?:.?\d{1,2})?))?

But the regex must fail if the string doesn't match the pattern EXACTLY, and that's the problem.

I tried this, but it doesn't work:

^(?P<block>(?P<subject>[A-Z]{2,12})(?:(?P<operation>\*|\+)(?P<value>\d+(?:.?\d{1,2})?))?)(?:,(?P=block))*$

Any suggestions?

PS: I use Python's re module.

I'd personally go for a two-step solution: first check that the whole string fits your pattern, then extract the groups you want.
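
As background on why the single-pattern attempt in the question cannot work: in Python's re, (?P=block) is a backreference, so it matches the exact text that the group named block captured earlier rather than applying the group's pattern again. The standard re module has no way to re-apply a named group's pattern, so the block pattern has to be written out again or the work split into two steps. A minimal sketch with a shortened pattern (the names attempt, same_blocks and different_blocks are only illustrative):

```python
import re

# Shortened version of the attempt from the question: one named group,
# then zero or more ",<backreference>" repetitions.
attempt = re.compile(r'^(?P<block>[A-Z]{2,12})(?:,(?P=block))*$')

# The backreference repeats the captured text, so identical blocks match...
same_blocks = attempt.match('JK,JK,JK') is not None

# ...but a different block does not, even though it fits [A-Z]{2,12}.
different_blocks = attempt.match('JK,PP') is not None

print(same_blocks, different_blocks)
```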

For the overall check you might want to use ^(?:[A-Z]{2,12}(?:[*+]\d+(?:\.\d{1,2})?)?(?:,|$))*$ as a pattern, which is basically your block pattern plus (?:,|$) to match the delimiters, wrapped in anchors.
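
One thing worth checking with the validation pattern above is how it treats edge cases: because every block may end in either a comma or the end of the string, and the whole group may repeat zero times, it also accepts a trailing comma and the empty string. The sketch below compares it with a stricter variant; that variant and the names BLOCK, lenient and strict are assumptions for illustration, not part of the original answer.

```python
import re

# One block: 2-12 capitals, optionally followed by + or * and a number
# with at most two decimals.
BLOCK = r'[A-Z]{2,12}(?:[*+]\d+(?:\.\d{1,2})?)?'

# Validation pattern from the answer above.
lenient = re.compile(r'^(?:' + BLOCK + r'(?:,|$))*$')

# Hypothetical stricter variant: one block, then ",block" repeated.
strict = re.compile(BLOCK + r'(?:,' + BLOCK + r')*')

samples = ['JK+6.00,PP*2,ZZ,GROUPO', 'JK,', '', 'JK+6.000']
results = {s: (bool(lenient.match(s)), bool(strict.fullmatch(s)))
           for s in samples}
for s, (lenient_ok, strict_ok) in results.items():
    print(repr(s), lenient_ok, strict_ok)
```

Whether 'JK,' and the empty string should count as valid depends on the input you expect; if they should be rejected, the stricter form is one way to do it.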

I have also adjusted your pattern a bit, to (?P<block>(?P<subject>[A-Z]{2,12})(?:(?P<operation>[*+])(?P<value>\d+(?:\.\d{1,2})?))?) . I have replaced \*|\+ with [*+] in your operation pattern and .? with \. in your value pattern, so that the decimal separator must be a literal dot.
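
To see why the escaped dot matters: an unescaped . matches any character (except a newline by default), so the value pattern from the question also accepts strings whose "decimal separator" is a letter or even a comma. A small sketch, with illustrative names loose_value and strict_value:

```python
import re

# Value pattern from the question: the dot is unescaped and optional.
loose_value = re.compile(r'\d+(?:.?\d{1,2})?')
# Value pattern with a literal, mandatory dot before the decimals.
strict_value = re.compile(r'\d+(?:\.\d{1,2})?')

candidates = ['6.00', '2', '6x00', '6,00']
loose_hits = [c for c in candidates if loose_value.fullmatch(c)]
strict_hits = [c for c in candidates if strict_value.fullmatch(c)]

print(loose_hits)
print(strict_hits)
```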

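A (very basic) Python implementation could look like this:

import re

# Sample input; named `text` to avoid shadowing the built-in `str`.
text = 'JK+6.00,PP*2,ZZ,GROUPO'
# Step 1: the whole string must consist of valid comma-separated blocks.
full_pattern = r'^(?:[A-Z]{2,12}(?:[*+]\d+(?:\.\d{1,2})?)?(?:,|$))*$'
# Step 2: pull the named groups out of each block.
extract_pattern = r'(?P<block>(?P<subject>[A-Z]{2,12})(?:(?P<operation>[*+])(?P<value>\d+(?:\.\d{1,2})?))?)'
if re.fullmatch(full_pattern, text):
    for match in re.finditer(extract_pattern, text):
        print(match.groups())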

http://ideone.com/kMl9qu
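
The same two steps can also be wrapped in a small helper that fails loudly on invalid input, which is what the question asks for. This is only a sketch: the function name parse_blocks, the module-level names, and the choice to raise ValueError are assumptions, and the validation here uses a "block, then comma plus block" form that rejects a trailing comma and the empty string.

```python
import re

# Unnamed block for validation; named groups are only needed for extraction.
_BLOCK = r'[A-Z]{2,12}(?:[*+]\d+(?:\.\d{1,2})?)?'
_FULL = re.compile(_BLOCK + r'(?:,' + _BLOCK + r')*')
_EXTRACT = re.compile(
    r'(?P<subject>[A-Z]{2,12})'
    r'(?:(?P<operation>[*+])(?P<value>\d+(?:\.\d{1,2})?))?'
)


def parse_blocks(text):
    """Return (subject, operation, value) tuples, or raise ValueError."""
    # Step 1: reject anything that is not exactly a list of blocks.
    if not _FULL.fullmatch(text):
        raise ValueError(f'not a valid block list: {text!r}')
    # Step 2: extract the groups; missing optional groups come back as None.
    return [m.group('subject', 'operation', 'value')
            for m in _EXTRACT.finditer(text)]


parsed = parse_blocks('JK+6.00,PP*2,ZZ,GROUPO')
print(parsed)
```

Blocks without an operation come back with None in the operation and value positions, so the caller can tell "ZZ" apart from "ZZ*1".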

I'm guessing this is the pattern you were looking for:

(2 different letters)+(time stamp),(2 of the same letter)*(1 number),(2 of the same letter),(a string)

If that's the case, this regex would do the trick:

^(\w{2}\+\d{1,2}\.\d{2}),((\w)\3\*\d),((\w)\5),(\w+)$

Demo: https://regex101.com/r/8B3C6e/2
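
A quick way to try this pattern in Python is sketched below. Note that it reads the example as one fixed four-field shape, which is narrower than the variable-length block list described in the question, so a shorter but otherwise valid list such as ZZ,GROUPO is rejected. The variable names are illustrative only.

```python
import re

# Fixed-shape pattern from this answer; \3 and \5 are backreferences that
# force the second letter of a field to repeat the first.
fixed_shape = re.compile(
    r'^(\w{2}\+\d{1,2}\.\d{2}),((\w)\3\*\d),((\w)\5),(\w+)$'
)

sample = fixed_shape.match('JK+6.00,PP*2,ZZ,GROUPO')
fields = sample.group(1, 2, 4, 6)
print(fields)

# A shorter list of valid blocks does not fit the fixed shape.
shorter = fixed_shape.match('ZZ,GROUPO')
print(shorter)
```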
