
Python RegEx for exact matches of brackets

I am trying to parse a string which is of the following format:

 text="some random string <inAngle> <anotherInAngle> [-option text] [-anotherOption <text>] [-option (Y|N)]" 

I want to split the string into three parts:

  1. Just the "some random string"
  2. Everything that is ONLY in angle brackets, i.e. inAngle and anotherInAngle above.
  3. Everything that is in square brackets.

If I use the RegEx

re.findall(r'\[(.+?)\]', text)

It gives everything I need within square brackets. If I use the same RegEx with angle brackets however,

re.findall(r'<(.+?)>', text)

It also returns text in angle brackets that is nested inside square brackets, for example "text" from [-anotherOption <text>] above. I do not want that. The RegEx for angle brackets should return only "inAngle" and "anotherInAngle". What would that RegEx be?
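For reference, a minimal reproduction of the behaviour described above. The variable name angle_matches is only illustrative:

```python
import re

text = ("some random string <inAngle> <anotherInAngle> "
        "[-option text] [-anotherOption <text>] [-option (Y|N)]")

# The lazy pattern has no notion of nesting, so it also picks up
# the <text> that sits inside [-anotherOption <text>].
angle_matches = re.findall(r'<(.+?)>', text)
print(angle_matches)  # ['inAngle', 'anotherInAngle', 'text']
```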

Also, how do I get only the first part, i.e. "some random string"? This part can be 2 or 3 words long.

You can simply disregard everything between square brackets before searching for things in angle brackets:

interm = re.sub(r'\[(.*?)\]', '', text)
re.findall(r'<(.+?)>', interm)

outputs

['inAngle', 'anotherInAngle']

Then, for the first part, match everything up to the first [ or <. Granted, this won't work if the first part is itself allowed to contain either of these symbols:

re.findall(r'([^<\[]+)', text)[0]

outputs

some random string 
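Putting the steps of this answer together, here is a minimal sketch that returns all three parts at once. The function name split_parts and the final strip() (which drops the trailing space seen in the output above) are my own additions, not part of the answer:

```python
import re

text = ("some random string <inAngle> <anotherInAngle> "
        "[-option text] [-anotherOption <text>] [-option (Y|N)]")

def split_parts(s):
    """Return (leading text, angle-bracket items, square-bracket items)."""
    # Part 3: everything inside square brackets.
    square = re.findall(r'\[(.+?)\]', s)
    # Part 2: remove the square-bracket groups first, so that nested
    # <...> inside them can no longer match.
    without_square = re.sub(r'\[.*?\]', '', s)
    angle = re.findall(r'<(.+?)>', without_square)
    # Part 1: everything before the first '<' or '['.
    m = re.match(r'[^<\[]+', s)
    lead = m.group(0).strip() if m else ''
    return lead, angle, square

lead, angle, square = split_parts(text)
print(lead)    # some random string
print(angle)   # ['inAngle', 'anotherInAngle']
print(square)  # ['-option text', '-anotherOption <text>', '-option (Y|N)']
```

Like the original snippets, this assumes square brackets are never nested inside each other.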

Check whether this regex captures what you need:

\s*([^><[\]]+\b)|\[([^]]*)]|<([^>]*)>
  • \s* preceded by optional whitespace
  • ([^><[\]]+\b) Group 1: Any non-bracket characters up to a \b (remove the \b if undesired)
  • |\[([^]]*)] or Group 2: What's inside square brackets
  • |<([^>]*)> or Group 3: What's inside angle brackets

See demo at regex101 (use "code generator" if needed)
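One way this pattern could be used from Python is sketched below, sorting each match by the group that captured it. The brackets inside the character classes are escaped here, which should be equivalent to the pattern above; the names PATTERN and classify are hypothetical:

```python
import re

text = ("some random string <inAngle> <anotherInAngle> "
        "[-option text] [-anotherOption <text>] [-option (Y|N)]")

# Same three alternatives as above, with '[' and ']' escaped
# inside the character classes for readability.
PATTERN = re.compile(r'\s*([^><\[\]]+\b)|\[([^\]]*)\]|<([^>]*)>')

def classify(s):
    """Sort every match into the bucket of the group that captured it."""
    lead, square, angle = [], [], []
    for m in PATTERN.finditer(s):
        if m.group(1) is not None:
            lead.append(m.group(1))    # plain text outside any brackets
        elif m.group(2) is not None:
            square.append(m.group(2))  # contents of [...]
        else:
            angle.append(m.group(3))   # contents of top-level <...>
    return lead, square, angle

lead, square, angle = classify(text)
print(lead)    # ['some random string']
print(square)  # ['-option text', '-anotherOption <text>', '-option (Y|N)']
print(angle)   # ['inAngle', 'anotherInAngle']
```

Because the square-bracket alternative consumes a whole [...] group in one match, the <text> nested inside it is never offered to the angle-bracket alternative.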

<(.+?)>(?![^\[]*\])|\[(.+?)\]|((?!\s+)[^\[\]<>]+)

You can simply use this pattern with re.findall. See the demo:

https://regex101.com/r/hE4jH0/10
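As a rough sketch of how that pattern behaves with re.findall: with three capture groups, each match comes back as a 3-tuple in which only one slot is non-empty. The helper name split_with_lookahead and the strip() on the plain-text group are assumptions added for illustration:

```python
import re

text = ("some random string <inAngle> <anotherInAngle> "
        "[-option text] [-anotherOption <text>] [-option (Y|N)]")

PATTERN = re.compile(r'<(.+?)>(?![^\[]*\])|\[(.+?)\]|((?!\s+)[^\[\]<>]+)')

def split_with_lookahead(s):
    """Return (leading text pieces, angle items, square items)."""
    lead, angle, square = [], [], []
    # findall yields one (group1, group2, group3) tuple per match;
    # groups that did not take part in the match are empty strings.
    for in_angle, in_square, plain in PATTERN.findall(s):
        if in_angle:
            angle.append(in_angle)
        elif in_square:
            square.append(in_square)
        elif plain:
            # Group 3 keeps trailing whitespace, so trim it here.
            lead.append(plain.strip())
    return lead, angle, square

lead, angle, square = split_with_lookahead(text)
print(lead)    # ['some random string']
print(angle)   # ['inAngle', 'anotherInAngle']
print(square)  # ['-option text', '-anotherOption <text>', '-option (Y|N)']
```

The negative lookahead (?![^\[]*\]) rejects a <...> match when a ] can be reached without first crossing a [, which is how the pattern recognises that the match sits inside square brackets.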
