As the title suggests, I would like to split values on the | character, except when the | character is nested in brackets, as in [|].
For example, taking the text:
H3609|E1.7|E1.3|D09[7|9]
I would like to split it into ["H3609", "E1.7", "E1.3", "D09[7|9]"].
So far I have tried something very basic like [A-z0-9\.]*, which (assuming Python and re.findall()) gives back:
["H3609", "E1.7", "E1.3", "D09[7", "9]"]
Any suggestions?
Thanks in advance!
You can use
re.findall(r'(?:\[[^][]*]|[^][|])+', text)
See the regex demo.
Details:
(?: - start of a non-capturing group that groups two patterns:
\[[^][]*] - a [ char, then any zero or more chars other than [ and ], and then a ] char
| - or
[^][|] - any char other than ], [ and |
)+ - end of the group; repeat matching the group patterns one or more times.
See a Python demo:
import re
text = 'H3609|E1.7|E1.3|D09[7|9]'
print(re.findall(r'(?:\[[^][]*]|[^][|])+', text))
# => ['H3609', 'E1.7', 'E1.3', 'D09[7|9]']
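The breakdown above can also be written directly into the pattern with re.VERBOSE. This is only a sketch: the names FIELD, split_fields and split_keep_empty are made up for illustration, and it assumes brackets are balanced and never nested. It also shows one difference worth knowing: the findall approach silently drops empty fields, while a re.split variant (a different technique, using a negative lookahead) keeps them.

```python
import re

# Same pattern as in the answer, spelled out with re.VERBOSE.
# Assumption: brackets are balanced and not nested.
FIELD = re.compile(r"""
    (?:
        \[ [^][]* ]   # a bracketed chunk: [, chars other than [ and ], then ]
      |               # or
        [^][|]        # a single char that is not ], [ or |
    )+
""", re.VERBOSE)


def split_fields(text):
    """Return the non-empty fields of text, keeping [...] chunks intact."""
    return FIELD.findall(text)


def split_keep_empty(text):
    """Alternative technique: split on each | that is not inside brackets.

    The negative lookahead rejects a | when a closing ] can be reached
    from it without crossing another bracket, i.e. when it sits inside [...].
    """
    return re.split(r"\|(?![^][]*])", text)


text = 'H3609|E1.7|E1.3|D09[7|9]'
print(split_fields(text))        # ['H3609', 'E1.7', 'E1.3', 'D09[7|9]']
print(split_keep_empty(text))    # ['H3609', 'E1.7', 'E1.3', 'D09[7|9]']

# Empty fields: findall skips them, split keeps them.
print(split_fields('a||b'))      # ['a', 'b']
print(split_keep_empty('a||b'))  # ['a', '', 'b']
```

Which one to use depends on whether empty fields between two | characters matter in your data.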