
Regular expression to parse an option string in Python

I can't seem to create the correct regular expression to extract the correct tokens from my string. Padding the beginning of the string with a space generates the correct output, but seems less than optimal:

>>> import re
>>> s = '-edge_0triggered a-b | -level_Sensitive c-d | a-b-c'
>>> re.findall(r'\W(-[\w_]+)',' '+s)
['-edge_0triggered', '-level_Sensitive'] # correct output

Here are some of the regular expressions I've tried. Does anyone have a regex suggestion that generates the correct output without changing the original string?

>>> re.findall(r'(-[\w_]+)',s)
['-edge_0triggered', '-b', '-level_Sensitive', '-d', '-b', '-c']
>>> re.findall(r'\W(-[\w_]+)',s)
['-level_Sensitive']

Change the leading part of the pattern to accept either the start-of-string anchor or a non-word character, instead of only a non-word character:

>>> re.findall(r'(?:^|\W)(-[\w_]+)', s)
['-edge_0triggered', '-level_Sensitive']

The ?: at the beginning of the group makes it non-capturing, so the regex engine does not report it as a group in the results.
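To see why the ?: matters here, compare what re.findall returns with and without it. This is a small sketch using the string from the question; the variable names capturing and non_capturing are only for illustration.

```python
import re

s = '-edge_0triggered a-b | -level_Sensitive c-d | a-b-c'

# Two capturing groups: findall returns one tuple per match,
# including whatever the prefix group matched ('' at the start, ' ' later).
capturing = re.findall(r'(^|\W)(-[\w_]+)', s)

# Non-capturing prefix: only the option group is reported.
non_capturing = re.findall(r'(?:^|\W)(-[\w_]+)', s)

print(capturing)      # [('', '-edge_0triggered'), (' ', '-level_Sensitive')]
print(non_capturing)  # ['-edge_0triggered', '-level_Sensitive']
```

With a single capturing group, findall returns a flat list of that group's text, which is the shape the question asks for.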

r'(?:^|\W)(-\w+)'

\w already includes the underscore, so the [\w_] character class is redundant.
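A quick way to check this claim is to run both forms of the pattern against the string from the question and compare the results. The names with_class and without_class are illustrative only.

```python
import re

s = '-edge_0triggered a-b | -level_Sensitive c-d | a-b-c'

# \w on its own matches an underscore.
underscore_is_word_char = re.fullmatch(r'\w', '_') is not None

# The explicit class [\w_] and plain \w give identical results.
with_class = re.findall(r'(?:^|\W)(-[\w_]+)', s)
without_class = re.findall(r'(?:^|\W)(-\w+)', s)

print(underscore_is_word_char)      # True
print(with_class == without_class)  # True
```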

You could use a negative-lookbehind:

re.findall(r'(?<!\w)(-\w+)', s)

The (?<!\w) part means "match only if not preceded by a word character".
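As a sketch of how the lookbehind version might be wrapped for reuse, the helper below compiles the pattern once. The name find_options is hypothetical, not something from the question. Because a lookbehind consumes no characters, the capturing group can be dropped here and findall then returns the whole match.

```python
import re

# (?<!\w) asserts that the character before '-' is not a word character;
# at the very start of the string there is no preceding character, so it passes.
OPTION_RE = re.compile(r'(?<!\w)-\w+')

def find_options(text):
    """Return tokens like '-name' that are not glued to a preceding word."""
    return OPTION_RE.findall(text)

s = '-edge_0triggered a-b | -level_Sensitive c-d | a-b-c'
print(find_options(s))      # ['-edge_0triggered', '-level_Sensitive']
print(find_options('a-b'))  # []
```

No padding of the input string is needed, since the start of the string already satisfies the lookbehind.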

