Hi, I need to match cola xx
in :ca:cr:pr cola xx,
but also be able to get cola xx
when no :ca:cr:pr prefix
occurs. The number of tags starting with :
can vary, and so can their length.
>>> string
':ca:cr:pr cola xx'
>>> re.findall("\w+", string)
['ca', 'cr', 'pr', 'cola', 'xx']
>>> re.findall(":\w+", string)
[':ca', ':cr', ':pr']
>>> re.findall("^(:\w+)", string)
[':ca']
I also tried using lookbehinds ( http://runnable.com/Uqc1Tqv_MVNfAAGN/lookahead-and-lookbehind-in-regular-expressions-in-python-for-regex ), but without success.
>>> re.findall(r"(\s\w+)(?!:)",string)
[' cola', ' xx']
>>> string="cola"
>>> re.findall(r"(\s\w+)(?!:)",string)
[]
That is, when there are no tags and the string is only cola,
nothing is matched, because the pattern requires leading whitespace.
How can I improve my regex to work as expected?
Desired examples once more:
:c cola xx
-> cola xx
:ca:c cola xx
-> cola xx
:ca:cr:pr cola xx
-> cola xx
cola xx
-> cola xx
cola
-> cola
I believe something like this should work, if I understood your requirement correctly:
(?<!:)\b\w+
In code:
results = re.findall(r'(?<!:)\b\w+', string)
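A quick way to check this pattern is to run it over the inputs listed in the question. The sketch below is only an illustration; the names UNTAGGED_WORD and untagged_words are made up for this example and are not part of the answer.

```python
import re

# A word that starts at a word boundary and is not
# immediately preceded by a colon (i.e. is not a tag).
UNTAGGED_WORD = re.compile(r"(?<!:)\b\w+")

def untagged_words(text):
    """Return every word in text that is not a :tag."""
    return UNTAGGED_WORD.findall(text)

samples = [":c cola xx", ":ca:c cola xx", ":ca:cr:pr cola xx", "cola xx", "cola"]
for sample in samples:
    # Join the words back together to get the desired "cola xx" form.
    print(repr(sample), "->", " ".join(untagged_words(sample)))
```

The \b matters here: without it, the lookbehind alone would still allow a match to start in the middle of a tag, for example at the a in :ca.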
Why not simply replace every word that starts with a colon with an empty string?
result = re.sub(r":\w+\b", "", subject)
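Note that the substitution removes the tags but keeps the space that followed them, so the result may need trimming. A minimal sketch, assuming the tags only appear at the start of the string; the helper name strip_tags is hypothetical:

```python
import re

def strip_tags(subject):
    """Delete every :tag and trim the whitespace left behind."""
    # re.sub removes each ":word" run; .strip() drops the leading
    # space that used to separate the tags from the text.
    return re.sub(r":\w+\b", "", subject).strip()

print(strip_tags(":ca:cr:pr cola xx"))
print(strip_tags("cola"))
```

If tags could also occur in the middle of the text, this would leave doubled spaces there, which would need separate handling.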
I hope this works:
re.findall(r"(?<!:)\b(\w+)", string)
I would do something like this:
(?<!:)\b\w+(?:\s\w+)?
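A sketch of how this idea could be used from Python. It is written with a \b after the lookbehind so that the tail of a tag, such as the a in :ca, cannot start a match. The names PHRASE and first_untagged_phrase are hypothetical, and the pattern captures at most two words, which is an assumption based on the examples in the question.

```python
import re

# One untagged word, optionally followed by whitespace and a second word.
PHRASE = re.compile(r"(?<!:)\b\w+(?:\s\w+)?")

def first_untagged_phrase(text):
    """Return the first one- or two-word untagged phrase, or None."""
    match = PHRASE.search(text)
    return match.group(0) if match else None

for sample in [":ca:cr:pr cola xx", "cola xx", "cola"]:
    print(repr(sample), "->", first_untagged_phrase(sample))
```

Unlike the findall approaches, this returns the phrase as a single string, so no joining is needed.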