match all words not starting with ':' Python regex

Question

Hi, I need to match cola xx from :ca:cr:pr cola xx, but also be able to get cola xx when no :ca:cr:pr prefix occurs. The number of tags starting with : can vary, and so can their length.

>>> string
':ca:cr:pr cola xx'
>>> re.findall("\w+", string)
['ca', 'cr', 'pr', 'cola', 'xx']
>>> re.findall(":\w+", string)
[':ca', ':cr', ':pr']
>>> re.findall("^(:\w+)", string)
[':ca']

I also tried using lookbehinds ( http://runnable.com/Uqc1Tqv_MVNfAAGN/lookahead-and-lookbehind-in-regular-expressions-in-python-for-regex ), but without success.

>>> re.findall(r"(\s\w+)(?!:)",string)
[' cola', ' xx']
>>> string="cola"
>>> re.findall(r"(\s\w+)(?!:)",string)
[]

That is, when there are no tags and the string is only cola, nothing is matched.

How can I improve my regex to work as expected?

Desired examples once more:

:c cola xx -> cola xx

:ca:c cola xx -> cola xx

:ca:cr:pr cola xx -> cola xx

cola xx -> cola xx

cola -> cola

Answer 1

I believe something like this should work, if I understood your requirement correctly:

(?<!:)\b\w+

In code:

results = re.findall(r'(?<!:)\b\w+', string)
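
As a quick sanity check, here is a minimal sketch that runs this pattern over the examples from the question. The helper name untagged_words is made up for illustration, and joining the matches with a single space assumes the original spacing does not need to be preserved.

```python
import re

# Pattern from the answer: a word that starts at a word boundary
# and is not immediately preceded by a colon.
PATTERN = re.compile(r'(?<!:)\b\w+')


def untagged_words(text):
    """Return every word in text that is not part of a ':tag' (hypothetical helper)."""
    return PATTERN.findall(text)


examples = [
    ':c cola xx',
    ':ca:c cola xx',
    ':ca:cr:pr cola xx',
    'cola xx',
    'cola',
]

for example in examples:
    # Joining with a space is an assumption about the desired output format.
    print(example, '->', ' '.join(untagged_words(example)))
```

The \b matters here: it stops the match from starting in the middle of a tag, so only whole words that have no colon in front of them are returned.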

Answer 2

Why not simply replace every word that starts with a colon with an empty string?

result = re.sub(r":\w+\b", "", subject)
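
One thing to watch with this approach: the substitution removes the tags but leaves behind the whitespace that separated them from the text. A minimal sketch, assuming it is fine to trim both ends with str.strip(); the function name strip_tags is hypothetical.

```python
import re


def strip_tags(subject):
    """Remove every ':tag' and trim leftover whitespace (hypothetical helper)."""
    # Same substitution as in the answer, followed by strip() to drop
    # the space that used to separate the tags from the text.
    return re.sub(r":\w+\b", "", subject).strip()


print(repr(re.sub(r":\w+\b", "", ':ca:cr:pr cola xx')))  # leading space remains
print(repr(strip_tags(':ca:cr:pr cola xx')))
print(repr(strip_tags('cola')))
```

Unlike findall, this returns the remaining text as a single string, which is closer to the cola xx form asked for in the question.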

Answer 3

Hopefully this will work:

re.findall(r"(?<!:)(\w+)", string)
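
A caveat worth checking before relying on this: the lookbehind only inspects the single character before the start of the match, so without a word-boundary anchor the match can begin at the second character of a tag. The sketch below compares the pattern with and without \b; the variable names are illustrative only.

```python
import re

string = ':ca:cr:pr cola xx'

# Lookbehind only: matching may start in the middle of a tag,
# because 'a' in ':ca' is preceded by 'c', not by ':'.
without_boundary = re.findall(r"(?<!:)(\w+)", string)

# Adding \b forces the match to start at the beginning of a word.
with_boundary = re.findall(r"(?<!:)\b(\w+)", string)

print(without_boundary)  # ['a', 'r', 'r', 'cola', 'xx']
print(with_boundary)     # ['cola', 'xx']
```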

Answer 4

I would do something like this:

(?<!:)\w+(?:\s\w+)?
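
A hedged sketch of how this idea could be used to get the phrase back as one string. Note that the \b in the pattern below is an addition, not part of the pattern above; it is assumed here so that a match cannot start inside a tag. The function name phrases is hypothetical.

```python
import re

# Assumption: \b is added to the answer's pattern so matching
# starts only at the beginning of a word.
PHRASE = re.compile(r"(?<!:)\b\w+(?:\s\w+)?")


def phrases(text):
    """Return untagged words, pairing each with one following word if present."""
    return PHRASE.findall(text)


print(phrases(':ca:cr:pr cola xx'))  # ['cola xx']
print(phrases('cola'))               # ['cola']
```

Because the optional group allows only one extra word, longer text is split into chunks of at most two words, so this fits the examples in the question but not arbitrary sentences.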

4 answers

solution1
5 ACCEPTED 2014-07-02 10:58:04

solution2
0 2014-07-02 10:57:59

solution3
0 2014-07-02 10:58:36

solution4
0 2014-07-02 10:59:07
