Python Regex: excluding certain prefixes

Question

Given the following string:

s = '"foo" "bar2baz_foo" foo( bar2baz_foo( p_foo p_foo.'

I need a regex such that

re.findall(regex, s)

gives

['foo', 'bar2baz_foo', 'foo', 'bar2baz_foo']

So it matches the first four "words", without the quotes and parentheses, but not the last two (the ones prefixed with p_). I have tried a couple of different things, but nothing I come up with actually works.

Hope someone here can help.

Edit: I should add that I want to replace the matches with something else rather than just find them, i.e. I want to use re.sub and not re.findall. Also, in reality the string is the content of a text file and therefore much longer; I just extracted the relevant bits.

Answer 1

If you're not hell-bent on a pure regex solution, you could use The Greatest Regex Trick Ever: match what you don't want in an uncaptured alternative, and capture what you do want in group 1.

>>> s = '"foo" "bar2baz_foo" foo( bar2baz_foo( p_foo p_foo.'
>>> import re
>>> list(filter(None, re.findall(r'p_\w*|(\w+)', s)))  # list() is needed on Python 3, where filter returns an iterator
['foo', 'bar2baz_foo', 'foo', 'bar2baz_foo']
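
To see why the filtering step is needed, it may help to look at the raw findall result first. The sketch below reuses the string and pattern from the session above; the names raw and words are only illustrative.

```python
import re

s = '"foo" "bar2baz_foo" foo( bar2baz_foo( p_foo p_foo.'

# The left alternative consumes each p_ word without capturing it, so
# group 1 is empty for those matches. Only the right alternative fills
# group 1, and findall returns the contents of group 1 for every match.
raw = re.findall(r'p_\w*|(\w+)', s)
print(raw)    # ['foo', 'bar2baz_foo', 'foo', 'bar2baz_foo', '', '']

# Dropping the empty strings leaves only the wanted words.
words = [w for w in raw if w]
print(words)  # ['foo', 'bar2baz_foo', 'foo', 'bar2baz_foo']
```

The list comprehension does the same job as filter(None, ...) and returns a list on both Python 2 and Python 3.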

Small demo for usage in re.sub:

>>> re.sub(r'p_\w*|(\w+)', lambda m: 'WORD' if m.group(1) else m.group(), s)
'"WORD" "WORD" WORD( WORD( p_foo p_foo.'
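
If a single pattern is preferred after all, a negative lookahead at a word boundary appears to give the same result on this sample. This is an alternative sketch, not part of the answer above; the function name replace_words and its prefix parameter are hypothetical, chosen only for illustration.

```python
import re

def replace_words(text, replacement, prefix='p_'):
    """Replace every word in text except those starting with prefix.

    Hypothetical helper for illustration only.
    """
    # \b anchors at the start of a word, (?!...) rejects words that begin
    # with the prefix, and re.escape keeps the prefix literal.
    pattern = r'\b(?!' + re.escape(prefix) + r')\w+'
    # A function is passed as the replacement so that backslashes in
    # `replacement` are inserted literally instead of being interpreted.
    return re.sub(pattern, lambda m: replacement, text)

s = '"foo" "bar2baz_foo" foo( bar2baz_foo( p_foo p_foo.'
result = replace_words(s, 'WORD')
print(result)  # "WORD" "WORD" WORD( WORD( p_foo p_foo.
```

For the real use case, the same function could be applied to the whole content of the text file after reading it into a string.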

0 · accepted · 2017-10-07 15:34:05
