Python regular expression matching: ## ##

Question

I'm searching a file line by line for the occurrence of ##random_string##. It works except for the case of multiple #...

import re

pattern = '##(.*?)##'
prog = re.compile(pattern)

string = 'lala ###hey## there'
result = prog.search(string)

# Use the captured text as the pattern to replace
print(re.sub(result.group(1), 'FOUND', string))

Desired Output:

"lala #FOUND there"

Instead I get the following because it's grabbing the whole ###hey##:

"lala FOUND there"

So how would I ignore any number of # at the beginning or end, and only capture "##string##"?

Answer 1

Match at least two hashes at either end:

pattern='##+(.*?)##+'
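
A minimal sketch of how this pattern behaves on the question's sample line, assuming only the standard re module; the variable names are illustrative. It also shows that substituting the whole match consumes every surrounding hash, including the extra leading one.

```python
import re

# Pattern from the answer above: two or more hashes on each side.
prog = re.compile(r'##+(.*?)##+')

line = 'lala ###hey## there'  # sample line from the question
match = prog.search(line)

inner = match.group(1)  # text between the two runs of hashes
whole = match.group(0)  # the full matched span, hashes included
print(inner)
print(whole)

# Replacing the full match removes all surrounding hashes.
replaced = prog.sub('FOUND', line)
print(replaced)
```

If the extra leading # has to survive in the output, a pattern that excludes # from the inner text is probably the simpler route.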

Answer 2

Your problem is with your inner match. You use ., which matches any character that isn't a line end, and that means it matches # as well. So when it gets ###hey##, it matches (.*?) to #hey.
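
To see this concretely, here is a small check of what the question's pattern consumes and captures on the sample line. The variable names are only for illustration.

```python
import re

line = 'lala ###hey## there'  # sample line from the question

# The question's pattern: a lazy dot between two pairs of hashes.
match = re.search(r'##(.*?)##', line)

full_span = match.group(0)  # everything the pattern consumed
captured = match.group(1)   # what (.*?) ended up holding

print(full_span)
print(captured)
```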

The easy solution is to exclude the # character from the matchable set:

prog = re.compile(r'##([^#]*)##')

Protip: Use raw strings (e.g. r'') for regular expressions so you don't have to go crazy with backslash escapes.
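
A short illustration of the raw-string tip. The pattern and the sample text here are hypothetical, chosen only because they contain backslash escapes; they are not part of the original question.

```python
import re

# Without a raw string, each regex backslash must be doubled.
escaped = '\\d+\\s*#'
# With a raw string, the pattern is written the way the regex engine reads it.
raw = r'\d+\s*#'

same = (escaped == raw)
print(same)

found = re.search(raw, 'item 42 # comment').group(0)
print(found)
```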

Trying to allow # inside the hashes will make things much more complicated.

EDIT: If you do not want to allow blank inner text (i.e. "####" shouldn't match with an inner text of ""), then change it to:

prog = re.compile(r'##([^#]+)##')

+ means "one or more."
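
The difference between the two variants can be sketched as follows. Both patterns come from this answer; the variable names and the "####" test input are illustrative.

```python
import re

allow_empty = re.compile(r'##([^#]*)##')   # inner text may be empty
require_text = re.compile(r'##([^#]+)##')  # inner text needs one or more characters

line = 'lala ###hey## there'
star_capture = allow_empty.search(line).group(1)
plus_capture = require_text.search(line).group(1)
print(star_capture)
print(plus_capture)

blank = '####'
empty_match = allow_empty.search(blank)  # matches, with an empty capture
no_match = require_text.search(blank)    # no match at all
print(repr(empty_match.group(1)))
print(no_match)
```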

Answer 3

'^#{2,}([^#]*)#{2,}' -- two or more # on either end. Note that the leading ^ anchors the match to the start of the line; drop it to find the hashes mid-line, as in the question's example.

Be careful with quantifiers like (.*?) and (.*), because . also matches #: the lazy form captures '#hey' from '###hey##', and the greedy form would match '##abc#####' and capture 'abc###'. Lazy quantifiers can also be slower than a negated character class such as [^#]*.
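
A quick comparison of the lazy, greedy, and negated-class forms on '##abc#####', plus the effect of the ^ anchor on the question's sample line. This is a sketch using the standard re module; the variable names are illustrative.

```python
import re

sample = '##abc#####'

lazy = re.search(r'##(.*?)##', sample).group(1)
greedy = re.search(r'##(.*)##', sample).group(1)
negated = re.search(r'#{2,}([^#]*)#{2,}', sample).group(1)
print(lazy)
print(greedy)
print(negated)

line = 'lala ###hey## there'
# The leading ^ only permits a match at the start of the string.
anchored = re.search(r'^#{2,}([^#]*)#{2,}', line)
unanchored = re.search(r'#{2,}([^#]*)#{2,}', line)
print(anchored)
print(unanchored.group(1))
```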

Answer 4

Try the "block comment trick": /##((?:[^#]|#[^#])+?)##/ (in Python, r'##((?:[^#]|#[^#])+?)##')
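
A sketch of this pattern in Python, to show what it allows. The inner text may contain a single # as long as it is followed by a non-hash character; the sample strings are illustrative.

```python
import re

# Inner text is built from units that are either a non-hash character
# or a single hash followed by a non-hash character.
prog = re.compile(r'##((?:[^#]|#[^#])+?)##')

with_inner_hash = prog.search('x ##a#b## y').group(1)
print(with_inner_hash)

# On the question's sample, the third hash is treated as part of the inner text.
from_question = prog.search('lala ###hey## there').group(1)
print(from_question)
```

So this form suits input where a lone # inside the text is legitimate, but by itself it does not strip the extra leading # from the question's example.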

Answer 5

Add + to the regex; it means match one or more of the preceding character.

import re

pattern = '#+(.*?)#+'
prog = re.compile(pattern)

string = '###HEY##'
result = prog.search(string)
print(result.group(1))

Output:

HEY
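
One caveat worth checking: #+ accepts a single hash, so this pattern also matches text wrapped in just one # per side. A small sketch follows; the '#HEY#' input and the stricter #{2,} variant are added here for comparison and are not from this answer.

```python
import re

prog = re.compile(r'#+(.*?)#+')

multi = prog.search('###HEY##').group(1)
print(multi)

# A single hash on each side is enough for this pattern.
single = prog.search('#HEY#').group(1)
print(single)

# Requiring two or more hashes on each side rejects that input.
strict = re.search(r'#{2,}(.*?)#{2,}', '#HEY#')
print(strict)
```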

Answer 6

Have you considered doing it the non-regex way?

>>> string='lala ####hey## there'
>>> string.split("####")[1].split("#")[0]
'hey'
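
The split above depends on exactly four leading hashes. A slightly more general non-regex sketch is shown below; between_hashes is a hypothetical helper name, and it assumes only the first ##...## pair on a line matters.

```python
def between_hashes(line):
    """Return the text after the first run of 2+ hashes, up to the next '##', or None."""
    start = line.find('##')
    if start == -1:
        return None
    # Skip the whole opening run of hashes, however long it is.
    i = start
    while i < len(line) and line[i] == '#':
        i += 1
    end = line.find('##', i)
    if end == -1:
        return None
    return line[i:end]


print(between_hashes('lala ####hey## there'))
print(between_hashes('lala ###hey## there'))
print(between_hashes('no hashes here'))
```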

Answer 7

>>> import re
>>> text= 'lala ###hey## there'
>>> matcher= re.compile(r"##[^#]+##")
>>> print(matcher.sub("FOUND", text))
lala #FOUND there
>>>
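
Since the question mentions scanning a file line by line, here is a hedged sketch of collecting every captured string with a similar pattern. io.StringIO stands in for a real open file so the example is self-contained, and the sample lines are made up.

```python
import io
import re

matcher = re.compile(r'##([^#]+)##')

# Stand-in for an open file; the contents are illustrative only.
fake_file = io.StringIO(u'lala ###hey## there\nnothing here\n##a## and ##b##\n')

found = []
for line in fake_file:
    # findall returns the captured group for every non-overlapping match.
    found.extend(matcher.findall(line))

print(found)
```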


Answer 1
3, accepted, 2010-10-23 01:17:59

Answer 2
3, 2010-10-23 02:56:40

Answer 3
1, 2010-10-23 01:17:25

Answer 4
0, 2010-10-23 01:19:33

Answer 5
0, 2010-10-23 01:21:35

Answer 6
0, 2010-10-23 01:45:00

Answer 7
0, 2010-10-24 13:13:17
