Regular expressions: the group function

Question

I want to extract newsletter_ and _mon_gallery from the phrase:

002c2833d0-newsletter_20131028_mon_gallery

I tried with ([^\\d-_]+){3,}

002c2833d0-newsletter_20131028_mon_gallery

I can check it at http://www.regexpal.com/, where it visually separates the two entities newsletter_ and _mon_gallery.

But the problem is that I am not able to retrieve the matched values from the group function.

import re
string = '002c2833d0-newsletter_20131028_mon_enamour'
# raw string so that \d reaches the regex engine unchanged
m = re.search(r'([^\d-]+){3,}', string)
print(m.group())

I just get

newsletter_

Answer 1

re.search() is designed to return only the first match. You want:

m = re.findall(r'[^\d-]{3,}', string)

Note that I've edited your regex to remove the nested quantifiers (can you say "catastrophic backtracking"?) and the unnecessary (and harmful if repeated) capturing group.
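To make the difference concrete, here is a minimal sketch using the sample string from the question. The variable names (pattern, first, all_matches, spans) are only illustrative, and re.finditer is shown as an optional extra for when match positions are needed as well; it is not part of the original answer.

```python
import re

# Sample string taken from the question.
string = '002c2833d0-newsletter_20131028_mon_enamour'
pattern = r'[^\d-]{3,}'  # runs of 3+ characters that are neither digits nor hyphens

# re.search stops at the first match, so only one value is available.
first = re.search(pattern, string)
print(first.group())

# re.findall collects every non-overlapping match as a list of strings.
all_matches = re.findall(pattern, string)
print(all_matches)

# re.finditer yields match objects, useful when positions are needed too.
spans = [(m.group(), m.span()) for m in re.finditer(pattern, string)]
print(spans)
```

If only the matched text matters, findall is the shortest route; finditer is worth the extra line when the start and end offsets are also of interest.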

Answer 2

You can use findall, but you have to slightly change the regular expression from this:

([^\d-]+){3,}

to this:

([^\d-]{3,})

(In general, there's no need to have both + and {3,} together, as the latter implies the former.)

>>> re.findall(r'[^\d-]{3,}', string)
['newsletter_', '_mon_enamour']
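The reason the placement of the quantifier matters can be sketched as follows. When a capturing group is itself repeated, Python's re module keeps only the text of the group's last repetition, so group(1) no longer holds the whole run. The variable names below are illustrative, and the exact text of the last repetition depends on how the engine backtracks, so it is printed rather than stated.

```python
import re

string = '002c2833d0-newsletter_20131028_mon_enamour'

# Quantifier outside the group: the group is re-captured on every repetition,
# so group(1) ends up holding only the last repetition, not the whole run.
repeated = re.search(r'([^\d-]+){3,}', string)
print(repeated.group(0))  # whole match
print(repeated.group(1))  # last repetition of the group only

# Quantifier inside the group: the group spans the whole run.
single = re.search(r'([^\d-]{3,})', string)
print(single.group(0))
print(single.group(1))

# With exactly one capturing group, findall returns the group's text,
# which is why the two patterns give different lists.
repeated_all = re.findall(r'([^\d-]+){3,}', string)
single_all = re.findall(r'([^\d-]{3,})', string)
print(repeated_all)
print(single_all)
```

Dropping the parentheses altogether, as in the findall call above, avoids the question entirely because findall then returns the whole match.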

Answer 1
2, accepted, 2015-11-19 11:33:38

Answer 2
2, 2015-11-19 11:33:50