
Regular expression: group function

I want to extract newsletter_ and _mon_gallery from the phrase

002c2833d0-newsletter_20131028_mon_gallery

I tried with ([^\\d-_]+){3,}

002c2833d0-newsletter_20131028_mon_gallery

When I check it at http://www.regexpal.com/, it visually separates the two entities newsletter_ and _mon_gallery.

But the problem is that I am not able to retrieve the matched values with the group function.

import re
string = '002c2833d0-newsletter_20131028_mon_enamour'
m = re.search(r'([^\d-]+){3,}', string)
print(m.group())

I just get

newsletter_

re.search() is designed to return only the first match. You want

m = re.findall(r'[^\d-]{3,}', string)

Note that I've edited your regex to remove the nested quantifiers (can you say "catastrophic backtracking"?) and the unnecessary (and harmful if repeated) capturing group.
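To see why the repeated capturing group is harmful, here is a small sketch using the sample string from the question (the variable names are only illustrative). When a capturing group is itself repeated, Python's re keeps only the text from the group's last repetition, and findall returns the group's content rather than the whole match.

```python
import re

string = '002c2833d0-newsletter_20131028_mon_enamour'

# Original pattern: a capturing group with a quantifier applied to it.
m = re.search(r'([^\d-]+){3,}', string)
whole_match = m.group(0)   # the entire first match
last_repeat = m.group(1)   # only the last repetition of the group

# With exactly one group in the pattern, findall returns that group's
# content for each match, not the whole match.
with_group = re.findall(r'([^\d-]+){3,}', string)

# Edited pattern: no group, so findall returns the full matches.
without_group = re.findall(r'[^\d-]{3,}', string)

print(whole_match)
print(last_repeat)
print(with_group)
print(without_group)
```

Comparing the last two printed lists shows the difference between the two patterns on the same input.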

You can use findall, but you have to slightly change the regular expression from this:

([^\d-]+){3,}

to this:

([^\d-]{3,})

(In general, there's no need to use both + and {3,} together, since {3,} already means "three or more", which covers the "one or more" of +.)

>>> re.findall(r'[^\d-]{3,}', string)
['newsletter_', '_mon_enamour']
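If the positions of the matches are needed as well, re.finditer yields match objects instead of plain strings. A minimal sketch, assuming the same sample string as above (the tuple layout is just one possible choice):

```python
import re

string = '002c2833d0-newsletter_20131028_mon_enamour'

# finditer yields one match object per non-overlapping match,
# so both the matched text and its span are available.
matches = [(m.group(), m.start(), m.end())
           for m in re.finditer(r'[^\d-]{3,}', string)]

for text, start, end in matches:
    print(text, start, end)
```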
