Regular expression: group function
I want to extract newsletter_ and _mon_gallery from the phrase
002c2833d0-newsletter_20131028_mon_gallery
I tried with ([^\\d-_]+){3,}
When I check it on http://www.regexpal.com/ , it visually separates the two entities newsletter_ and _mon_gallery.
But the problem is that I am not able to retrieve the matched values with the group function.
import re
string = '002c2833d0-newsletter_20131028_mon_enamour'
# Raw string, so the backslash in \d reaches the regex engine unchanged
m = re.search(r'([^\d-]+){3,}', string)
print(m.group())
I just get
newsletter_
re.search() is designed to return only the first match. You want
m = re.findall(r'[^\d-]{3,}', string)
Note that I've edited your regex to remove the nested quantifiers (can you say "catastrophic backtracking"?) and the unnecessary (and harmful if repeated) capturing group.
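To make the difference between the calls concrete, here is a minimal sketch using the sample string from the question. The variable names (text, pattern, first_match, all_matches, spans) are illustrative only, and re.finditer() is shown as an optional extra for when the match positions matter.

```python
import re

# Sample string taken from the question
text = '002c2833d0-newsletter_20131028_mon_enamour'
# Runs of three or more characters that are neither digits nor hyphens
pattern = re.compile(r'[^\d-]{3,}')

# search() stops at the first match
first_match = pattern.search(text).group()

# findall() returns every non-overlapping match as a list of strings
all_matches = pattern.findall(text)

# finditer() yields match objects, so start/end positions are available too
spans = [(m.start(), m.end(), m.group()) for m in pattern.finditer(text)]

print(first_match)
print(all_matches)
print(spans)
```

search() is enough when only the first token is needed; findall() is the shorter route when all of them are wanted as plain strings.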
You can use findall, but you have to slightly change the regular expression from this:
([^\d-]+){3,}
to this:
([^\d-]{3,})
(In general, there's no need to have both + and {3,} together, as the latter implies the former.)
>>> re.findall('[^\d-]{3,}', string)
['newsletter_', '_mon_enamour']
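As a rough illustration of why the position of the quantifier matters, the sketch below runs findall with the quantifier outside the group, inside the group, and with no group at all. The variable names are made up for this example. When a pattern contains exactly one capturing group, findall returns that group's text rather than the whole match, and a repeated group keeps only what its last repetition captured.

```python
import re

text = '002c2833d0-newsletter_20131028_mon_enamour'

# Quantifier outside the group: group 1 is overwritten on every repetition,
# so findall() reports only the last fragment captured in each match
repeated_group = re.findall(r'([^\d-]+){3,}', text)

# Quantifier inside the group: group 1 spans the whole run
whole_group = re.findall(r'([^\d-]{3,})', text)

# No group at all: findall() returns the full matches
no_group = re.findall(r'[^\d-]{3,}', text)

print(repeated_group)
print(whole_group)
print(no_group)
```

The last two calls give the same list, so the parentheses are only worth keeping if the group is needed for something else.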