
Capturing multiple optional groups in a regex, both repeating and non-repeating

I have to match expressions similar to these:

STAR 13

STAR 13, 23

STAR 1, 2 and 3 and STAR 1

But I only want to capture the numbers. How many numbers appear is not specified.

I've tried STAR(?:\s*(?:,|and)\s*(#\d+))+ but it doesn't seem to capture the terms exactly. No other dependencies can be added; only the re module is available.

This is part of a much larger problem in which STAR is itself another regular expression that has already been solved. Please don't worry about it; just treat it as a letter combination and include the literal letters STAR in the regular expression.

If you don't know how many digits a number has, use r'[0-9]+' to match one or more digits. To capture every number, you can use r'(\d+)'.
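To see why a pattern built around a repeated group does not return every number, note that in Python's re module a capturing group inside a repetition keeps only its last iteration. The sketch below uses a simplified variant of the question's pattern (no `#`, and an optional separator so the first number matches); the variable names are only illustrative:

```python
import re

text = "STAR 1, 2 and 3"

# A capturing group inside a repetition keeps only its final iteration.
repeated = re.search(r"STAR(?:\s*(?:,|and)?\s*(\d+))+", text)
last_only = repeated.group(1)
print(last_only)    # 3

# findall with a plain digit pattern returns every number instead.
all_numbers = re.findall(r"\d+", text)
print(all_numbers)  # ['1', '2', '3']
```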

Do it with one regex (here a is the input string; the three results below correspond to the three sample inputs):

re.findall("STAR ([0-9]+),? ?([0-9]+)? ?a?n?d? ?([0-9]+)?",a)

[('13', '', '')]

[('13', '23', '')]

[('1', '2', '3'), ('1', '', '')]
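The single pattern above stops at three numbers per STAR. If the count is open-ended, one possible sketch (still using only re) is to match each whole STAR run first and then pull the digits out of each run. The separator set (a comma or "and") and the names STAR_RUN and numbers_per_star are assumptions made for illustration, not part of the original answer:

```python
import re

# One STAR, a first number, then any further ", n" or "and n" items.
STAR_RUN = re.compile(r"STAR\s+\d+(?:\s*(?:,|and)\s*\d+)*")

def numbers_per_star(text):
    """Return one list of digit strings for each STAR run found in text."""
    return [re.findall(r"\d+", m.group()) for m in STAR_RUN.finditer(text)]

print(numbers_per_star("STAR 13"))                     # [['13']]
print(numbers_per_star("STAR 13, 23"))                 # [['13', '23']]
print(numbers_per_star("STAR 1, 2 and 3 and STAR 1"))  # [['1', '2', '3'], ['1']]
```

Grouping per STAR keeps the information that the last 1 belongs to a second STAR, which a flat findall of digits would lose.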

It may be easier, and give a cleaner result, to do it in two steps. First, put your strings in a list like this:

import re

tab = ["STAR 13", "STAR 13, 23", "STAR 1, 2 and 3 and STAR 1"]

# Keep only the strings that start with STAR
list_star = filter(lambda x: re.match("^STAR", x), tab)
for i in list_star:
    # Print every run of digits found in the string
    print(re.findall(r'\d+', i))

['13']

['13', '23']

['1', '2', '3', '1']

After that, you just need to collect the results in a new list (initialised as my_digit = [] before the loop): my_digit += re.findall(r'\d+', i)

In one line:

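import functools
import re
tab = ["STAR 13", "STAR 13, 23", "STAR 1, 2 and 3 and STAR 1"]
# Filter the STAR strings, then concatenate the digits found in each one
digit = functools.reduce(lambda x, y: x + re.findall(r"\d+", y), filter(lambda x: re.match("^STAR ", x), tab), [])
print(digit)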

['13', '13', '23', '1', '2', '3', '1']
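The same flat list can also be built with a nested list comprehension, which some readers find easier to follow than reduce. This is only an alternative sketch; the name digits is illustrative:

```python
import re

tab = ["STAR 13", "STAR 13, 23", "STAR 1, 2 and 3 and STAR 1"]

# For every string that starts with "STAR ", yield each run of digits in it.
digits = [d for s in tab if re.match("STAR ", s) for d in re.findall(r"\d+", s)]
print(digits)  # ['13', '13', '23', '1', '2', '3', '1']
```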
