Regex match items in list + trailing N numbers (Python)
I have a list of expected animals:
expectedAnimals = ['cat-', 'snake-', 'hedgehog-']
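Then I have user input (as a string) that contains some or all of the expected animals from the list above, each followed by N digits. The animals are separated by random delimiter characters (non-digits):
Example: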
inputString1 = 'cat-235##randomtext-123...snake-1,dog-2:snake-22~!cat-8844'
inputString2 = 'hedgehog-2>cat-1|snake-22#cat-2<$dog-55 snake-93242522. cat-3 .rat-2 snake-22 cat-8844'
My goal (which I am struggling to achieve) is to write a function filterAnimals that returns the following correct results:
approvedAnimals1 = filterAnimals(inputString1)
['cat-235', 'snake-1', 'snake-22', 'cat-8844']
approvedAnimals2 = filterAnimals(inputString2)
['hedgehog-2', 'cat-1', 'snake-22', 'cat-2', 'snake-93242522', 'cat-3', 'snake-22', 'cat-8844']
My current implementation partially works, but honestly I would like to rewrite it from scratch:
def filterAnimals(inputString):
    expectedAnimals = ['cat-', 'snake-', 'hedgehog-']
    start_indexes = []
    end_indexes = []
    for animal in expectedAnimals:
        temp_start_indexes = [i for i in range(len(inputString)) if inputString.startswith(animal, i)]
        if len(temp_start_indexes) > 0:
            start_indexes.append(temp_start_indexes)
            for start_ind in temp_start_indexes:
                for i in range(start_ind + len(animal), len(inputString)):
                    if inputString[i].isdigit() and i == len(inputString) - 1:
                        end_indexes.append(i + 1)
                        break
                    if not inputString[i].isdigit():
                        end_indexes.append(i)
                        break
    start_indexes_flat = [item for sublist in start_indexes for item in sublist]
    list_size = min(len(start_indexes_flat), len(end_indexes))
    approvedAnimals = []
    if list_size > 0:
        for x in range(list_size):
            approvedAnimals.append(inputString[start_indexes_flat[x]:end_indexes[x]])
    return approvedAnimals
You can build an alternation pattern from expectedAnimals and use re.findall to find all matches as a list:
import re

def filterAnimals(inputString):
    return re.findall(rf"(?:{'|'.join(expectedAnimals)})\d+", inputString)
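The one-liner above interpolates the list items straight into the pattern, which works here because 'cat-', 'snake-' and 'hedgehog-' contain no regex metacharacters that matter outside a character class. If the list might ever hold something like 'c.t-', escaping each item is safer. A minimal sketch of that variant follows; the names build_animal_pattern and ANIMAL_PATTERN are hypothetical and not part of the original answer:

```python
import re

expectedAnimals = ['cat-', 'snake-', 'hedgehog-']

def build_animal_pattern(prefixes):
    # Hypothetical helper: escape each prefix so it is matched literally,
    # then join the prefixes into one non-capturing alternation.
    alternation = '|'.join(re.escape(p) for p in prefixes)
    return re.compile(rf"(?:{alternation})\d+")

# Compile once so repeated calls do not rebuild the pattern.
ANIMAL_PATTERN = build_animal_pattern(expectedAnimals)

def filterAnimals(inputString):
    # findall returns whole matches here because the group is non-capturing.
    return ANIMAL_PATTERN.findall(inputString)
```

Calling filterAnimals(inputString1) and filterAnimals(inputString2) with the strings from the question should give the two lists the question asks for.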
import re

# Matches an expected animal followed by one or more digits.
# A raw string is used so that \d is not treated as a string escape.
pattern = re.compile(r"(cat|snake|hedgehog)-\d+")

inputString1 = 'cat-235##randomtext-123...snake-1,dog-2:snake-22~!cat-8844'
inputString2 = 'hedgehog-2>cat-1|snake-22#cat-2<$dog-55 snake-93242522. cat-3 .rat-2 snake-22 cat-8844'

animals_1 = [i.group() for i in pattern.finditer(inputString1)]
# will return ['cat-235', 'snake-1', 'snake-22', 'cat-8844']

animals_2 = [i.group() for i in pattern.finditer(inputString2)]
# will return ['hedgehog-2', 'cat-1', 'snake-22', 'cat-2', 'snake-93242522', 'cat-3', 'snake-22', 'cat-8844']
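One detail worth spelling out is why this answer uses finditer with group() rather than findall. When a pattern has a single capturing group, re.findall returns only what the group captured, so the digits would be lost. The sketch below compares the two on the first input string from the question; the variable names are chosen here for illustration only:

```python
import re

sample = 'cat-235##randomtext-123...snake-1,dog-2:snake-22~!cat-8844'

capturing = re.compile(r"(cat|snake|hedgehog)-\d+")
non_capturing = re.compile(r"(?:cat|snake|hedgehog)-\d+")

# With one capturing group, findall yields only the captured animal name.
only_names = capturing.findall(sample)

# finditer yields match objects; group() with no argument is the whole match.
full_via_finditer = [m.group() for m in capturing.finditer(sample)]

# With a non-capturing group, findall yields the whole match directly.
full_via_findall = non_capturing.findall(sample)

print(only_names)
print(full_via_finditer)
print(full_via_findall)
```

So either approach gives the full 'animal-digits' strings, as long as findall is paired with a non-capturing group.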