
Regex match items in list + trailing N numbers (Python)

I have a list of expected animals:

expectedAnimals = ['cat-', 'snake-', 'hedgehog-']

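Then I have a user input (as a string) that contains some or all of the expected animals from the list above, each followed by N digits. The animals are separated by arbitrary delimiter characters (non-digits):

Examples: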

inputString1 = 'cat-235##randomtext-123...snake-1,dog-2:snake-22~!cat-8844'
inputString2 = 'hedgehog-2>cat-1|snake-22#cat-2<$dog-55 snake-93242522. cat-3 .rat-2 snake-22 cat-8844'

My goal (which I am struggling to achieve) is to write a function filterAnimals that returns the following correct results:

approvedAnimals1 = filterAnimals(inputString1)

['cat-235', 'snake-1', 'snake-22', 'cat-8844']

approvedAnimals2 = filterAnimals(inputString2)

['hedgehog-2', 'cat-1', 'snake-22', 'cat-2', 'snake-93242522', 'cat-3', 'snake-22', 'cat-8844']

My current implementation partially works, but honestly I would like to rewrite it from scratch:

def filterAnimals(inputString):
    expectedAnimals = ['cat-', 'snake-', 'hedgehog-']
    start_indexes = []
    end_indexes = []
    for animal in expectedAnimals:
        temp_start_indexes = [i for i in range(len(inputString)) if inputString.startswith(animal, i)]
        if len(temp_start_indexes) > 0:
            start_indexes.append(temp_start_indexes)
            for start_ind in temp_start_indexes:
                for i in range(start_ind + len(animal), len(inputString)):
                    if inputString[i].isdigit() and i == len(inputString) - 1:
                        end_indexes.append(i + 1)
                        break
                    if not inputString[i].isdigit():
                        end_indexes.append(i)
                        break
        start_indexes_flat = [item for sublist in start_indexes for item in sublist]
        list_size = min(len(start_indexes_flat), len(end_indexes))
        approvedAnimals = []
        if list_size > 0:
            for x in range(list_size):
                approvedAnimals.append(inputString[start_indexes_flat[x]:end_indexes[x]])
    return approvedAnimals

You can build an alternation pattern from expectedAnimals and use re.findall to return all matches as a list:

import re

def filterAnimals(inputString):
    return re.findall(rf"(?:{'|'.join(expectedAnimals)})\d+", inputString)

Demo: https://replit.com/@blhsing/OffensiveEveryWebportal
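One caveat with joining the list directly: any regex metacharacter inside an entry (such as `.` or `+`) would be read as pattern syntax rather than literal text. The following is a minimal sketch, assuming the prefixes are meant to be matched literally. The name `filter_animals_escaped` and the `expected_animals` parameter are illustrative and not part of the answer above.

```python
import re

def filter_animals_escaped(input_string, expected_animals=('cat-', 'snake-', 'hedgehog-')):
    # An empty list would produce a pattern matching any digit run, so bail out early.
    if not expected_animals:
        return []
    # Escape each prefix so characters such as '.' or '+' are matched literally.
    alternation = '|'.join(re.escape(animal) for animal in expected_animals)
    # The group is non-capturing, so findall returns the whole match.
    return re.findall(rf"(?:{alternation})\d+", input_string)

result = filter_animals_escaped('cat-235##randomtext-123...snake-1,dog-2:snake-22~!cat-8844')
print(result)  # ['cat-235', 'snake-1', 'snake-22', 'cat-8844']
```

For the three prefixes in the question the escaping changes nothing, so this only matters if the list is later extended with entries containing special characters.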

import re

# Matches an expected animal name followed by one or more digits
pattern = re.compile(r"(cat|snake|hedgehog)-\d+")

inputString1 = 'cat-235##randomtext-123...snake-1,dog-2:snake-22~!cat-8844'
inputString2 = 'hedgehog-2>cat-1|snake-22#cat-2<$dog-55 snake-93242522. cat-3 .rat-2 snake-22 cat-8844'

animals_1 = [i.group() for i in pattern.finditer(inputString1)]
# will return ['cat-235', 'snake-1', 'snake-22', 'cat-8844']

animals_2 = [i.group() for i in pattern.finditer(inputString2)]
# will return ['hedgehog-2', 'cat-1', 'snake-22', 'cat-2', 'snake-93242522', 'cat-3', 'snake-22', 'cat-8844']
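Both patterns match an animal name wherever it occurs, so an input such as 'bobcat-12' would yield 'cat-12'. The question does not say whether that is acceptable. If it is not, a sketch under the assumption that a name must not be directly preceded by a letter could use a negative lookbehind. `STRICT_PATTERN` and `filter_animals_strict` are hypothetical names introduced for illustration.

```python
import re

# Assumption: a name glued to preceding letters (e.g. 'bobcat-12') should be rejected.
# (?<![A-Za-z]) is a negative lookbehind: the match may not follow an ASCII letter.
STRICT_PATTERN = re.compile(r"(?<![A-Za-z])(?:cat|snake|hedgehog)-\d+")

def filter_animals_strict(input_string):
    # No capturing groups in the pattern, so findall returns whole matches.
    return STRICT_PATTERN.findall(input_string)

print(filter_animals_strict('bobcat-12,cat-7'))  # ['cat-7']
```

The two example inputs from the question give the same results with this stricter pattern, because every expected animal there follows a delimiter or starts the string.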

