
Checking if a string is a prefix of a possible regex match

I want to traverse a tree structure, but only those parts that match a wildcard expression, à la Python's glob, where a double asterisk means 'any number of subdirs'.

For example, say my wildcard expression is /*/foo/**/bar/. This would match /a/foo/bar/ and /b/foo/note/bar/, but not /a/bar/foo/bar/. You get the idea.

My problem is that when traversing the tree structure, I need to know whether the current dir could possibly be a prefix of a path that matches the wildcard expression. So I do want to traverse the directory /a/, but not /a/bar/, because I know that nothing below the latter will ever match the wildcard expression.

I will, of course, rewrite the wildcard expression as a regular expression.
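To make the regex idea concrete, here is a minimal sketch of one way to do the rewrite. The function name `glob_to_regex` and its `prefix` flag are my own inventions, not anything from the glob module, and I assume that `*` and `**` only ever appear as whole path components and that every path ends with a slash. For the prefix case it wraps each remaining tail of the pattern in an optional group, so a path may stop after any component.

```python
import re

def glob_to_regex(pattern, prefix=False):
    """Compile a '/'-separated wildcard pattern (hypothetical helper).

    With prefix=True the regex also accepts any path that could still be
    extended into a full match.
    """
    parts = [p for p in pattern.split('/') if p]
    body = ''
    # Build from the last component backwards so optional groups can nest.
    for part in reversed(parts):
        if part == '**':
            atom = '(?:[^/]+/)*'      # zero or more whole directories
        elif part == '*':
            atom = '[^/]+/'           # exactly one directory
        else:
            atom = re.escape(part) + '/'
        body = atom + body
        if prefix:
            body = '(?:' + body + ')?'  # the path may stop before this part
    return re.compile('/' + body)

full = glob_to_regex('/*/foo/**/bar/')
could = glob_to_regex('/*/foo/**/bar/', prefix=True)

print(bool(full.fullmatch('/b/foo/note/bar/')))  # True
print(bool(could.fullmatch('/a/')))              # True
print(bool(could.fullmatch('/a/bar/')))          # False
```

Use `fullmatch` rather than `match` here, otherwise the optional groups would let almost anything through.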

Consider the following code for starters. I assume you have each "directory" in the path and in the pattern as elements of a pair of lists:

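# traverse_children is not defined here; it stands for whatever recursion
# you use to continue the traversal.
def traverse(pattern_list, path_list):
    # Nothing left to compare: stop instead of indexing into an empty list.
    if not pattern_list or not path_list:
        return
    if pattern_list[0] == '**':
        # '**' may match zero directories: retry this directory without it.
        traverse(pattern_list[1:], path_list)
        # '**' may also absorb this directory and stay active below it.
        traverse_children(pattern_list, path_list[1:])
    elif current_matches(pattern_list[0], path_list[0]):
        traverse_children(pattern_list[1:], path_list[1:])
        # Other things you might want to do in the case of a valid prefix

def current_matches(pattern_atom, path_atom):
    # A pattern element matches the identical name, or is a wildcard.
    return pattern_atom in (path_atom, '*', '**')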
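If you want something you can run as-is, here is a rough sketch along the same lines. It is my own variation, not the code above: `could_match`, `is_match` and `walk` are made-up names, the tree is assumed to be a nested dict of directory names, and `**` is assumed to match zero or more directories. `could_match` answers the prefix question, and `walk` uses it to skip subtrees that can never match.

```python
def could_match(pattern, path):
    """True if `path` could still be extended into a full match."""
    if not path:
        return True            # nothing left that could contradict the pattern
    if not pattern:
        return False           # path is deeper than the pattern allows
    head, rest = pattern[0], pattern[1:]
    if head == '**':
        # Drop '**' (zero dirs) or let it absorb the current dir.
        return could_match(rest, path) or could_match(pattern, path[1:])
    return head in ('*', path[0]) and could_match(rest, path[1:])

def is_match(pattern, path):
    """True if `path` matches the whole pattern."""
    if not pattern:
        return not path
    head, rest = pattern[0], pattern[1:]
    if head == '**':
        return is_match(rest, path) or (bool(path) and is_match(pattern, path[1:]))
    return bool(path) and head in ('*', path[0]) and is_match(rest, path[1:])

def walk(tree, pattern, path=()):
    """Yield matching paths from a nested-dict tree, pruning dead branches."""
    for name, children in tree.items():
        child = path + (name,)
        if not could_match(pattern, child):
            continue           # nothing below this directory can match
        if is_match(pattern, child):
            yield '/' + '/'.join(child) + '/'
        yield from walk(children, pattern, child)

tree = {
    'a': {'foo': {'bar': {}}, 'bar': {'foo': {'bar': {}}}},
    'b': {'foo': {'note': {'bar': {}}}},
}
print(list(walk(tree, ['*', 'foo', '**', 'bar'])))
# ['/a/foo/bar/', '/b/foo/note/bar/']
```

Note that /a/bar/ is never descended into, which is the pruning the question asks for. The plain recursion can get slow for patterns with many `**` elements; memoizing on (pattern position, path position) would fix that if it ever matters.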
