
fnmatch and recursive path match with `**`

Is there any built-in or straightforward way to match paths recursively with a double asterisk, e.g. like zsh does?

For example, with

path = 'foo/bar/ham/spam/eggs.py'

I can use fnmatch to test it with

fnmatch(path, 'foo/bar/ham/*/*.py')

Although, I would like to be able to do:

fnmatch(path, 'foo/**/*.py')

I know that fnmatch maps its pattern to a regex, so in the worst case I can roll my own fnmatch with an additional ** pattern, but maybe there is an easier way.

If you look closely at the fnmatch source code, you will see that it internally converts the pattern to a regular expression, mapping * to .* (and not [^/]* or similar). It therefore ignores directory separators (/) entirely, unlike UNIX shells:

while i < n:
    c = pat[i]
    i = i+1
    if c == '*':
        res = res + '.*'
    elif c == '?':
        res = res + '.'
    elif c == '[':
        ...

Thus

>>> fnmatch.fnmatch('a/b/d/c', 'a/*/c')
True
>>> fnmatch.fnmatch('a/b/d/c', 'a/*************c')
True
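To illustrate the "roll your own" route mentioned in the question, here is a minimal sketch of a separator-aware translator. The names translate_globstar and path_fnmatch are hypothetical (they are not part of fnmatch). The sketch assumes "/" as the separator, treats "**/" as "zero or more directories", and leaves out [...] character classes for brevity:

```python
import re


def translate_globstar(pat):
    """Translate a glob pattern to a regex in which '*' and '?' stop at '/'
    while '**' may cross directory boundaries (hypothetical helper)."""
    i, n = 0, len(pat)
    res = ''
    while i < n:
        c = pat[i]
        i += 1
        if c == '*':
            if i < n and pat[i] == '*':
                i += 1
                if i < n and pat[i] == '/':
                    # '**/' matches zero or more whole directory components
                    i += 1
                    res += '(?:.*/)?'
                else:
                    # a bare '**' matches anything, including '/'
                    res += '.*'
            else:
                # a single '*' stays inside one path component
                res += '[^/]*'
        elif c == '?':
            res += '[^/]'
        else:
            res += re.escape(c)
    return '(?s:' + res + r')\Z'


def path_fnmatch(path, pat):
    """Return True if path matches the separator-aware pattern."""
    return re.match(translate_globstar(pat), path) is not None


print(path_fnmatch('foo/bar/ham/spam/eggs.py', 'foo/**/*.py'))
print(path_fnmatch('a/b/d/c', 'a/*/c'))
```

With this translation a single * no longer swallows a/b/d/c the way fnmatch.fnmatch does, and ** is the only construct that spans several directories.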

If you can live without using an os.walk loop, try:

glob2

formic

I personally use glob2:

import glob2
files = glob2.glob(r'C:\Users\**\iTunes\**\*.mp4')

Addendum:

As of Python 3.5, the native glob module supports recursive pattern matching:

import glob
files = glob.iglob(r'C:\Users\**\iTunes\**\*.mp4', recursive=True) 
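As a small self-contained check of what recursive=True changes, the sketch below builds a throwaway directory tree (the file names top.py and eggs.py are made up for the example) and runs the same pattern with and without the flag:

```python
import glob
import os
import tempfile

with tempfile.TemporaryDirectory() as root:
    # Build root/foo/top.py and root/foo/bar/ham/eggs.py
    nested = os.path.join(root, 'foo', 'bar', 'ham')
    os.makedirs(nested)
    for p in (os.path.join(root, 'foo', 'top.py'),
              os.path.join(nested, 'eggs.py')):
        open(p, 'w').close()

    # glob.escape guards against special characters in the temp dir name
    pattern = os.path.join(glob.escape(root), 'foo', '**', '*.py')

    # With recursive=True, '**' matches zero or more directories
    recursive = sorted(os.path.relpath(p, root)
                       for p in glob.glob(pattern, recursive=True))
    # Without it, '**' behaves like a single '*'
    flat = sorted(os.path.relpath(p, root) for p in glob.glob(pattern))

print(recursive)
print(flat)
```

Note that glob always walks the filesystem, so unlike fnmatch it cannot test a path string that does not exist on disk. glob.iglob takes the same arguments but returns a lazy iterator instead of a list.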

For an fnmatch variant that works on paths, you can use a library called wcmatch which implements a globmatch function that matches a path with the same logic that glob crawls a filesystem with. You can control the enabled features with flags, in this case, we enable GLOBSTAR (using ** for recursive directory search).

>>> from wcmatch import glob
>>> glob.globmatch('some/file/path/filename.txt', 'some/**/*.txt', flags=glob.GLOBSTAR)
True

This snippet adds support for ** to fnmatch:

import re
from functools import lru_cache
from fnmatch import translate as fnmatch_translate


@lru_cache(maxsize=256, typed=True)
def _compile_fnmatch(pat):
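    # fixes fnmatch for recursive ** (for compatibility with Path.glob);
    # note: relies on the exact regex text produced by fnmatch.translate,
    # which is an implementation detail and may differ between Python versions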
    pat = fnmatch_translate(pat)
    pat = pat.replace('(?s:.*.*/', '(?s:(^|.*/)')
    pat = pat.replace('/.*.*/', '.*/')
    return re.compile(pat).match


def fnmatch(name, pat):
    return _compile_fnmatch(str(pat))(str(name)) is not None
