How to get all directories containing specific file limited by depth in Python?

What I'm trying to do

I'm trying to get a list of subdirectories, up to a depth of x, that contain a file y, given a tree such as the following:

root/
root/one/
root/one/README.md
root/two/subdir/
root/two/subdir/README.md
root/three/

In the above tree, I'm trying to identify all subdirectories up to level x that contain the file README.md.

  • If x = 1, it should return root/one/
  • If x = 2, it should return root/one/ and root/two/subdir/

What I've tried

Most answers I can find use os.walk to loop through directories and identify files. While this works, it doesn't allow for a depth limit on the recursion. Some workarounds create a depth-limited os.walk function; others use glob patterns to specify the depth, such as */*/* for a depth of 3, or a generated glob pattern. While both of these work, I'm wondering whether a cleaner approach to my problem exists.

os.walk hack

import os

def walklevel(some_dir, level=1):
    some_dir = some_dir.rstrip(os.path.sep)
    assert os.path.isdir(some_dir)
    # number of separators in the starting path, used as the depth baseline
    num_sep = some_dir.count(os.path.sep)
    for root, dirs, files in os.walk(some_dir):
        yield root, dirs, files
        num_sep_this = root.count(os.path.sep)
        if num_sep + level <= num_sep_this:
            # emptying dirs in place stops os.walk from descending further
            del dirs[:]

glob hack

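import fnmatch
import glob
import os

def fileNamesRetrieve(top, maxDepth, fnMask):
    someFiles = []
    for d in range(1, maxDepth+1):
        maxGlob = "/".join("*" * d)
        topGlob = os.path.join(top, maxGlob)
        allFiles = glob.glob(topGlob)
        someFiles.extend(f for f in allFiles if fnmatch.fnmatch(os.path.basename(f), fnMask))
    return someFiles  # paths whose base name matches fnMask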
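
Applied to the goal described above (directories rather than files), the generated-glob idea might look like the following minimal sketch. The function name dirs_with_file_glob and its parameters are made up for illustration, and it assumes the file name contains no glob metacharacters, since it is placed directly into the pattern.

```python
from pathlib import Path


def dirs_with_file_glob(top, filename, max_depth):
    """Return directories at depth 1..max_depth below top that contain filename."""
    top = Path(top)
    found = []
    for depth in range(1, max_depth + 1):
        # depth 2 with filename "README.md" gives the pattern "*/*/README.md"
        pattern = "/".join(["*"] * depth + [filename])
        for match in top.glob(pattern):
            if match.is_file():
                found.append(match.parent)
    return sorted(found)
```

One glob call is issued per depth level, so the depth limit is enforced by the pattern itself and nothing below max_depth is visited.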

sample of code

def get_subdirs(dir):
    file = "README.md"
    result = []
    # pseudocode: subdirs() stands for a hypothetical depth-limited iterator over subdirectories
    for subdir in subdirs(dir, depth=1):
        if any(f for f in subdir.files if f == file):
            result.append(subdir)
    return result

Question

How should I approach this problem?

I am sure there is a built-in way to achieve your goal, but here is a simple recursive algorithm you can try that does what you ask:

from pathlib import Path

MAXDEPTH = 3

def traverse(path, depth, loc):
    if path.is_file():
        if path.name == "README.md":     # check the filename
            # records the full path of the matching file;
            # use path.parent instead to record the containing directory
            loc.append(str(path.resolve()))
            # print(str(path))  <- you can do this if you don't need the return value
    elif path.is_dir() and depth <= MAXDEPTH:  # only descend while within the depth limit
        for item in path.iterdir():   # iterate over the directory contents
            traverse(item, depth + 1, loc)   # recurse into each entry

root = Path(".")
depth = 0
loc = []
traverse(root, depth, loc)
print(loc)

With this simple solution, if you ever want to increase the depth, all you have to do is change the MAXDEPTH variable accordingly.
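
Since the question asks for the containing directories rather than the files, a variant of the same recursive idea could take the depth limit as a parameter and yield directories. This is only a sketch: the name dirs_with_file and its signature are assumptions for illustration, the starting directory counts as depth 0, and there is no guard against symlink cycles.

```python
from pathlib import Path


def dirs_with_file(path, filename, max_depth, depth=0):
    """Yield directories at most max_depth levels below the start that contain filename."""
    path = Path(path)
    if (path / filename).is_file():
        yield path
    if depth < max_depth:
        for child in path.iterdir():
            if child.is_dir():
                # each recursive call is one level deeper than the current one
                yield from dirs_with_file(child, filename, max_depth, depth + 1)
```

For the tree in the question, list(dirs_with_file("root", "README.md", 1)) would contain only root/one, while a max_depth of 2 would also include root/two/subdir.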

Try this:

import os

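def findDirWithFileInLevel(path, file, level=1):
    # c is the number of path separators in the initial path
    c = path.count(os.sep)
    for root, dirs, files in os.walk(path):
        for name in files:
            # root.count(os.sep) - c is the depth of root below path
            # (0 for path itself, 1 for its direct subdirectories, ...)
            if name == file and root.count(os.sep) - c <= level:
                yield root

for i in findDirWithFileInLevel(os.path.join(".", "root"), "README.md", 2):
    print(i)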

The logic is similar to your os.walk hack: compare the number of path separators in each directory visited by os.walk against level, and yield that directory (root in my code) when it is within the limit.

c is the number of path separators in the initial path.
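
Note that comparing separators only filters the results; os.walk still visits the whole tree. If that matters, the pruning trick from the question's os.walk hack can be combined with the check, roughly as sketched below. The name find_dirs_pruned is hypothetical, and depth is measured with os.path.relpath instead of counting separators in the raw path, which is an assumption of this sketch rather than part of the answer above.

```python
import os


def find_dirs_pruned(top, filename, max_depth):
    """Yield directories within max_depth levels of top that contain filename."""
    for root, dirs, files in os.walk(top):
        rel = os.path.relpath(root, top)
        # top itself is depth 0, its direct subdirectories are depth 1, and so on
        depth = 0 if rel == os.curdir else rel.count(os.sep) + 1
        if filename in files:
            yield root
        if depth >= max_depth:
            # clearing dirs in place stops os.walk from descending any further
            dirs[:] = []
```

Because the dirs list is emptied once the limit is reached, directories deeper than max_depth are never read from disk.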
