Python: Using Glob to find specific sub-folder

I am looking to recursively search through a folder containing many sub-folders. Some sub-folders contain a specific folder that I want to loop through.

I am familiar with the glob.glob method to find specific files:

import glob, os
from os.path import isfile

os.chdir(pathname)  # change directory to the path of choice
# after chdir, glob returns names relative to pathname, so isfile can test them directly
files = [f for f in glob.glob("filename.filetype") if isfile(f)]

Some sub-folders in a directory have a time stamp (YYYYMMDD) as their names, and all of them contain identically named files. Some of those sub-folders contain a folder with a particular name; let's call it "A". I'm hoping to write code that will recursively search for the folder called "A" within these "specific sub-folders". Is there a way to use glob.glob to find these specific sub-folders within a directory?

I am aware of a similar question: How can I search sub-folders using glob.glob module in Python?

but this person seems to be looking for specific files, whereas I am looking for pathnames.

You can use os.walk, which walks the directory tree. Each iteration gives you a directory and its immediate subdirectories, so the test is simple.

import os
import re

# Regular expression to match YYYYMMDD timestamps (but not ones embedded in
# longer digit runs such as 2201703011).
timestamp_check = re.compile(r"(?<!\d)[12]\d{3}[01]\d[0-3]\d(?!\d)").search

# Option 1: Stop searching a subtree if pattern is found
A_list = []
for root, dirs, files in os.walk(pathname):
    if timestamp_check(os.path.basename(root)) and 'A' in dirs:
        A_list.append(os.path.join(root, 'A'))
        # inplace modification of `dirs` trims subtree search
        del dirs[:]

# Option 2: Search entire tree, even if matches found
A_list = [os.path.join(root, 'A') 
    for root, dirs, files in os.walk(pathname) 
    if timestamp_check(os.path.basename(root)) and 'A' in dirs]
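
Since the question asks specifically about glob.glob, here is a minimal sketch of a glob-based alternative, not part of the original answer. It assumes Python 3.5 or later (for `recursive=True`) and that a qualifying parent folder's name is exactly eight digits in YYYYMMDD form. The names `find_a_folders`, `top`, and `target` are hypothetical, chosen for illustration only.

```python
import glob
import os
import re

# Assumption: the parent folder name is exactly YYYYMMDD, nothing more.
TIMESTAMP = re.compile(r"[12]\d{3}[01]\d[0-3]\d")


def find_a_folders(top, target="A"):
    """Return sorted paths of `target` directories whose parent is named YYYYMMDD."""
    # With recursive=True, "**" matches zero or more directory levels.
    # glob.escape protects any glob metacharacters that occur in `top`.
    pattern = os.path.join(glob.escape(top), "**", target)
    return sorted(
        path
        for path in glob.glob(pattern, recursive=True)
        # keep directories only, and only those directly inside a timestamp folder
        if os.path.isdir(path)
        and TIMESTAMP.fullmatch(os.path.basename(os.path.dirname(path)))
    )
```

Calling `find_a_folders(pathname)` returns pathnames rather than file names. Note that `**` does not descend into hidden directories (names starting with a dot) by default, and on very large trees the os.walk approach with subtree pruning may do less work.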
