How to walk through directories and subdirectories and create an archive when certain files are found
I would like to create two scripts. The first would be responsible for traversing all subdirectories of a parent folder, looking for files with the extensions "*.mp4", "*.txt" and "*.jpg". If a folder (for example testfolder) containing all three such files is found, the other script creates the archive testfolder.tar.
Here is my directory tree for testing those scripts: https://imgur.com/4cX5t5N
rootDirectory contains parentDirectory1 and parentDirectory2. The parentDirectories contain childDirectories.
Here is the code of dirScanner.py, which tries to print the extensions of the files in the subdirectories:
    import os

    rootdir = r'C:\Users\user\pythonprogram\rootDirectory'

    for directory in os.walk(rootdir):
        for subdirectory in directory:
            extensions = []
            if os.path.isfile(os.curdir):
                extensions.append(os.path.splitext(os.curdir)[-1].lower())
            print(extensions)
However, it does not work the way I expect. How should I traverse the parentDirectories and childDirectories in rootDirectory?
I would like to keep it simple, along the lines of: "Okay, I'm in this directory, the files of this directory are XXX, should I pack them or not?"
Also, this is my other script, which should be responsible for packing the files at a specified path. I'm trying to learn how to use classes, but I don't know if I understand them correctly.
    import tarfile

    class folderNeededToBePacked:
        def __init__(self, name, path):
            self.path = path
            self.name = name

        def pack(self):
            tar = tarfile.open(r"{0}/{1}.tar".format(self.path, self.name), "w")
            for file in self.path:
                tar.add(file)
            tar.close()
I'd be thankful for any tips and advice on how to achieve the goal of this task.
It's a simple, straightforward task without many complex concepts that would call for a class, so I would not use one here.
The idea is to walk through all directories (recursively) and, if a matching directory is found, pack the three files of that directory into the archive.
To walk through the directory tree, you need to fix your usage of os.walk() according to its documentation:
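    import os
    import tarfile

    tar = tarfile.open(...)  # archive path and mode are elided here
    for dirpath, dirnames, filenames in os.walk(root):
        # os.walk() yields one (dirpath, dirnames, filenames) tuple per directory
        found_files = dir_matching(root, dirpath)
        for found_file in found_files:
            tar.add(found_file)
    tar.close()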
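To see what os.walk() actually yields, the following minimal sketch builds a small throwaway tree and prints each directory with its files. The helper name list_tree and the file name clip.mp4 are made up for illustration; the directory names mirror the ones in the question.

```python
import os
import tempfile


def list_tree(root):
    """Return (relative dirpath, sorted filenames) for every directory under root."""
    result = []
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()  # sorting in place makes the traversal order deterministic
        rel = os.path.relpath(dirpath, root)
        result.append((rel, sorted(filenames)))
    return result


if __name__ == "__main__":
    # Hypothetical test tree, created in a temporary directory.
    with tempfile.TemporaryDirectory() as root:
        child = os.path.join(root, "parentDirectory1", "childDirectory1")
        os.makedirs(child)
        open(os.path.join(child, "clip.mp4"), "w").close()
        for rel, files in list_tree(root):
            print(rel, files)
```

Each iteration describes exactly one directory, which matches the "I'm in this directory, its files are XXX" way of thinking from the question.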
The function dir_matching() should return a list of the three found files (or an empty list if the directory doesn't match, i.e. at least one of the three necessary files is missing):
    import glob
    import os

    def dir_matching(root, dirpath):
        # dirpath from os.walk() already starts with root, so root is not joined again
        jpg = glob.glob(os.path.join(dirpath, '*.jpg'))
        mp4 = glob.glob(os.path.join(dirpath, '*.mp4'))
        txt = glob.glob(os.path.join(dirpath, '*.txt'))
        if jpg and mp4 and txt:
            return [jpg[0], mp4[0], txt[0]]
        else:
            return []
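If you would rather have one archive per matching folder (testfolder.tar for testfolder, as described in the question), the same pieces can be rearranged roughly as follows. This is only a sketch under a few assumptions: the helper names matching_files and pack_matching_dirs are made up, the archive is written inside the matching folder itself, and only the first file found per extension is packed.

```python
import glob
import os
import tarfile

EXTENSIONS = ("jpg", "mp4", "txt")


def matching_files(dirpath):
    """Return one file per required extension, or [] if any extension is missing."""
    found = []
    for ext in EXTENSIONS:
        # glob.escape() protects directory names that contain wildcard characters
        hits = sorted(glob.glob(os.path.join(glob.escape(dirpath), "*." + ext)))
        if not hits:
            return []
        found.append(hits[0])
    return found


def pack_matching_dirs(root):
    """Create <folder>.tar inside every matching directory under root."""
    created = []
    for dirpath, dirnames, filenames in os.walk(root):
        files = matching_files(dirpath)
        if not files:
            continue
        name = os.path.basename(os.path.normpath(dirpath))
        archive = os.path.join(dirpath, name + ".tar")
        with tarfile.open(archive, "w") as tar:
            for path in files:
                # arcname stores only the file name instead of the full path
                tar.add(path, arcname=os.path.basename(path))
        created.append(archive)
    return created
```

Calling pack_matching_dirs(rootdir) returns the paths of the archives it created, so you can print them to check which folders matched.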
Of course, you could add more sophisticated checks, e.g. whether exactly one jpg etc. is found, but that depends on your concrete specifications.
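As one possible stricter check, the sketch below tests whether a directory holds exactly one file for each of the three extensions. It works on the filenames list that os.walk() already provides; the function name exactly_one_of_each is made up, and comparing extensions case-insensitively is an assumption you may not want.

```python
import os

REQUIRED = (".jpg", ".mp4", ".txt")


def exactly_one_of_each(filenames):
    """True if the names contain exactly one file for each required extension."""
    counts = {ext: 0 for ext in REQUIRED}
    for name in filenames:
        ext = os.path.splitext(name)[1].lower()  # ".MP4" is treated like ".mp4"
        if ext in counts:
            counts[ext] += 1
    return all(n == 1 for n in counts.values())
```

Files with other extensions are ignored, so an extra README in the folder does not prevent a match.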