简体   繁体   中英

How to simply walk through directories and subdirectories and create archive if found certain files

I would like to create 2 scripts. First would be responsible for traversing through all subdirectories in parent folder, looking for files with extension "*.mp4", "*.txt","*.jpg" and if folder (for example testfolder ) with such three files is found, another scripts performs operation of creating archive testfolder.tar .

Here is my directory tree for testing those scripts: https://imgur.com/4cX5t5N

rootDirectory contains parentDirectory1 and parentDirectory2 . parentDirectories contain childDirectories .

Here is code of dirScanner.py trying to print extensions of files in subdirs:

import os

rootdir = r'C:\Users\user\pythonprogram\rootDirectory'
for directory in os.walk(rootdir):
    for subdirectory in directory:
        extensions = []
        if os.path.isfile(os.curdir):
            extensions.append(os.path.splitext(os.curdir)[-1].lower())
        print(extensions)

However it absolutely does not work as I expect it to work. How should I traverse through parentDirectories and childDirectiories in rootDirectory ?

I would like to keep it simple, in the way "Okay I'm in this directory, the files of this directory are XXX, Should/Shouldn't pack them"

Also, this is my other script that should be responsible for packing files for specified path. I'm trying to learn how to use classes however I don't know if I understand it correctly.

import tarfile

class folderNeededToBePacked:
    def __init__(self, name, path):
        self.path = path
        self.name = name
    def pack(self):
        tar = tarfile.open(r"{0}/{1}.tar".format(self.path, self.name), "w")
        for file in self.path:
            tar.add(file)
        tar.close()

I'd be thankful for all tips and advices how to achieve the goal of this task.

It's a simple straight forward task without many complex concepts which would call for being implemented as a class, so I would not use one for this.

The idea is to walk through all directories (recursively) and if a matching directory is found, pack the three files of this directory into the archive.

To walk through the directory tree you need to fix your usage of 'os.walk()' according to its documentation:

tar = tarfile.open(...)
for dirpath, dirnames, filenames in os.walk(root):
  found_files = dir_matching(root, dirpath)
  for found_file in found_files:
    tar.add(found_file)
tar.close()

And the function dir_matching() should return a list of the three found files (or an empty list if the directory doesn't match, ie at least one of the three necessary files is missing):

def dir_matching(root, dirpath):
  jpg = glob.glob(os.path.join(root, dirpath, '*.jpg')
  mp4 = glob.glob(os.path.join(root, dirpath, '*.mp4')
  txt = glob.glob(os.path.join(root, dirpath, '*.txt')
  if jpg and mp4 and txt:
    return [ jpg[0], mp4[0], txt[0] ]
  else:
    return []

Of course you could add more sophisticated checks eg whether exactly one jpg etc. is found, but that depends on your concrete specifications.

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM