
Creating subdirectories and sorting files based on filename in Python

I have a large directory containing many part files and their revisions. I want to recursively create a new folder for each part and then move all of the related files into that folder. I am trying to do this by isolating a 7-digit number that identifies the part; all of the related filenames also include this number.

import os
import shutil
import csv
import glob
from fnmatch import fnmatch, filter
from os.path import isdir, join
from shutil import copytree, copy2, Error, copystat
from shutil import copytree, ignore_patterns


dirname = ' '

# pattern =  '*???????*'

for root, dirs, files in os.walk(dirname):
    for fpath in files:
        print(fpath)
        if fpath[0:6].isdigit():
            matchdir = os.mkdir(os.path.join(os.path.dirname(fpath)))
            partnum = str(fpath[0:6])
            pattern = str(partnum)
            filematch = fnmatch(files, pattern)
            print(filematch)
            shutil.move(filematch, matchdir)

This is what I have so far. Basically, I'm not sure how to get the original filename and use it as the matching pattern for the rest of the files. The original filename I want to use for this matching pattern is just a 7-digit number, and the related files may contain other characters ("REV-2", for example).

Don't overthink it

I think you're getting confused about what os.walk() gives you; recheck the docs. dirs and files are just lists of the names of the directories and files inside root, not full paths.
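
To make that concrete, here is a minimal sketch. It builds a throwaway directory (the names sub, a.txt and b.txt are made up for illustration, as is the helper list_full_paths) and shows that each entry in files is a bare name that has to be joined with root before it can be used as a path.

```python
import os
import tempfile

def list_full_paths(top):
    """Return the full path of every file under top, sorted."""
    paths = []
    for root, dirs, files in os.walk(top):
        for name in files:
            # name is only something like 'a.txt'; join it with root
            # to get a path that other functions can actually open.
            paths.append(os.path.join(root, name))
    return sorted(paths)

# Hypothetical layout, created in a temporary directory for the demo.
with tempfile.TemporaryDirectory() as top:
    os.mkdir(os.path.join(top, 'sub'))
    for rel in ('a.txt', os.path.join('sub', 'b.txt')):
        open(os.path.join(top, rel), 'w').close()
    relative = [os.path.relpath(p, top) for p in list_full_paths(top)]
    print(relative)
```

Passing a bare name from files straight to shutil.move() or os.mkdir() makes Python resolve it against the current working directory, which is usually not the directory being walked.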

Here's my suggestion. Assuming that you're starting with a directory layout something like:

directory1
    1234567abc.txt
1234567abc.txt
1234567bcd.txt
2234567abc.txt
not-interesting.txt

And want to end with something like:

directory1
    1234567
        abc.txt
1234567
    abc.txt
    bcd.txt
2234567
    abc.txt
not-interesting.txt

If that's correct, then there's no need to rematch the files in the directory. Just operate on each file individually, and create the part directory only if it doesn't already exist. I would also use a regular expression to do this, so something like:

import os
import re
import shutil

# dirname is the top-level directory to sort, as in the question.
for root, dirs, files in os.walk(dirname):
    for fname in files:
        # Match a string starting with 7 digits followed by everything else.
        # Capture each part in a group so we can access them later.
        match_object = re.match(r'([0-9]{7})(.*)$', fname)
        if match_object is None:
            # The regular expression did not match, ignore the file.
            continue

        # Form the new directory path using the number from the regular expression and the current root.
        new_dir = os.path.join(root, match_object.group(1))
        if not os.path.isdir(new_dir):
            os.mkdir(new_dir)

        # This strips the part number from the filename (abc.txt in the layout above).
        new_file_path = os.path.join(new_dir, match_object.group(2))

        # Or, if you don't want to change the filename, use this line instead:
        # new_file_path = os.path.join(new_dir, fname)

        old_file_path = os.path.join(root, fname)
        shutil.move(old_file_path, new_file_path)
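
To see what the two capture groups hold, here is a small sketch that applies the same pattern to a few file names. The helper split_part and the sample names are hypothetical, chosen only to illustrate a plain part file, a revision file, and a file that should be ignored.

```python
import re

# Same pattern as above: 7 leading digits, then the rest of the name.
PART_RE = re.compile(r'([0-9]{7})(.*)$')

def split_part(fname):
    """Return (part_number, remainder), or None without a 7-digit prefix."""
    match_object = PART_RE.match(fname)
    if match_object is None:
        return None
    return match_object.group(1), match_object.group(2)

# Hypothetical file names for illustration.
for name in ('1234567abc.txt', '1234567 (REV-2).pdf', 'not-interesting.txt'):
    print(name, '->', split_part(name))
```

Because re.match() only anchors at the start of the string, a name that begins with eight or more digits also matches, with the extra digits landing in the second group. If that matters for your part numbers, the pattern would need tightening.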

Note that I have:

  • Switched the sense of the condition: we continue the loop immediately if the file is not interesting. This is a useful pattern for keeping your code from getting too heavily indented.
  • Changed the name of fpath to fname, because it's not a path but just the name of the file.
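
If you want to try this on a scratch copy first, the loop can be wrapped in a function, roughly as sketched below. The function name sort_parts, the returned list, and the "already sorted" check are my own additions rather than part of the code above. The sketch keeps the original file names and assumes that no file is named with exactly the 7 digits and nothing else.

```python
import os
import re
import shutil
import tempfile

PART_RE = re.compile(r'[0-9]{7}')  # 7 leading digits identify the part

def sort_parts(top):
    """Move part files under top into per-part folders; return the new paths."""
    moved = []
    for root, dirs, files in os.walk(top):
        for fname in files:
            match_object = PART_RE.match(fname)
            if match_object is None:
                continue  # guard clause: not a part file
            part = match_object.group(0)
            if os.path.basename(root) == part:
                continue  # assumption: already sorted by an earlier run
            new_dir = os.path.join(root, part)
            os.makedirs(new_dir, exist_ok=True)
            new_path = os.path.join(new_dir, fname)  # keep the original name
            shutil.move(os.path.join(root, fname), new_path)
            moved.append(new_path)
    return moved

# Demo on a throwaway directory with made-up file names.
with tempfile.TemporaryDirectory() as top:
    for name in ('1234567abc.txt', '1234567bcd.txt', 'not-interesting.txt'):
        open(os.path.join(top, name), 'w').close()
    first_run = sorted(os.path.relpath(p, top) for p in sort_parts(top))
    second_run = sort_parts(top)
    print(first_run)
    print(second_run)
```

The second call moves nothing, because every part file already sits in a folder named after its part number. Without that check, a repeat run would nest the files one level deeper each time.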

Please clarify the question if that's not what you meant!

[edit] to show how to move the file without changing its name.
