简体   繁体   中英

Segregate files based on filename

I've got a directory containing multiple images, and I need to separate them into two folders based on a portion of the file name. Here's a sample of the file names:

  1. 22DEC16 7603 520981127600_03.jpg
  2. 13NOV16 2302 999230157801_07.jpg
  3. 08JAN14 7603 811108236510_02.jpg
  4. 21OCT15 2302 197661710099_07.jpg
  5. 07MAR17 2302 551529900521_01.jpg
  6. 19FEB17 3211 074174309177_09.jpg
  7. 19FEB17 3211 881209232440_02.jpg
  8. 19FEB17 2302 491000265198_04.jpg

I need to move the files into two folders according to the numbers in bold after the date - so files containing 2302 and 3211 would go into an existing folder named "panchromatic" and files with 7603 would go into another folder named "sepia".

I've tried multiple examples from other questions, and none seem to fit this problem. I'm very new to Python, so I'm not sure what example to post. Any help would be greatly appreciated.

Without giving you the solution, here's what I'd recommend.

  1. Use os.listdir to iterate over files in your directory.

     path = '/path/to/dir/' for file in os.listdir(path): ... 
  2. Check the 4 digits by slicing your string. By the looks of it, you'd need to get file[6:10]

  3. Check if int(file[6:10]) in {2302, 2311} . If yes, dst = /path/to/panchromatic . Else, dst = /path/to/sepia/

  4. Use shutil.move to move files. Something like shutil.move(os.path.join(path, file), dst) , where os.path.join joins path artefacts.

  5. Make sure you import os and import shutil at the top of your script.

You can do this the easy way or the hard way.

Easy way

Test if your filename contains the substring you're looking for.

import os
import shutil
files = os.listdir('.')
for f in files:
    # skip non-jpeg files
    if not f.endswith('.jpg'):
        continue
    # move if panchromatic
    if '2302' in f or '3211' in f:
        shutil.move(f, os.path.join('panchromatic', f))
    # move if sepia
    elif '7603' in f:
        shutil.move(f, os.path.join('sepia', f))
    # notify if something else
    else:
        print('Could not categorize file with name %s' % f)

This solution in its current form is susceptible to mis-classification, as the number we're looking for can appear by chance later in the string. I'll leave you to find ways to mitigate this.

Hard way

Regular expressions. Match the four letter digits after the date with a regular expression. Left for you to explore!

Self explanative, with Python 3, or Python 2 + backport pathlib :

import pathlib
import shutil

# Directory paths. Tailor this to your files layout
# see https://docs.python.org/3/library/pathlib.html#module-pathlib
source_dir = pathlib.Path('.')
sepia_dir = source_dir / 'sepia'
panchro_dir = source_dir / 'panchromatic'

assert sepia_dir.is_dir()
assert panchro_dir.is_dir()

destinations = {
    ('2302', '3211'): panchro_dir,
    ('7603',): sepia_dir
}

for filename in source_dir.glob('*.jpg'):
    marker = str(filename)[7:11]
    for key, value in destinations.items():
        if marker in key:
            filepath = source_dir / filename
            shutil.move(str(filepath), str(value))

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM