Using a regular expression to extract a substring from a filename given by os.walk

Question

I'm basically trying to grab 3 pieces of information from this os.walk:

Is there a folder with the name unit in it? If so, I want to know the contents of the folder.
Within those contents, is there a folder name with the format: \d\d\d\d\d\d_DAY\d\d ? If so, I want to extract the first (\d\d\d\d\d\d) and save it as date.
Further within that folder tree, are there MXF files? If so, move the contents of the previous folder to: 'Users/davealterman/Desktop/Volumes/HOW_TO_OCM/RAID OCM/FS4/' + 'DATE'

I am new to coding and this has been a headache. Any help would be appreciated. I know this code doesn't make sense, but I'm a bit frustrated.


import os, glob, re, shutil 
from pathlib import Path

FS5_path = 'Users/davealterman/Desktop/Volumes/HOW_TO_OCM/RAID OCM/FS4'

home_path = '/Users/davealterman/Desktop/Volumes/HOW_TO_OCM/_FROM PRODUCTION'

os.chdir(home_path)

subList = []
i = -1
for dirs, subs, files in os.walk(home_path):

    for sub in subs:
        print(sub)
        subList.append(sub)
        i + 1
        formatRegex = re.compile(r'(\d{6})(_DAY)(\d{2})')
        mo = formatRegex.search(sub)
        mo.group()

Answer 1

Give this one a shot.

import os, glob, re, shutil 
from pathlib import Path

# absolute path, like home_path; a relative path would resolve under home_path after os.chdir
FS5_path = '/Users/davealterman/Desktop/Volumes/HOW_TO_OCM/RAID OCM/FS4'

home_path = '/Users/davealterman/Desktop/Volumes/HOW_TO_OCM/_FROM PRODUCTION'

os.chdir(home_path)

subList = []
i = -1
for dirs, subs, files in os.walk(home_path):
    # Is there a folder with the name unit in it? If so, I want to know the contents of the folder.
    
    # filter folders containing `unit`
    searching_for = 'unit'
    matched_folders = filter(lambda folder_name: searching_for in folder_name, subs)    
    for folder in matched_folders:
        print(
            os.listdir(
                os.path.join(dirs, folder)  # dirs is the directory currently being walked
            )
        )
    
    # Within those contents, is there a folder name with the format: \d\d\d\d\d\d_DAY\d\d? If so, I want to extract the first (\d\d\d\d\d\d) and save it as date.
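    date_regex = re.compile(r'(\d{6})_DAY\d{2}')

    folders_matching_regex = filter(lambda file: date_regex.fullmatch(file), subs)
    # group 1 is the six-digit date; group 0 would be the whole folder name
    dates = [date_regex.match(folder)[1] for folder in folders_matching_regex]
    if not dates:
        continue  # no dated folder at this level of the walk
    date = dates[0]
    # match MXF files regardless of extension case (.mxf / .MXF)
    mxf_regex = re.compile(r'.*\.mxf', re.IGNORECASE)
    mxf_files = filter(lambda file: mxf_regex.fullmatch(file), files)
    for file in mxf_files:
        dest_dir = os.path.join(FS5_path, date)
        os.makedirs(dest_dir, exist_ok=True)
        # file is only a name, so join it with the directory being walked
        shutil.move(os.path.join(dirs, file), os.path.join(dest_dir, file))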
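
For comparison, here is a rough sketch of the same three steps using pathlib, in case separating "find" from "move" is easier to debug. It rests on a few assumptions that are mine and not from the question: "the previous folder" is taken to mean the dated folder itself, a dated folder only counts if it sits directly inside a folder whose name contains unit, and the helper names extract_date, plan_moves and apply_moves are made up for illustration. Note that the code in the question fails because search() returns None for folders that don't match, so mo.group() raises an AttributeError; the sketch checks for None first.

```python
import re
import shutil
from pathlib import Path

# Six digits, then _DAY, then two digits; group 1 is the date part.
DATE_RE = re.compile(r'(\d{6})_DAY\d{2}')


def extract_date(folder_name):
    """Return the six-digit date from a name like '210602_DAY01', else None."""
    mo = DATE_RE.fullmatch(folder_name)
    return mo.group(1) if mo else None


def plan_moves(source_root, dest_root):
    """Return (dated_folder, destination) pairs without touching any files.

    A dated folder qualifies if it sits directly inside a folder whose name
    contains 'unit' and has at least one .mxf file somewhere below it.
    """
    moves = []
    for unit_dir in sorted(Path(source_root).rglob('*')):
        if not (unit_dir.is_dir() and 'unit' in unit_dir.name.lower()):
            continue
        for day_dir in sorted(unit_dir.iterdir()):
            date = extract_date(day_dir.name) if day_dir.is_dir() else None
            if date is None:
                continue
            has_mxf = any(
                p.is_file() and p.suffix.lower() == '.mxf'
                for p in day_dir.rglob('*')
            )
            if has_mxf:
                moves.append((day_dir, Path(dest_root) / date))
    return moves


def apply_moves(moves):
    """Move the contents of each dated folder into its destination folder."""
    for day_dir, dest in moves:
        dest.mkdir(parents=True, exist_ok=True)
        for item in day_dir.iterdir():
            shutil.move(str(item), str(dest / item.name))
```

Calling plan_moves(home_path, FS5_path) and printing the result first lets you check which folders would be moved before apply_moves changes anything on disk.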

Solution 1
0 2021-06-02 15:05:16
