简体   繁体   中英

Python: How to save files from folders and subfolders?

I have a main folder: e:\\PLUS , witch contains another 4 subfolders ( A,B,C,D ). Now, my python code saves all the html files from the main folder ( PLUS ) but doesn't save the files from the other 4 subfolders.

Can anyone update my code a little bit, so as to save the files also from the subfolders?

def check_links_for_all_files(directory_name):
    for file in os.listdir(directory_name):
        filename = str(file)
        print(filename)
        
        if filename.endswith(".html"): #or filename.endswith(".php"):
            file_path = os.path.join(directory_name, filename)
            
            check_link(file_path)
        else:
            continue

if __name__ == '__main__':
    check_links_for_all_files("e:\\Plus")

You are iterating over the main directory, but not going into sub-directories. Try using os.path.isdir to handle the sub-directories.

Could do something like this:

def check_links_for_all_files(directory_name):
    for file in os.listdir(directory_name):
        path = os.path.join(directory_name, str(file))
        if os.path.isdir(path):
            check_links_for_all_files(path)
        
        if path.endswith(".html"): #or filename.endswith(".php"):
            check_link(path)
        else:
            continue

Notice this will handle the entire directory tree and not just the first hop into the sub-directories.

You can also use the pathlib module from the python standard library.

import pathlib

def check_links_for_all_files(directory_name):
    directories = [pathlib.Path(directory_name)]
    for directory in directories:
        for file in directory.iterdir():
            if file.is_dir():
                directories.append(file)
                continue
            print(file.name)
            if file.suffix == '.html':
                check_link(file)

os.walk is very efficient to iterate over all files and subfolders, here is an example:

import os
    
def check_links_for_all_files(directory_name):
    for root, dirs, files in os.walk(directory_name):
        for file in files:
            if file.endswith(".html"):  # or file.endswith(".php"):
                file_path = os.path.join(root, file)
                check_link(file_path)
            else:
                continue

if __name__ == '__main__':
    check_links_for_all_files("/Users/hbohra/Downloads/")
import glob
import os

def check_links_for_all_files(directory_name):
  for file_path in glob.glob(
      os.path.join(directory_name, '**', '*.html'),recursive=True):
    check_link(file_path)

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM