Is there a Python function to open/extract files from a series of folders in the same directory?

At the moment I have a ton of HTML files in a really inconvenient place. There is one file per folder and 2000+ folders. What I want to do is write a Python script that opens the folders and/or extracts their contents and appends it all into one file. Opening a single file is no problem, but how do I open many individual files, each in its own folder?

You could iterate through the top-level directory with os.listdir and then iterate through the files inside those folders:

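import os

# my_dir and do_something are placeholders for your own directory and function.
for entry in os.listdir(my_dir):
    # os.listdir returns bare names, so rebuild the full path before testing it
    folder = os.path.join(my_dir, entry)
    if os.path.isdir(folder):
        for name in os.listdir(folder):
            path = os.path.join(folder, name)
            if os.path.isfile(path):
                do_something(path)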
   
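The same one-level-down search can also be sketched with pathlib. This is only an illustration, assuming every HTML file sits exactly one folder below the root; the function name find_html_one_level is made up for this example and is not part of any library.

```python
from pathlib import Path


def find_html_one_level(root):
    """Return the .html files located exactly one folder below root, sorted by path."""
    root = Path(root)
    # "*/*.html" matches <root>/<any folder>/<any name>.html and nothing deeper
    return sorted(p for p in root.glob("*/*.html") if p.is_file())
```

Each returned item is a full Path, so it can be passed straight to open() without another os.path.join.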

I have written a short script that collects all HTML files (in the root directory and its subdirectories) and writes their contents into a new HTML file.

The function os.walk() generates the file names in a directory tree by walking the tree either top-down or bottom-up. For each directory it visits, it yields a tuple (dirpath, dirnames, filenames), so a single loop reaches every file and subfolder under the root directory.
Just edit the path in dirName and the file name used for datafile in the code below.

# Import module
import os

# Folder path
dirName = "D:\\New folder"  # specify your root directory name
# Specify the name of the file to which you want to append the data
datafile = open('FileName.html', 'a+')

# Get the list of all files in the directory tree at the given path
listOfFiles = list()
for (dirpath, dirnames, filenames) in os.walk(dirName):
    listOfFiles += [os.path.join(dirpath, file) for file in filenames]


# Append the contents of one file to the output file
def read_text_file(file_path):
    with open(file_path, 'r') as f:
        datafile.write(f.read())
        datafile.write('\n\n')


# Iterate through all files
for file in listOfFiles:
    # Only process files whose name ends with .html
    if file.endswith(".html"):
        read_text_file(file)

datafile.close()

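A more defensive variant of the script above might look like the following sketch. It differs from the original in a few ways that are my own assumptions rather than part of the answer: the output is opened in "w" mode (so re-running overwrites instead of appending a second copy), the output file is skipped if it lives inside the tree being walked, the encoding is assumed to be UTF-8 with undecodable bytes replaced, and the traversal is sorted so the result is deterministic. The name combine_html is hypothetical.

```python
import os


def combine_html(root, output_path, encoding="utf-8"):
    """Write the contents of every .html file under root into output_path.

    Returns the number of files that were copied.
    """
    output_abs = os.path.abspath(output_path)
    count = 0
    with open(output_path, "w", encoding=encoding) as out:
        for dirpath, dirnames, filenames in os.walk(root):
            dirnames.sort()  # sorting in place fixes the order in which folders are visited
            for name in sorted(filenames):
                if not name.lower().endswith(".html"):
                    continue
                path = os.path.join(dirpath, name)
                if os.path.abspath(path) == output_abs:
                    continue  # never copy the output file into itself
                # errors="replace" keeps one badly encoded file from aborting the run
                with open(path, "r", encoding=encoding, errors="replace") as src:
                    out.write(src.read())
                    out.write("\n\n")
                count += 1
    return count
```

A call such as combine_html("D:\\New folder", "FileName.html") would then mirror the original script, with the return value giving a quick check that all 2000+ files were found.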