Concatenating files of different directories to one file (Python)

I have managed to concatenate every .txt file in one directory into a single file with this code:

import os
import glob

folder_path = "/Users/EnronSpam/enron1/ham"
for filename in glob.glob(os.path.join(folder_path, '*.txt')):
    with open(filename, 'r', encoding="latin-1") as f:
        text = f.read()
        with open('new.txt', 'a') as a:
            a.write(text)

But my 'EnronSpam' folder actually contains multiple directories (enron1 to enron6), each of which has a ham directory. How can I go through each of these directories and add every file in it to one file?

If you just want to collect all the txt files from the enron[1-6]/ham folders, try this:

glob.glob("/Users/EnronSpam/enron[1-6]/ham/*.txt")

This picks up every txt file in the ham subfolder of each enron[1-6] folder. In a glob pattern, [1-6] matches a single character in the range 1 to 6.
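To see which paths such a pattern accepts without touching the file system, here is a small sketch using fnmatch, the module glob relies on for wildcard matching. The candidate file names are made up for illustration. One caveat: fnmatch lets * match "/" as well, whereas glob.glob() does not cross directory boundaries, so this only illustrates the [1-6] part.

```python
import fnmatch

pattern = "/Users/EnronSpam/enron[1-6]/ham/*.txt"

# Hypothetical paths, used only to illustrate the matching rules.
candidates = [
    "/Users/EnronSpam/enron1/ham/0001.txt",   # matches: 1 is in [1-6]
    "/Users/EnronSpam/enron6/ham/0002.txt",   # matches: 6 is in [1-6]
    "/Users/EnronSpam/enron7/ham/0003.txt",   # no match: 7 is outside the range
    "/Users/EnronSpam/enron12/ham/0004.txt",  # no match: [1-6] is a single character
    "/Users/EnronSpam/enron1/spam/0005.txt",  # no match: not a ham folder
]

# fnmatchcase compares case-sensitively on every platform.
matched = [p for p in candidates if fnmatch.fnmatchcase(p, pattern)]
for path in matched:
    print(path)
```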

A slightly reworked version of the original code looks like this:

import glob

glob_path = "/Users/EnronSpam/enron[1-6]/ham/*.txt"
with open("new.txt", "w") as a:
    for filename in glob.glob(glob_path):
        with open(filename, "r", encoding="latin-1") as f:
            a.write(f.read())

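Instead of reopening the new file in append mode for every input file, it makes more sense to open it once at the beginning and write the contents of the ham txt files into it. Opening it with "w" also means that running the script again overwrites new.txt rather than appending a second copy.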
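The same idea can be wrapped in a function. This is only a sketch: the name concatenate and its parameters are my own, and it makes two assumptions beyond the snippet above, namely that you want the files in sorted order (glob does not guarantee sorted results) and that the output should use the same latin-1 encoding as the input. The demo runs on a throwaway directory tree with made-up file names, so it works without the Enron data.

```python
import glob
import os
import tempfile


def concatenate(pattern, out_path, encoding="latin-1"):
    """Write the contents of every file matching `pattern` into `out_path`.

    Returns the list of input files in the order they were written.
    """
    # glob.glob() does not guarantee sorted results, so sort for a stable order.
    filenames = sorted(glob.glob(pattern))
    # Same encoding for reading and writing, so every character round-trips.
    with open(out_path, "w", encoding=encoding) as out:
        for filename in filenames:
            with open(filename, "r", encoding=encoding) as f:
                out.write(f.read())
    return filenames


# Demo on a temporary directory tree (hypothetical file names and contents).
with tempfile.TemporaryDirectory() as root:
    for folder, name, body in [("enron1", "a.txt", "first\n"),
                               ("enron2", "b.txt", "second\n"),
                               ("enron7", "c.txt", "ignored\n")]:
        ham = os.path.join(root, folder, "ham")
        os.makedirs(ham)
        with open(os.path.join(ham, name), "w", encoding="latin-1") as f:
            f.write(body)

    out_path = os.path.join(root, "new.txt")
    # glob.escape() protects against special characters in the temp path.
    pattern = os.path.join(glob.escape(root), "enron[1-6]", "ham", "*.txt")
    used = concatenate(pattern, out_path)
    with open(out_path, encoding="latin-1") as f:
        result = f.read()

print(result)
```

For the real data, the call would be concatenate("/Users/EnronSpam/enron[1-6]/ham/*.txt", "new.txt").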

Given that the number and the names of the directories are known, you can simply put the full paths in a list and run the same code for each element (the list below shows the first three folders; add enron4 to enron6 the same way):

import os
import glob

folder_list = ["/Users/EnronSpam/enron1/ham", "/Users/EnronSpam/enron2/ham", "/Users/EnronSpam/enron3/ham"]
for folder in folder_list:
    for filename in glob.glob(os.path.join(folder, '*.txt')):
        with open(filename, 'r', encoding="latin-1") as f:
            text = f.read()
            with open('new.txt', 'a') as a:
                a.write(text)
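Since the folder names differ only by their number, the list can also be generated rather than typed out. A minimal sketch, assuming the folders are named enron1 to enron6 as described in the question (the variable name base is my own):

```python
import os

base = "/Users/EnronSpam"

# range(1, 7) yields 1..6, giving enron1/ham through enron6/ham.
folder_list = [os.path.join(base, f"enron{i}", "ham") for i in range(1, 7)]

for folder in folder_list:
    print(folder)
```

The resulting folder_list can be used in place of the hand-written list in the loop above.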
