Read multiple folders and combine the contents of the text files in each folder into one file per folder - Python
I'm new to Python. I have hundreds of folders in the same directory, and each folder contains multiple text files. I want to combine the contents of all the text files into one file per folder.
Folder1
    text1.txt
    text2.txt
    text3.txt
    .
    .
Folder2
    text1.txt
    text2.txt
    text3.txt
    .
    .
As output, I need the contents of all the text files in a folder copied into one file: text1.txt + text2.txt + text3.txt ---> Folder1.txt
Folder1
    text1.txt
    text2.txt
    text3.txt
    Folder1.txt
Folder2
    text1.txt
    text2.txt
    text3.txt
    Folder2.txt
I have the code below, which just lists the text files.
import os
for path, subdirs, files in os.walk('./data'):
    for filename in files:
        if filename.endswith('.txt'):
            print(os.path.join(path, filename))
Please help me with how to proceed on this task. Thank you.
Breaking the problem down, we first need a way to merge all the text files in one folder into a single file named after that folder.
We then apply this solution to every subdirectory in the base directory. I tested the code below.
Assumption: the subfolders contain only text files and no directories.
import os

# Function to merge all files in a folder
def merge_files(folder_path):
    # get all files in the folder,
    # assumption: folder has no directories and all text files
    files = os.listdir(folder_path)
    # form the file name for the new file to create
    new_file_name = os.path.basename(folder_path) + '.txt'
    new_file_path = os.path.join(folder_path, new_file_name)
    # open new file in write mode
    with open(new_file_path, 'w') as nf:
        # open files to merge in read mode
        for file in files:
            # skip the merged file itself if it is left over from an earlier run
            if file == new_file_name:
                continue
            file = os.path.join(folder_path, file)
            with open(file, 'r') as f:
                # read all lines of a file and write into new file
                lines_in_file = f.readlines()
                nf.writelines(lines_in_file)
            # insert a newline after reading each file
            nf.write("\n")

# Call function from the main folder with the subfolders
folders = os.listdir("./test")
for folder in folders:
    if os.path.isdir(os.path.join('test', folder)):
        merge_files(os.path.join('test', folder))
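If the assumption above does not hold (a subfolder also contains other file types or nested directories), the file list can be filtered before merging. The following is a minimal sketch, not part of the original answer: the function name `merge_text_files`, the sorted processing order, and the choice not to insert a separating newline between files are all assumptions made for illustration.

```python
import os


def merge_text_files(folder_path):
    """Merge the .txt files directly inside folder_path into <folder name>.txt.

    Hypothetical helper: skips subdirectories, non-.txt files, and the
    output file itself, and processes files in sorted name order.
    """
    folder_path = os.path.normpath(folder_path)
    out_name = os.path.basename(folder_path) + '.txt'
    out_path = os.path.join(folder_path, out_name)

    # Collect the inputs before the output file is opened for writing.
    inputs = []
    for name in sorted(os.listdir(folder_path)):
        path = os.path.join(folder_path, name)
        if name == out_name or not name.endswith('.txt'):
            continue
        if os.path.isfile(path):
            inputs.append(path)

    # Copy each input file into the output, one after another.
    with open(out_path, 'w') as out_file:
        for path in inputs:
            with open(path) as in_file:
                out_file.write(in_file.read())
    return out_path
```

It would be called once per subfolder, in the same way as `merge_files` above. Because the output file is excluded from the inputs, running it a second time should produce the same result rather than merging the previous output into itself.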
First you need to get all the folder names, which can be done with os.listdir(path_to_dir). Then iterate over them, and for each folder iterate over its children using the same function, concatenating the contents as described here: https://stackoverflow.com/a/13613375/13300960
Try writing it yourself, and update the question with your code if you need more help.
Edit: os.walk might not be the best solution, since you know your folder structure and two listdir calls will do the job.
import os

basepath = '/path/to/directory'  # maybe just '.'
for dir_name in os.listdir(basepath):
    dir_path = os.path.join(basepath, dir_name)
    if not os.path.isdir(dir_path):
        continue
    out_name = dir_name + '.txt'
    with open(os.path.join(dir_path, out_name), 'w') as outfile:
        for file_name in os.listdir(dir_path):
            # skip non-text files and the output file itself
            if not file_name.endswith('.txt') or file_name == out_name:
                continue
            file_path = os.path.join(dir_path, file_name)
            with open(file_path) as infile:
                for line in infile:
                    outfile.write(line)
This is not the best code, but it should get the job done, and it is the shortest.
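For comparison, the os.walk loop from the question can also be extended to do the merge. This is only a sketch under stated assumptions: the function name `merge_with_walk` is hypothetical, files are merged in sorted name order, and, unlike the two-listdir version, it also descends into nested folders and writes one merged file in each of them.

```python
import os


def merge_with_walk(base_path):
    """Write <folder name>.txt inside every folder below base_path.

    Hypothetical helper built on the os.walk loop from the question.
    Returns the list of merged files that were written.
    """
    written = []
    for path, subdirs, files in os.walk(base_path):
        if os.path.abspath(path) == os.path.abspath(base_path):
            continue  # the base directory itself only holds the folders
        out_name = os.path.basename(path) + '.txt'
        # Exclude the output file so that a second run does not merge
        # the previous result into itself.
        txt_files = sorted(f for f in files
                           if f.endswith('.txt') and f != out_name)
        if not txt_files:
            continue
        out_path = os.path.join(path, out_name)
        with open(out_path, 'w') as out_file:
            for filename in txt_files:
                with open(os.path.join(path, filename)) as in_file:
                    out_file.write(in_file.read())
        written.append(out_path)
    return written
```

A call such as `merge_with_walk('./data')` would process every folder under `./data`; whether the recursive behaviour is wanted depends on the folder structure.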