Is there a way to create a function to load all data files in a directory and then output their file names and content? (Python 3)
Input: Get all files in a given directory of mine (wow.txt, testting.txt, etc.)
Process: I want to run all the files through a function.
Output: I want the output to be the total number of files processed, followed by each file's name and its content.
For example:
Total Number of Documents: 6
/home/file/wow.txt
"all of its content"
/home/file/www.txt
"all of its content"
Here is my code:
# Import functions
import glob

# Get all the .txt files
files = glob.glob("*.txt")

# Load data function
def load_data(files):
    """
    Input  : paths to all .txt files
    Purpose: load all text files
    Output : list of documents along with their respective content
    """
    documents_list = []
    content = []
    for file in files:
        with open(file, "rt", encoding="latin-1") as fin:
            print(file)
            for line in fin.readlines():
                text = line.strip()
                documents_list.append(text)
        print("Total Number of Documents:", len(documents_list))
        content.append(text[0:min(len(text), 100)])
    return documents_list, content

# Output
load_data(files)
Here is my output:
As you can see in the first part of the output, it shows each file name followed by a seemingly random number. Instead, it should show only the total number of documents (which is 5).
It shows the content of all the files, but it doesn't separate them by file. The red line marks the end of the first file; below it is the start of another.
Any suggestions?
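import os

def print_files_in_directory(directory):
    # os.listdir() returns bare names, so join each one with the directory
    # before testing whether it is a regular file.
    files = [f for f in os.listdir(directory)
             if os.path.isfile(os.path.join(directory, f))]
    print(f'Total Number of Documents: {len(files)}')
    for f in files:
        file_path = os.path.join(directory, f)
        print(file_path)
        print('\n')
        with open(file_path, 'r') as fp:
            print(fp.read())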
If you want it to include files in subdirectories, you'll either have to recurse into those subdirectories yourself or use os.walk().
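A minimal sketch of the os.walk() variant, assuming you only want files ending in .txt and that they decode as latin-1 (the encoding used in the question). The names load_text_files and print_documents are made up for this example. It returns one (path, content) pair per file, so the contents stay separated by file and the total is printed only once.

```python
import os

def load_text_files(root, suffix=".txt", encoding="latin-1"):
    """Return a sorted list of (path, content) pairs for matching files under root."""
    documents = []
    # os.walk() yields (dirpath, dirnames, filenames) for root and every subdirectory.
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.endswith(suffix):
                path = os.path.join(dirpath, name)
                with open(path, "r", encoding=encoding) as fp:
                    documents.append((path, fp.read()))
    # Sort by path so the output order does not depend on the file system.
    documents.sort()
    return documents

def print_documents(documents):
    """Print the total once, then each path followed by that file's content."""
    print(f"Total Number of Documents: {len(documents)}")
    for path, content in documents:
        print(path)
        print(content)
```

Calling print_documents(load_text_files("/home/file")) would print the count first and then each path with its content underneath, as in the example output in the question.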