How to read a large number of txt files in a specific folder with Python

Question

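Please help: I have several txt files in a folder. I want to read all of them and combine their data into a single txt file. How can I do this with Python? For example: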

folder name : data
file name in that folder : log1.txt
                           log2.txt
                           log3.txt
                           log4.txt
data in log1.txt : Size:         1,116,116,306 bytes
data in log2.txt : Size:         1,116,116,806 bytes
data in log3.txt : Size:         1,457,116,806 bytes
data in log4.txt : Size:         1,457,345,000 bytes

My expected output:

   a txt file, result.txt, containing : 1,116,116,306
                                        1,116,116,806
                                        1,457,116,806
                                        1,457,345,000

Answer 1

Do you mean that you want to read the contents of each file and write all of them into another file?

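import os

folder = "data"
# os.listdir returns the names of the entries in the directory as a list
list_of_files = os.listdir(folder)
lines = []
for file in list_of_files:
    # join folder and name; a bare name would be looked up in the current directory
    f = open(os.path.join(folder, file), "r")
    # add every line of the file to the list
    lines.extend(f.readlines())
    f.close()

# write the collected lines to result.txt
result = open("result.txt", "w")
result.writelines(lines)
result.close()

If you want the file sizes rather than the contents, replace the three lines that open, read and close each file:

    f = open(os.path.join(folder, file), "r")
    lines.extend(f.readlines())
    f.close()

with:

    # writelines needs strings, so convert the size and add a newline
    lines.append(str(os.stat(os.path.join(folder, file)).st_size) + "\n")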
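
The question's expected output is the number on each file's Size: line, rather than the whole file or its size on disk. A minimal sketch of that variant, assuming every log contains a line of the form "Size:         1,116,116,306 bytes"; the function name collect_sizes and the regular expression are illustrative and not part of the original answer:

```python
import os
import re

# Assumed line format: "Size:         1,116,116,306 bytes"
SIZE_RE = re.compile(r"Size:\s*([\d,]+)\s*bytes")

def collect_sizes(folder, result_path):
    """Write the value of every 'Size:' line in folder's .txt files to result_path."""
    sizes = []
    # sorted() gives a stable order; os.listdir() makes no ordering promise
    for name in sorted(os.listdir(folder)):
        if not name.endswith(".txt"):
            continue
        with open(os.path.join(folder, name), "r") as f:
            for line in f:
                match = SIZE_RE.search(line)
                if match:
                    sizes.append(match.group(1))
    # one value per line, as in the expected output
    with open(result_path, "w") as result:
        result.write("\n".join(sizes) + "\n")
    return sizes
```

Calling collect_sizes("data", "result.txt") on the four logs from the question should write the four numbers to result.txt, one per line.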

Answer 2

File concat.py:

#!/usr/bin/env python3
import os
import sys

def main():
    folder = sys.argv[1]  # the argument contains the path
    with open('result.txt', 'w') as result:  # the result file is created in the current working directory
        for name in next(os.walk(folder))[2]:  # list all files directly inside the given path
            with open(os.path.join(folder, name), 'r') as source:
                result.write(source.read())  # write each file to the result

main()

Usage: python concat.py <your path>
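
The same concatenation can also be sketched with pathlib, restricted to *.txt files so that other files in the folder are skipped. The function name concat_txt and its parameters are illustrative assumptions, not part of the answer above:

```python
from pathlib import Path

def concat_txt(folder, result_path):
    """Concatenate every .txt file directly inside folder into result_path."""
    result_path = Path(result_path)
    # Collect the sources first (sorted by name) and leave out the result
    # file itself in case it lives in the same folder.
    sources = [p for p in sorted(Path(folder).glob("*.txt"))
               if p.resolve() != result_path.resolve()]
    with result_path.open("w") as result:
        for source in sources:
            result.write(source.read_text())
    # return the names that were merged, in order
    return [p.name for p in sources]
```

Unlike the script above, this version takes the output path as a parameter, so it can be called as concat_txt("data", "result.txt") from other code.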

Answer 3

You have to find all the files you want to read:

 import os
 path = "data"
 files = os.listdir(path)

You have to read all the files and collect the size and content of each one:

 all_sz = {i: os.path.getsize(os.path.join(path, i)) for i in files}
 all_data = ''.join([open(os.path.join(path, i)).read() for i in files])

Then print the formatted output:

 msg = 'this is ...;'
 sp2 = ' ' * 4
 sp = ' ' * len(msg) + sp2
 print(msg + sp2, end=' ')
 for i in all_sz:
     print(sp, "{:,}".format(all_sz[i]))
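
The "{:,}" format specifier used above inserts thousands separators. A small self-contained sketch that combines the size collection and the formatting in one function; the name format_sizes and the "name  size" layout are hypothetical choices made for illustration:

```python
import os

def format_sizes(folder):
    """Return one 'name  size' line per file, with thousands separators in the size."""
    lines = []
    for name in sorted(os.listdir(folder)):
        full = os.path.join(folder, name)
        # skip sub-directories and other non-file entries
        if os.path.isfile(full):
            # "{:,}" renders 1116116306 as 1,116,116,306
            lines.append("{}  {:,}".format(name, os.path.getsize(full)))
    return lines
```

The returned list can be printed line by line or passed to "\n".join(...) before writing it to result.txt.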

Answer 4

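Import os. Then use os.listdir('data') to list the contents of the folder and store them in a list. For each entry you can get the size by calling os.stat(...).st_size on the entry's full path (the folder joined with the entry name). Each size can then be written to the output file.

Putting it together: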

import os

outfile = open('result.txt', 'w')
path = 'data'
files = os.listdir(path)
for file in files:
    outfile.write(str(os.stat(path + "/" + file).st_size) + '\n')

outfile.close()

Answer 5

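If you need to merge files that are already sorted so that the output file is sorted as well, you can use the merge function from the heapq standard library module.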

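from heapq import merge
from os import listdir
from os.path import join

path = 'data'           # folder with the sorted input files (name taken from the question)
outfile = 'result.txt'  # merged output file (name taken from the question)

# listdir returns bare names, so join each one with the folder
files = [open(join(path, f)) for f in listdir(path)]
with open(outfile, 'w') as out:
    for rec in merge(*files):
        out.write(rec)
for f in files:
    f.close()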

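By default the records are compared as strings, that is, in lexicographic order. If they need to be merged in a different order, merge accepts an optional key=... argument that specifies another sort key (available since Python 3.5).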
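
To illustrate the key= argument, the sketch below merges files whose lines are numbers with thousands separators, each file already sorted numerically. The names merge_sorted_files and numeric_key are hypothetical, and contextlib.ExitStack is used here only to make sure every input file gets closed; every line is assumed to end with a newline:

```python
from contextlib import ExitStack
from heapq import merge

def numeric_key(line):
    """Sort key: '1,457,116,806\n' -> 1457116806."""
    return int(line.strip().replace(",", ""))

def merge_sorted_files(paths, outfile, key=None):
    """Merge already-sorted text files line by line into outfile."""
    with ExitStack() as stack:
        # every file opened here is closed when the with block ends
        files = [stack.enter_context(open(p)) for p in paths]
        with open(outfile, "w") as out:
            # merge() is lazy: it keeps only one pending line per input file
            for rec in merge(*files, key=key):
                out.write(rec)
```

Passing key=numeric_key matters for data like this: compared as strings, "900" sorts after "1,200" because "9" is greater than "1", so the default order would not match the numeric one.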

5 answers

Answer 1: score 2, accepted, 2016-10-27 08:37:09
Answer 2: score 1, 2016-10-27 08:59:40
Answer 3: score 0, 2016-10-27 08:49:27
Answer 4: score 0, 2016-10-27 08:54:48
Answer 5: score 0, 2016-10-28 15:14:42
