How to read a large number of txt files in a specific folder using Python

Question

Please help me. I have some txt files in a folder. I want to read all of them and combine their data into a single txt file. How can I do this with Python? For example:

folder name : data
file name in that folder : log1.txt
                           log2.txt
                           log3.txt
                           log4.txt
data in log1.txt : Size:         1,116,116,306 bytes
data in log2.txt : Size:         1,116,116,806 bytes
data in log3.txt : Size:         1,457,116,806 bytes
data in log4.txt : Size:         1,457,345,000 bytes

My expected output is a txt file, result.txt, containing:

1,116,116,306
1,116,116,806
1,457,116,806
1,457,345,000

Answer 1

Do you mean that you want to read the contents of each file and write all of them into a different file?

import os

folder = "data"
# os.listdir returns the names of the entries in the directory as a list
list_of_files = os.listdir(folder)
lines = []
for file in list_of_files:
    # join folder and name; a bare name would be looked up in the current directory
    with open(os.path.join(folder, file), "r") as f:
        # extend (not append) so that lines stays a flat list of strings
        lines.extend(f.readlines())

# write the collected lines to result.txt
with open("result.txt", "w") as result:
    result.writelines(lines)

If you are looking for the size of each file instead of its contents, replace the body of the loop (the with block that opens and reads the file) with:

    # writelines needs strings, so convert the size and add a newline
    lines.append(str(os.stat(os.path.join(folder, file)).st_size) + "\n")
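
Neither variant produces exactly the output shown in the question, which keeps only the number that follows `Size:` in each file. A minimal sketch of that step, assuming every relevant line has the form `Size: <number> bytes` as in the example data; the names `collect_sizes`, `write_sizes` and `SIZE_PATTERN` are illustrative and not part of the original answer:

```python
import os
import re

# Assumed line format, taken from the example data: "Size:   1,116,116,306 bytes"
SIZE_PATTERN = re.compile(r"Size:\s*([\d,]+)\s*bytes")


def collect_sizes(folder):
    """Return the size strings found in the .txt files of `folder`, in file-name order."""
    sizes = []
    for name in sorted(os.listdir(folder)):
        if not name.endswith(".txt"):
            continue  # skip anything that is not a text file
        with open(os.path.join(folder, name), "r") as source:
            for line in source:
                match = SIZE_PATTERN.search(line)
                if match:
                    sizes.append(match.group(1))
    return sizes


def write_sizes(folder, result_path):
    """Write one size per line to `result_path`."""
    with open(result_path, "w") as result:
        for size in collect_sizes(folder):
            result.write(size + "\n")
```

Calling `write_sizes("data", "result.txt")` would then create the file from the question. Note that `sorted` orders names as strings, so a file named log10.txt would come before log2.txt.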

Answer 2

File concat.py:

#!/usr/bin/env python
import os
import sys

def main():
    folder = sys.argv[1]  # first argument: path of the folder to read
    # the result file is created in the current working directory
    with open('result.txt', 'w') as result:
        # next(os.walk(folder)) gives (dirpath, dirnames, filenames) for the top level only
        for name in next(os.walk(folder))[2]:
            with open(os.path.join(folder, name), 'r') as source:
                result.write(source.read())  # append each file's contents to the result

if __name__ == '__main__':
    main()

Usage: python concat.py <your path>
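
The script concatenates every file in the folder, whatever its extension. A possible variant, assuming only `*.txt` files should be included and that the result file might be written into the same folder; the function name `concat_txt` is hypothetical:

```python
import glob
import os


def concat_txt(folder, result_path):
    """Concatenate all *.txt files in `folder` into `result_path`."""
    result_abs = os.path.abspath(result_path)
    # glob.glob returns paths that include the folder; sort them for a stable order
    sources = sorted(glob.glob(os.path.join(folder, "*.txt")))
    with open(result_path, "w") as result:
        for path in sources:
            if os.path.abspath(path) == result_abs:
                continue  # never read the file we are writing to
            with open(path, "r") as source:
                result.write(source.read())
```

For the layout in the question this could be called as `concat_txt("data", "result.txt")`.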

Answer 3

First, find all the files that you are going to read:
```
import os

path = "data"
files = os.listdir(path)
```
Then read every file, collecting both its size and its content:
```
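# map each file name to its size in bytes
all_sz = {i: os.path.getsize(os.path.join(path, i)) for i in files}
# concatenate the contents of all files into one string
all_data = ''.join([open(os.path.join(path, i)).read() for i in files])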
```

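Finally, print the sizes in a formatted way:

msg = 'this is ...;'
sp2 = ' ' * 4
for n, i in enumerate(all_sz):
    # the first line carries the message; later lines are padded to line up under it
    prefix = msg if n == 0 else ' ' * len(msg)
    print(prefix + sp2 + "{:,}".format(all_sz[i]))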

Answer 4

Import os. Then list the folder contents using os.listdir('data') and store the result in a list. For each entry you can get the size by calling os.stat(...).st_size on the entry's path (the folder joined with the entry name). Each of these sizes can then be written to a file.

Combined:

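import os

path = 'data'
files = os.listdir(path)
# the with block closes result.txt even if an error occurs
with open('result.txt', 'w') as outfile:
    for file in files:
        # write the size in bytes of each file, one per line
        outfile.write(str(os.stat(os.path.join(path, file)).st_size) + '\n')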

Answer 5

If you need to merge sorted files so that the output file is sorted too, you can use the merge function from the heapq standard library module.

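from heapq import merge
from os import listdir
from os.path import join

path = 'data'           # assumed: folder holding the already sorted input files
outfile = 'result.txt'  # assumed: name of the merged output file

# listdir returns bare names, so join them with the folder before opening
files = [open(join(path, f)) for f in listdir(path)]
with open(outfile, 'w') as out:
    for rec in merge(*files):
        out.write(rec)
for f in files:
    f.close()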

Records are merged in lexical order. If you need a different ordering, merge accepts an optional key=... argument that specifies the ordering function.
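
As an illustration of the key argument (available in Python 3.5 and later), here is a small sketch that merges lines by numeric value instead of lexical order. It assumes each line holds one number with thousands separators, as in the question's expected output; the helper `size_key` and the sample lists are made up for the example:

```python
from heapq import merge


def size_key(line):
    """Sort key: the line parsed as an integer, ignoring thousands separators."""
    return int(line.strip().replace(",", ""))


# Two inputs that are each already sorted by numeric value
first = ["999\n", "1,116,116,306\n", "1,457,116,806\n"]
second = ["1,116,116,806\n", "1,457,345,000\n"]

merged = list(merge(first, second, key=size_key))
for line in merged:
    print(line, end="")
```

Open file objects can be passed to merge in the same way as these lists, since it accepts any iterables. Each input must already be sorted according to the same key.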

5 answers:

Answer 1: score 2, accepted, 2016-10-27 08:37:09
Answer 2: score 1, 2016-10-27 08:59:40
Answer 3: score 0, 2016-10-27 08:49:27
Answer 4: score 0, 2016-10-27 08:54:48
Answer 5: score 0, 2016-10-28 15:14:42
