How to read a lot of txt files in a specific folder using Python
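Please help me. I have some txt files in a folder. I want to read all of them and combine their data into a single txt file. How can I do this with Python? For example: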
folder name : data
file name in that folder : log1.txt
log2.txt
log3.txt
log4.txt
data in log1.txt : Size: 1,116,116,306 bytes
data in log2.txt : Size: 1,116,116,806 bytes
data in log3.txt : Size: 1,457,116,806 bytes
data in log4.txt : Size: 1,457,345,000 bytes
My expected output:
a txt file named result.txt containing:
1,116,116,306
1,116,116,806
1,457,116,806
1,457,345,000
Do you mean that you want to read the contents of each file and write them all into another file?
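import os
folder = "data"
# returns the names of the files in the directory data as a list
list_of_files = os.listdir(folder)
lines = []
for file in list_of_files:
    # listdir returns bare names, so join each with the folder before opening
    f = open(os.path.join(folder, file), "r")
    # extend (not append) keeps lines a flat list of strings for writelines
    lines.extend(f.readlines())
    f.close()
# write the collected lines to result.txt
result = open("result.txt", "w")
result.writelines(lines)
result.close()
If you want the file sizes rather than the contents, replace the three lines that open, read, and close the file:
    f = open(os.path.join(folder, file), "r")
    lines.extend(f.readlines())
    f.close()
with:
    lines.append(str(os.stat(os.path.join(folder, file)).st_size) + "\n")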
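Neither variant above strips the "Size:" label and the "bytes" suffix, which the expected output in the question leaves out. A minimal sketch of that step follows, assuming every log file contains a line of the form "Size: 1,116,116,306 bytes". The function name collect_sizes and the regular expression are illustrative assumptions, not part of the original answer:

```python
import os
import re

# Captures the digits-and-commas value in a line such as "Size: 1,116,116,306 bytes"
SIZE_RE = re.compile(r"Size:\s*([\d,]+)\s*bytes")

def collect_sizes(folder, out_path):
    """Write the value of every "Size: ... bytes" line in folder's .txt files to out_path."""
    values = []
    # os.listdir returns names in arbitrary order; sorted() makes it alphabetical
    # (note that alphabetical order puts log10.txt before log2.txt)
    for name in sorted(os.listdir(folder)):
        if not name.endswith(".txt"):
            continue
        with open(os.path.join(folder, name), "r") as f:
            for line in f:
                match = SIZE_RE.search(line)
                if match:
                    values.append(match.group(1))
    # one value per line in the output file
    with open(out_path, "w") as out:
        out.write("".join(value + "\n" for value in values))
    return values
```

With the layout from the question, this would be called as collect_sizes("data", "result.txt").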
File concat.py
#!/usr/bin/env python
import os
import sys
def main():
    folder = sys.argv[1]  # argument contains path
    with open('result.txt', 'w') as result:  # result file will be in current working directory
        # next(os.walk(folder))[2] lists the plain files directly inside folder
        for name in next(os.walk(folder))[2]:
            with open(os.path.join(folder, name), 'r') as source:
                result.write(source.read())  # write each file to result
main()
Usage: python concat.py <your path>
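Since the question is only about txt files, a variant that skips everything else may be useful. This is a minimal sketch using pathlib; the function name concat_txt and its default output name are assumptions made for illustration:

```python
from pathlib import Path

def concat_txt(folder, out_path="result.txt"):
    """Concatenate every *.txt file directly inside folder into out_path."""
    out_path = Path(out_path)
    with out_path.open("w") as result:
        # glob("*.txt") does not descend into subfolders; sorted() fixes the order
        for source in sorted(Path(folder).glob("*.txt")):
            # skip the output file itself in case it lives in the same folder
            if source.resolve() == out_path.resolve():
                continue
            result.write(source.read_text())
    return out_path
```

For the folder in the question, concat_txt("data") would write result.txt to the current working directory.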
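First, find all the files you want to read:
import os
path = "data"
files = os.listdir(path)
Then read every file, collecting the size and the contents of each:
all_sz = {i: os.path.getsize(path + '/' + i) for i in files}
all_data = ''.join([open(path + '/' + i).read() for i in files])
Finally, print the sizes in a formatted way:
msg = 'this is ...;'
sp2 = ' ' * 4
sp = ' ' * len(msg) + sp2
prefix = msg + sp2  # only the first line carries the message
for i in all_sz:
    print(prefix + "{:,}".format(all_sz[i]))
    prefix = sp  # later lines are padded to the same column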
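Import os, then use os.listdir('data') to list the folder contents and store them in a list. For each entry, you can get its size by calling os.stat on the entry's path and reading st_size; listdir returns bare file names, so the folder has to be prepended first. Each size can then be written to the file.
Putting it together: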
import os
outfile = open('result.txt', 'w')
path = 'data'
files = os.listdir(path)
for file in files:
    outfile.write(str(os.stat(path + "/" + file).st_size) + '\n')
outfile.close()
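If you need to merge files that are already sorted so that the output file is sorted as well, you can use the merge function from the heapq standard library module.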
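from heapq import merge
from os import listdir
from os.path import join
path = 'data'           # folder from the question
outfile = 'result.txt'  # output file from the question
# listdir returns bare names, so join each with the folder before opening
files = [open(join(path, f)) for f in listdir(path)]
with open(outfile, 'w') as out:
    for rec in merge(*files):
        out.write(rec)
for f in files:
    f.close()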
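By default the records are merged in lexicographic order. If they need to be merged in a different order, merge accepts an optional key=... argument that specifies a different ordering function.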
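As an illustration of key=, here is a minimal sketch that merges files whose lines are already sorted numerically, with one comma-grouped integer per line. That file layout is a hypothetical one modelled on the values in the question, and the helper names numeric_key and merge_numeric are assumptions. The key parameter of merge requires Python 3.5 or later.

```python
from contextlib import ExitStack
from heapq import merge
import os

def numeric_key(line):
    """Turn a line such as "1,116,116,306\n" into the integer 1116116306."""
    return int(line.strip().replace(",", ""))

def merge_numeric(folder, out_path):
    """Merge numerically sorted files from folder into one numerically sorted file."""
    with ExitStack() as stack:
        # ExitStack closes every input file when the block exits
        files = [stack.enter_context(open(os.path.join(folder, name)))
                 for name in sorted(os.listdir(folder))]
        with open(out_path, "w") as out:
            # without key=, "10" would sort before "9" because strings compare by character
            for rec in merge(*files, key=numeric_key):
                out.write(rec)
```

The output file should live outside the input folder, otherwise it would be opened as one of the inputs.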