Combine multiple unsorted text files into one sorted file using Python
I have a list of text file names stored in a variable in Python. I want to create another text file that contains all the lines of the files in the list, and I want this file to be sorted by lines.
How can I do this in the most efficient way using Python?
This is the equivalent of what I want to do in bash:
cat file1.txt file2.txt file3.txt | sort -h >> combined_and_sorted.txt
I created simple files with 3 lines each.
My solution is:
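# Open the three input files for reading
with open("file 1.txt", 'r') as f1, open("file 2.txt", 'r') as f2, open("file 3.txt", 'r') as f3:
    # Open the output file for writing
    with open("outfile.txt", 'w') as o:
        # 'b' marks the places where a line break is written between files
        files = [f1, 'b', f2, 'b', f3]
        for i in files:
            if i != 'b':
                lines = i.readlines()
            else:
                lines = '\n'
            o.writelines(lines)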
The explanation:
All the files to be read are opened in a single statement, "with open(file_name, 'r')". The 'r' stands for reading.
The output file is opened in the next statement, 'with open("outfile.txt", 'w')'.
Then a list is created with the file variables and the letter 'b' in between them. The 'b' is placed there on purpose, as a matter of formatting.
When the for loop finds a file (f1, f2, f3), it reads the lines of that file. Otherwise it finds a 'b' and writes a line break.
At the end of the process everything is written to outfile.txt, whose variable name is 'o', following the order of the files and of the lines inside each file.
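The same idea can be written for a list of file names of any length. This is only a minimal sketch: the function name concatenate_files and its parameters file_names and out_name are hypothetical, not part of the code above. Instead of the 'b' placeholder, it appends a line break only when a line does not already end with one, so a file that already ends with a line break does not produce an empty line in the output.

```python
def concatenate_files(file_names, out_name):
    """Copy every line of each input file into out_name, in list order.

    file_names and out_name are hypothetical parameter names.
    """
    with open(out_name, "w") as out:
        for name in file_names:
            with open(name, "r") as f:
                for line in f:
                    # The last line of a file may lack a line break; add one
                    # so the next file starts on a new line.
                    if not line.endswith("\n"):
                        line += "\n"
                    out.write(line)
```

For the three files used here, it could be called as concatenate_files(["file 1.txt", "file 2.txt", "file 3.txt"], "outfile.txt").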
Without the 'b' the output file would be written like this:
file 1 line 1
file 1 line 2
file 1 line 3file 2 line 1
file 2 line 2
file 2 line 3file 3 line 1
file 3 line 2
file 3 line 3
With the code I posted it prints this way:
with open("outfile.txt", 'r') as f:
    print(f.read())
file 1 line 1
file 1 line 2
file 1 line 3
file 2 line 1
file 2 line 2
file 2 line 3
file 3 line 1
file 3 line 2
file 3 line 3
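Note that this output follows the order of the files and is not sorted, which is what the question asks for. A minimal sketch of the sorting step could look like the following, assuming all lines fit in memory. The names combine_and_sort, file_names and out_name are hypothetical. Also, this is not an exact equivalent of the bash command: list.sort() compares the lines as plain strings, which differs from the human-numeric comparison of sort -h, and mode 'w' overwrites the output file, whereas >> appends to it.

```python
def combine_and_sort(file_names, out_name):
    """Write all lines of the given files into out_name, sorted as strings.

    file_names and out_name are hypothetical parameter names.
    """
    lines = []
    for name in file_names:
        with open(name, "r") as f:
            # Strip the line break so a missing final newline does not
            # glue two lines together or affect the comparison.
            lines.extend(line.rstrip("\n") for line in f)
    # Plain lexicographic string sort (an assumption; not the same as sort -h)
    lines.sort()
    with open(out_name, "w") as out:
        out.writelines(line + "\n" for line in lines)
```

It could be called as combine_and_sort(["file 1.txt", "file 2.txt", "file 3.txt"], "combined_and_sorted.txt"). If a different ordering is needed, a key function can be passed to lines.sort().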