So I have 7000+ txt files that look something like this:
1 0.51 0.73 0.81
0 0.24 0.31 0.18
2 0.71 0.47 0.96
1 0.15 0.25 0.48
And as output I want the same file without the lines that start with 1:
0 0.24 0.31 0.18
2 0.71 0.47 0.96
I wrote this code by combining multiple sources:
#!/usr/bin/env python3
import glob
import os
import pathlib
import re
path = './*.txt'

for filename in glob.glob(path):
    with open(filename, 'r') as f:
        for line in f.readlines():
            if not (line.startswith('1')):
                print(line)
                out = open(filename, 'w')
                out.write(line)
    f.close()
But the output for the example above is:
2 0.71 0.47 0.96
How can I fix the code to give me the correct output?
This is because you reopen the output file in write mode inside the for-loop, which truncates it each time, so only the last line written survives. You can either save to a different file:
import glob

path = 'test.txt'
outfile = 'out.txt'
for filename in glob.glob(path):
    # Open the output once, before looping over the lines.
    with open(filename, 'r') as f, open(outfile, 'w') as out:
        for line in f:
            if not line.startswith('1'):
                print(line, end='')
                out.write(line)
Or you can collect the lines you want to keep in a list with append and then write that list back to the same file:
import glob

path = 'test.txt'
for filename in glob.glob(path):
    output = []  # start with an empty list for each file
    with open(filename, 'r') as f:
        for line in f:
            if not line.startswith('1'):
                print(line, end='')
                output.append(line)
    # The input file is closed here, so it is safe to overwrite it.
    with open(filename, 'w') as w:
        w.writelines(output)
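To see why only the last line survives, here is a small sketch that reproduces the effect on throwaway files in a temporary directory. The helper names (`write_reopening`, `write_once`, `demo`) are made up for this illustration and are not part of the question's code.

```python
import os
import tempfile

# The four sample lines from the question.
LINES = ["1 0.51 0.73 0.81\n", "0 0.24 0.31 0.18\n",
         "2 0.71 0.47 0.96\n", "1 0.15 0.25 0.48\n"]

def write_reopening(path, lines):
    """Reopen the file in 'w' mode for every kept line (the bug)."""
    for line in lines:
        if not line.startswith('1'):
            with open(path, 'w') as out:  # 'w' truncates the file each time
                out.write(line)

def write_once(path, lines):
    """Open the file once and write every kept line to the same handle."""
    with open(path, 'w') as out:
        for line in lines:
            if not line.startswith('1'):
                out.write(line)

def read_back(path):
    """Return the full contents of path as a string."""
    with open(path) as f:
        return f.read()

def demo():
    """Return (buggy_result, fixed_result) using a temporary directory."""
    with tempfile.TemporaryDirectory() as tmp:
        buggy = os.path.join(tmp, 'buggy.txt')
        fixed = os.path.join(tmp, 'fixed.txt')
        write_reopening(buggy, LINES)
        write_once(fixed, LINES)
        return read_back(buggy), read_back(fixed)
```

Calling `demo()` returns the contents of both files, so the two strategies can be compared side by side without touching any real data.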
The problem is that you're re-initializing the output file on every row. This can be fixed by opening the output file earlier and using it for every line.
#!/usr/bin/env python3
from glob import glob

for filename in glob('./*.txt'):
    # Read everything first, so the file can be safely reopened for writing.
    with open(filename, 'r') as original_file:
        original_lines = original_file.readlines()
    # Open the output once and write all kept lines in one go.
    with open(filename, 'w') as updated_file:
        updated_file.writelines(
            line
            for line in original_lines
            if not line.startswith('1')
        )
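One caveat with `line.startswith('1')`: it also matches lines whose first field is `10`, `12` and so on. If the first column can hold multi-digit values (an assumption, since the question only shows 0, 1 and 2), comparing the whole first field may be safer. A minimal sketch, where `keep_line` and `drop_label` are hypothetical helper names:

```python
def keep_line(line, label='1'):
    """Return True unless the first whitespace-separated field equals label."""
    fields = line.split(maxsplit=1)
    # Blank lines have no fields and are kept unchanged.
    return not fields or fields[0] != label

def drop_label(path, label='1'):
    """Rewrite path in place without the lines whose first field is label."""
    # Read everything before reopening the same file for writing.
    with open(path, 'r') as original_file:
        original_lines = original_file.readlines()
    with open(path, 'w') as updated_file:
        updated_file.writelines(
            line for line in original_lines if keep_line(line, label)
        )
```

It could be applied to every file with something like `for filename in glob('./*.txt'): drop_label(filename)`. Since the files are overwritten, trying it on a copy of the data first is a sensible precaution.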
The error is here:
open(filename, 'w')
Opening in 'w' mode truncates the file on every iteration of the loop, so you only get the last entry.
open(filename, 'a')
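This will append the content instead of overwriting it. Note that when the output is the same file you are reading, appending leaves the original lines in place, so it only helps when writing to a separate file. It is better to open the output file only once, outside of the loop.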
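For thousands of files, the standard library's `fileinput` module can take care of the open-once bookkeeping: with `inplace=True` it redirects `print` output into the file currently being read. This is a different technique from the ones in the answers above, shown only as a sketch; `filter_in_place` is a hypothetical helper name.

```python
import fileinput

def filter_in_place(paths, prefix='1'):
    """Remove lines starting with prefix from every file in paths."""
    paths = list(paths)
    if not paths:
        # With an empty file list, fileinput would read standard input instead.
        return
    with fileinput.input(files=paths, inplace=True) as stream:
        for line in stream:
            if not line.startswith(prefix):
                # While inplace=True, stdout is redirected to the current file.
                print(line, end='')
```

A call such as `filter_in_place(glob.glob('./*.txt'))` (with `import glob`) would then process all text files in the current directory. Passing `backup='.bak'` to `fileinput.input` keeps a copy of each original file, which may be worth it for a first run.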