When editing the contents of a file, I have been reading the whole file with the .read() method and assigning the result to another variable. For example:
fo = open('file.html', 'r')
fo_as_string = fo.read()
fo.close()
# # #
# do stuff to fo_as_string here
# # #
fo = open('file.html', 'w')
fo.write(fo_as_string)
fo.close()
I now find myself in a situation, however, where I need to remove any whitespace at the beginning of lines. Since I have converted the file contents to a single string, I think there is no way to target this whitespace at a 'line' level with string methods like lstrip and rstrip.
So I guess I am after logic advice on how to retain the flexibility of having the file contents as a string for manipulation, but also be able to target lines within the string for specific line manipulation when required, as in the example above.
Use a for-loop: iterating over a file object returns one line at a time.
# use a `with` statement for handling files; it automatically closes the file for you
with open('file.html') as fo, open('file1.html', 'w') as fo1:
    for line in fo:  # reads one line at a time, memory efficient
        # do something with line, e.g. line.lstrip(' \t')
        fo1.write(line)  # line already ends with '\n', so don't append another
If you're trying to modify the same file, use the fileinput module:
import fileinput
for line in fileinput.input('file.html', inplace=True):
    # do something with line
    # with inplace=True, stdout is redirected into 'file.html';
    # end='' avoids doubling the newline that line already carries
    print(line, end='')
You can also get individual lines from the string returned by file.read() by splitting it:
fo_as_string = fo.read()
lines = fo_as_string.splitlines()
But file.read() loads the whole file into memory, so it is not memory efficient for large files.
Other alternatives are f.readlines() and list(f); both return a list of all lines from the file object.
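One practical difference between these options is whether each line keeps its trailing newline. A minimal sketch, using an in-memory io.StringIO object as a stand-in for a real file (the sample text and variable names are made up for illustration):

```python
import io

# Hypothetical file contents with leading spaces and a leading tab
sample = "  <p>one</p>\n\t<p>two</p>\n"

# readlines() and list(f) both keep the trailing newline on every line
lines_readlines = io.StringIO(sample).readlines()
lines_list = list(io.StringIO(sample))

# str.splitlines() drops the line endings
lines_split = sample.splitlines()
```

This matters when writing the lines back: lines that still carry their newline can be written as-is, while lines from splitlines() need a separator added again.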
Depending on the size of the file and the processing you want to apply to each line, there are a couple of approaches that might work for you.
First, if you're intent on keeping the entire file in memory while you process it, you could save it as a list of lines, process some or all of the lines, and rejoin them with your standard line delimiter when you wish to write them to disk:
linesep = '\n'
with open('file.html', 'r') as fin:
    # splitlines() drops the line endings, so the join below does not double them
    input_lines = fin.read().splitlines()
# Do your per-line transformation
modified_lines = [line.lstrip() for line in input_lines]
# Join the lines into one string to do whole-string processing
whole_string = linesep.join(modified_lines)
# whatever full-string processing you're looking for, do here
# Write to disk
with open('file1.html', 'w') as output_file:
    output_file.write(whole_string)
Or you could specify your own line separator, and do the input parsing by hand:
linesep = '\n'
with open('file.html', 'r') as fin:
    input_lines_by_hand = fin.read().split(linesep)
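If you would rather keep the contents as a single string throughout, the per-line step can be wrapped in a small helper that takes a string and returns a string. This is only a sketch: the name lstrip_lines is hypothetical, and it assumes that "whitespace" means spaces and tabs.

```python
def lstrip_lines(text):
    """Return text with leading spaces and tabs removed from every line."""
    # keepends=True keeps each line's own ending, so ''.join restores the
    # original line structure. Stripping only ' \t' (rather than calling
    # lstrip() with no argument) keeps whitespace-only lines from losing
    # their newline and merging with the next line.
    return ''.join(line.lstrip(' \t') for line in text.splitlines(keepends=True))


fo_as_string = "  <div>\n\t<p>text</p>\n\n   \n</div>"
cleaned = lstrip_lines(fo_as_string)
```

The result is still one string, so any whole-string processing can carry on before or after the call.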