简体   繁体   中英

Open file, read it, process, and write back - shortest method in Python

I want to do some basic filtering on a file. Read it, do processing, write it back.

I'm not looking for "golfing", but want the simplest and most elegant method to achieve this. I came up with:

from __future__ import with_statement

filename = "..." # or sys.argv...

with open(filename) as f:
    new_txt = # ...some translation of f.read() 

open(filename, 'w').write(new_txt)

The with statement makes things shorter since I don't have to explicitly open and close the file.

Any other ideas ?

Actually an easier way using fileinput is to use the inplace parameter:

import fileinput
for line in fileinput.input (filenameToProcess, inplace=1):
    process (line)

If you use the inplace parameter it will redirect stdout to your file, so that if you do a print it will write back to your file.

This example adds line numbers to your file:

import fileinput

for line in fileinput.input ("b.txt",inplace=1):
    print "%d: %s" % (fileinput.lineno(),line),

I would go for elegance a different way: implement your file-reading and filtering operations as generators, You'll write more lines of code, but it will be more flexible, maintainable, and performant code.

See David M. Beazley's Generator Tricks for Systems Programmers , which is a really important thing for anyone who's writing this kind of code to read.

This seems to work:

with open(filename, "r+") as f:
    new_txt = process(f.read())
    f.truncate(0)
    f.write(new_txt)

If you're looking for the python equivalent of "perl -pi", here's a pretty good one:

import fileinput
for line in fileinput.input():
   # process line

See http://www.python.org/doc/2.5.2/lib/module-fileinput.html for more.

Done this way, you would use your python script in a pipe to create the new file:

$ myscript.py infile.txt > outfile.txt

To do it in a way which won't eat your data if you crash in the middle:

from twisted.python.filepath import FilePath
p = FilePath(filename)
p.setContent(process(p.getContent()))

My ugly (but short as stated in the question) solution with generator expressions ;

# Some setup first
file('test.txt', 'w').write('\n'.join('%05d' % i for i in range(100)))


# This is the filter function
def f(i):
    return i % 3


# This is the main part 
file('test2.txt', 'w').write('\n'.join(str(f(int(l))) for l in file('test.txt', 'r').readlines()))


# And a wrapper for sanity
def filter_file(infile, outfile, filter_function)
    outfile.write('\n'.join(filter_function(l) for l in infile.readlines()))

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM