I'm trying to open a text file and read through it, replacing certain strings with strings stored in a dictionary. Based on the answers to "Replacing words in text file using a dictionary" and "How to search and replace text in a file using Python?", I wrote the following:
    # edit print line to print (line)
    import fileinput

    text = "sample file.txt"
    fields = {"pattern 1": "replacement text 1", "pattern 2": "replacement text 2"}

    for line in fileinput.input(text, inplace=True):
        line = line.rstrip()
        for field in fields:
            if field in line:
                line = line.replace(field, fields[field])
        print (line)
My file is encoded in UTF-8.
When I run this, the console shows this error:
UnicodeDecodeError: 'charmap' codec can't decode byte X in position Y: character maps to <undefined>
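For context, the 'charmap' wording usually indicates that the bytes were decoded with a legacy single-byte code page instead of UTF-8, which is what happens when a file is opened without an explicit encoding on a system whose default is such a code page. A minimal sketch of the effect, assuming the default is cp1252 (the actual code page on the machine in question is not stated) and using an illustrative character that is not taken from the file:

```python
# Assumption: the platform default encoding is cp1252.
# "\u010d" (c with caron) is only an illustrative sample character.
data = "\u010d".encode("utf-8")      # two bytes: 0xC4 0x8D

try:
    data.decode("cp1252")            # 0x8D has no mapping in Python's cp1252
    message = None
except UnicodeDecodeError as exc:
    message = str(exc)

decoded = data.decode("utf-8")       # the matching codec decodes cleanly
print(message)
```

The same bytes decode without error as UTF-8, so the fix is to make the reader use UTF-8 rather than to ignore errors.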
When I add encoding="utf8" to fileinput.FileInput(), it raises this error:
TypeError: __init__() got an unexpected keyword argument 'encoding'
When I add openhook=fileinput.hook_encoded("utf8") to fileinput.FileInput(), it raises this error:
ValueError: FileInput cannot use an opening hook in inplace mode
I do not want to use the 'ignore' error handler and silently ignore decoding errors.
I have a file and a dictionary, and I want to write the dictionary's replacement values into the file, as with printing to stdout.
Source file in UTF-8:
Plain text on the line in the file.
This is a greeting to the world.
Hello world!
Here's another plain text.
And here too!
I want to replace the word "world" with the word "earth".
Dictionary: {"world": "earth"}
Modified file in UTF-8:
Plain text on the line in the file.
This is a greeting to the earth.
Hello earth!
Here's another plain text.
And here too!
The fileinput library has several problems that I addressed in the past in a blog post; one of these is that you can't set the encoding and use in-place file rewriting (at least before Python 3.10, which added encoding and errors parameters to fileinput.input()).
The following code can do this, but you have to replace your print() calls with writes to the outgoing file object:
    from contextlib import contextmanager
    import io
    import os


    @contextmanager
    def inplace(filename, mode='r', buffering=-1, encoding=None, errors=None,
                newline=None, backup_extension=None):
        """Allow for a file to be replaced with new content.

        yields a tuple of (readable, writable) file objects, where writable
        replaces readable.

        If an exception occurs, the old file is restored, removing the
        written data.

        mode should *not* use 'w', 'a' or '+'; only read-only-modes are supported.

        """
        # move existing file to backup, create new file with same permissions
        # borrowed extensively from the fileinput module
        if set(mode).intersection('wa+'):
            raise ValueError('Only read-only file modes can be used')

        backupfilename = filename + (backup_extension or os.extsep + 'bak')
        try:
            os.unlink(backupfilename)
        except os.error:
            pass
        os.rename(filename, backupfilename)
        readable = io.open(backupfilename, mode, buffering=buffering,
                           encoding=encoding, errors=errors, newline=newline)
        try:
            perm = os.fstat(readable.fileno()).st_mode
        except OSError:
            writable = open(filename, 'w' + mode.replace('r', ''),
                            buffering=buffering, encoding=encoding, errors=errors,
                            newline=newline)
        else:
            os_mode = os.O_CREAT | os.O_WRONLY | os.O_TRUNC
            if hasattr(os, 'O_BINARY'):
                os_mode |= os.O_BINARY
            fd = os.open(filename, os_mode, perm)
            writable = io.open(fd, "w" + mode.replace('r', ''), buffering=buffering,
                               encoding=encoding, errors=errors, newline=newline)
            try:
                if hasattr(os, 'chmod'):
                    os.chmod(filename, perm)
            except OSError:
                pass
        try:
            yield readable, writable
        except Exception:
            # move backup back
            try:
                os.unlink(filename)
            except os.error:
                pass
            os.rename(backupfilename, filename)
            raise
        finally:
            readable.close()
            writable.close()
            try:
                os.unlink(backupfilename)
            except os.error:
                pass
So your code would look like:
    # uses the inplace() context manager defined above; fileinput is no longer needed
    text = "sample file.txt"
    fields = {"pattern 1": "replacement text 1", "pattern 2": "replacement text 2"}

    with inplace(text, encoding='utf8') as (infh, outfh):
        for line in infh:
            for field in fields:
                if field in line:
                    line = line.replace(field, fields[field])
            outfh.write(line)
Note that you don't have to remove the newline now.
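On Python 3.10 and later, fileinput.input() itself accepts encoding and errors keyword arguments, so the original print()-based loop can be kept. The following is a minimal sketch under that assumption; replace_with_fileinput and demo are hypothetical names used only for illustration, and the sample text in demo is made up.

```python
import fileinput
import os
import sys
import tempfile


def replace_with_fileinput(path, fields, encoding="utf-8"):
    """Rewrite `path` in place, replacing each key of `fields` with its value."""
    if sys.version_info < (3, 10):
        raise RuntimeError("fileinput's encoding parameter needs Python 3.10+")
    with fileinput.input(files=path, inplace=True, encoding=encoding) as stream:
        for line in stream:
            for old, new in fields.items():
                line = line.replace(old, new)
            # inplace=True redirects print() into the file being rewritten;
            # the line still carries its newline, so suppress print's own.
            print(line, end="")


def demo():
    """Hypothetical usage on a throwaway file; returns the rewritten text."""
    fd, path = tempfile.mkstemp(suffix=".txt")
    try:
        with open(fd, "w", encoding="utf-8") as fh:
            fh.write("Hello world!\nGr\u00fc\u00dfe, world.\n")
        replace_with_fileinput(path, {"world": "earth"})
        with open(path, encoding="utf-8") as fh:
            return fh.read()
    finally:
        os.unlink(path)
```

On older interpreters the inplace() context manager above remains the way to combine an explicit encoding with in-place rewriting.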
I tried to use this:
    with open(fileName1, "r+", encoding = "utf8", newline='') as fileIn, open(fileName1, "r+", encoding = "utf8", newline='') as fileOut:
        for line in fileIn:
            for field in fields:
                if field in line:
                    line = line.replace(field, fields[field])
            fileOut.write(line)
Note: when the same file is used for reading and writing, leftover text ("waste") ends up at the end of the file. So far I have not figured out why. Its size does not match the number of replacements. (There are more replacements than lines of waste.)
Pseudo-mathematical: oriA < modfA + subEnd(oriA)
I'm ready to fix it.
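One likely cause, assuming the replacements make the text shorter: a file opened with "r+" is overwritten from the start but never truncated, so the tail of the old content stays behind the new content. The sketch below reproduces that with a single handle and shows truncate() as a possible fix; replace_in_same_file and demo are hypothetical names, and the function reads the whole file into memory, which may not suit very large files.

```python
import os
import tempfile


def replace_in_same_file(path, fields, encoding="utf-8", truncate=True):
    """Read, replace and write back through a single "r+" handle."""
    with open(path, "r+", encoding=encoding, newline="") as fh:
        content = fh.read()
        for old, new in fields.items():
            content = content.replace(old, new)
        fh.seek(0)          # go back to the start before overwriting
        fh.write(content)
        if truncate:
            fh.truncate()   # cut the file at the end of the new content


def demo(truncate):
    """Hypothetical usage on a throwaway file; returns the resulting text."""
    fd, path = tempfile.mkstemp(suffix=".txt")
    try:
        with open(fd, "w", encoding="utf-8", newline="") as fh:
            fh.write("Hello world!\n")
        replace_in_same_file(path, {"world": "sun"}, truncate=truncate)
        with open(path, encoding="utf-8", newline="") as fh:
            return fh.read()
    finally:
        os.unlink(path)
```

Here demo(False) leaves the last two characters of the original line after the new text, while demo(True) yields only the replaced line.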
Edit: When I use two files, everything works correctly. Change fileName1 in the second open() to fileName2, and change the mode argument to "w+".
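The two-file variant can be combined with a final rename so that the original name ends up holding the new content. This is a minimal sketch, not the code used above: replace_via_second_file and demo are hypothetical names, and the temporary file is assumed to live in the same directory as the original so that os.replace() stays on one filesystem.

```python
import os
import tempfile


def replace_via_second_file(path, fields, encoding="utf-8"):
    """Write the modified text to a second file, then swap it into place."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with open(fd, "w", encoding=encoding, newline="") as file_out, \
                open(path, "r", encoding=encoding, newline="") as file_in:
            for line in file_in:
                for old, new in fields.items():
                    line = line.replace(old, new)
                file_out.write(line)
        # Both handles are closed here, so the swap also works on Windows.
        # Note: the new file keeps mkstemp's permissions, not the original's.
        os.replace(tmp_path, path)
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise


def demo():
    """Hypothetical usage on a throwaway file; returns the rewritten text."""
    fd, path = tempfile.mkstemp(suffix=".txt")
    try:
        with open(fd, "w", encoding="utf-8", newline="") as fh:
            fh.write("This is a greeting to the world.\nHello world!\n")
        replace_via_second_file(path, {"world": "earth"})
        with open(path, encoding="utf-8", newline="") as fh:
            return fh.read()
    finally:
        os.unlink(path)
```

Because the original is only replaced after the second file has been written completely, an error partway through leaves the original untouched.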