
Encodings in ConfigParser (Python)

I'm on Python 3.1.3. I need to read a dictionary from a cp1251-encoded file using ConfigParser. My example:

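import configparser

config = configparser.ConfigParser()
config.optionxform = str  # keep option names case-sensitive
config.read("file.cfg")
DataStrings = config.items("DATA")
DataBase = dict()
for Dstr in DataStrings:
    str1 = Dstr[0]
    str2 = Dstr[1]
    DataBase[str1] = str2  # must be inside the loop, otherwise only the last pair is stored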

After that I try to replace some words in some UTF-8 files according to the dictionary, but sometimes it doesn't work (for example, with newline/carriage-return characters). My files are in UTF-8 and the configuration file (the dictionary) is in CP1251. That seems to be the problem, so I think I have to decode the config into UTF-8. I've tried this:

str1 = Dstr[0].encode('cp1251').decode('utf-8-sig')

But the error "'utf8' codec can't decode byte 0xcf in position 0" appeared. If I use .decode('','ignore') I just lose almost the whole config file. What should I do?

Python 3.1 is in the no-man's-land of Python versions. Ideally you'd upgrade to Python 3.5, which would let you do config.read("file.cfg", encoding="cp1251") (the encoding parameter has been available on read() since Python 3.2).
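For illustration, here is a minimal sketch of that approach on Python 3.2 or later. The sample section contents and the temporary file are assumptions made up so the snippet runs on its own; in practice you would point config.read() at your real file.cfg.

```python
import configparser
import os
import tempfile

# Hypothetical sample data standing in for the real file.cfg.
sample = "[DATA]\nПривет = Hello\nМир = World\n"

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "file.cfg")

    # Write the sample as cp1251 so it resembles the asker's config file.
    with open(path, "w", encoding="cp1251") as f:
        f.write(sample)

    config = configparser.ConfigParser()
    config.optionxform = str  # keep option names case-sensitive
    # Tell the parser which encoding the file on disk uses.
    read_ok = config.read(path, encoding="cp1251")

    # items() yields (name, value) pairs, so dict() builds the mapping directly.
    data_base = dict(config.items("DATA"))

print(data_base)
```

The keys and values come back as ordinary str objects, so no further decoding step should be needed before using them for replacements.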

If you must stay on 3.1.x, you can use the ConfigParser.readfp() method to read from a previously opened file using the correct encoding (readfp() was deprecated in 3.2 in favour of read_file(), so use it only on 3.1.x):

import configparser

config = configparser.ConfigParser()
config.optionxform = str
# Open the file with the right encoding and close it once parsing is done
with open("file.cfg", encoding="cp1251") as config_file:
    config.readfp(config_file)
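As a side note on the error in the question, the following sketch shows why the .encode('cp1251').decode('utf-8-sig') attempt is likely to fail. The sample word is an assumption chosen for illustration; the point is that a str in Python 3 is already Unicode, so once the config is read with the right encoding it only needs encoding when it is written out.

```python
# A str as ConfigParser would return it once the file is decoded correctly.
# The word itself is a made-up example.
text = "Привет"

# Encoding to cp1251 gives one byte per Cyrillic letter; the first is 0xCF.
raw = text.encode("cp1251")

# Those bytes are not valid UTF-8, so decoding them as UTF-8 fails.
try:
    raw.decode("utf-8-sig")
    failed = False
except UnicodeDecodeError:
    failed = True

# The text can be written to a UTF-8 file directly; no cp1251 step is involved.
utf8_bytes = text.encode("utf-8")
round_trip = utf8_bytes.decode("utf-8")

print(failed, round_trip == text)
```

In other words, the dictionary strings can be used as-is against text read from files opened with encoding="utf-8".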

