python replace backslash

I'm trying to implement a simple helper class to interact with Java properties files. While fiddling with multiline properties I ran into a problem that I can't solve; maybe you can?

The unit test in the class first writes a multiline property spanning two lines to the property file, then re-reads it and checks for equality. That works. But if I use the class to add a third line to the property, it is re-read with additional backslashes that I can't explain.

Here is my code:

#!/usr/bin/env python3
# -*- coding=UTF-8 -*-

import codecs
import os, re
import fileinput
import unittest

class ConfigParser:
    reProp = re.compile(r'^(?P<key>[\.\w]+)=(?P<value>.*?)(?P<ext>[\\]?)$')
    rePropExt = re.compile(r'(?P<value>.*?)(?P<ext>[\\]?)$')

    def __init__(self, pathes=()):
        # Build the list per instance; a class-level list would be shared by every instance.
        self.files = [path for path in pathes if os.path.isfile(path)]

    def getOptions(self):
        result = {}
        key = ''
        val = ''

        with fileinput.input(self.files, inplace=False) as fi:
            for line in fi:
                m = self.reProp.match(line.strip())
                if m:
                    key = m.group('key')
                    val = m.group('value')
                    result[key] = val
                else:
                    m = self.rePropExt.match(line.rstrip())
                    if m:
                        val = '\n'.join((val, m.group('value')))
                        result[key] = val

        return result

    def setOptions(self, updates={}):
        options = self.getOptions()
        options.update(updates)

        with fileinput.input(self.files, inplace=True) as fi:
            for line in fi:
                m = self.reProp.match(line.strip())
                if m:
                    key = m.group('key')
                    nval = options[key]
                    nval = nval.replace('\n', '\\\n')
                    print('{}={}'.format(key,nval))


class test(unittest.TestCase):
    files = ['test.properties']
    props = {'test.m.a' : 'Johnson\nTanaka'}

    def setUp(self):
        for file in self.files:
            f = codecs.open(file, encoding='utf-8', mode='w')
            for key in self.props.keys():
                val = self.props[key]
                val = re.sub('\n', '\\\n', val)
                f.write(key + '=' + val)
            f.close()

    def tearDown(self):
        pass

    def test_read(self):
        c = ConfigParser(self.files)
        for file in self.files:
            for key in self.props.keys():
                result = c.getOptions()
                self.assertEqual(result[key],self.props[key])

    def test_write(self):
        c = ConfigParser(self.files)
        changes = {}
        for key in self.props.keys():
            changes[key] = self.change_value(self.props[key])

        c.setOptions(changes)       
        result = c.getOptions()
        print('changes: ')
        print(changes)
        print('result: ')
        print(result)
        for key in changes.keys():
            self.assertEqual(result[key],changes[key],msg=key)

    def change_value(self, value):
        return 'Smith\nJohnson\nTanaka'

if __name__ == '__main__':
    unittest.main()

Output of the test run:

C:\pyt>propertyfileparser.py
changes:
{'test.m.a': 'Smith\nJohnson\nTanaka'}
result:
{'test.m.a': 'Smith\nJohnson\\\nTanaka'}

Any hints welcome...

Since you add a backslash in front of newlines when writing, you also have to remove those backslashes when reading. Uncommenting the line that substitutes '\\n' with '\\\\n' solves the problem, but I expect this also means the file syntax is incorrect.
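
To illustrate the symmetry, here is a minimal sketch. The helper names escape_newlines and unescape_newlines are made up for this example and are not part of your class; it also assumes, as your code does, that a backslash-newline pair in the file stands for a newline in the value.

```python
def escape_newlines(value):
    # Writing: put a backslash in front of every newline so the value
    # continues on the next physical line of the file.
    return value.replace('\n', '\\\n')


def unescape_newlines(text):
    # Reading: remove exactly the backslashes that escape_newlines added.
    return text.replace('\\\n', '\n')


original = 'Smith\nJohnson\nTanaka'
stored = escape_newlines(original)    # what ends up in the file
restored = unescape_newlines(stored)  # what the reader should hand back
```

As long as every write goes through the first helper and every read through the second, the value should round-trip unchanged no matter how many lines it has.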

This happens only with the second line break, because you split the value into an "oval" and an "nval", where the "oval" is the first line and the "nval" is the rest, and you only do the substitution on the nval.

It's also overkill to use a regexp substitution to replace something that isn't a regexp. You can use val.replace('\\n', '\\\\n') instead.
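
One thing to watch when switching: re.sub interprets both of its arguments as regex syntax, while str.replace takes them as literal text, so the same string literals do not necessarily mean the same thing. A small sketch with a made-up value:

```python
import re

value = 'Johnson\nTanaka'

# re.sub: the pattern '\\n' is the regex \n (a newline); the replacement
# '\\\\n' is the template \\n, which produces a backslash followed by 'n'.
via_regex = re.sub('\\n', '\\\\n', value)

# str.replace: the same two literals are plain text, so this searches for a
# backslash followed by 'n', which the value does not contain.
same_literals = value.replace('\\n', '\\\\n')

# Literal form of what setOptions in the question does: put a backslash in
# front of each real newline.
continued = value.replace('\n', '\\\n')
```

So with str.replace you will probably want to write the search string as a real newline, '\n', and pick the replacement according to what should end up in the file.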

I'd write this parser very differently. Well, first of all, I wouldn't write it at all; I'd use an existing parser. But if I did, I'd read the file line by line while handling line continuation, so that I ended up with a list containing exactly one complete value per item. Then I'd parse each item into a key and a value with a regexp and put that into a dictionary.

You instead parse each line separately and join continuation lines to the values after parsing, which IMO is completely backwards.
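
A rough sketch of that join-first, parse-second order. The names logical_lines, parse_properties and PROP_RE are invented for this example, and it keeps your convention that a continuation stands for a newline in the value (real Java properties files join continued lines without one), so treat it as an illustration rather than a complete properties parser.

```python
import re

# Key and value are split only after continuation lines have been joined.
# DOTALL lets the value span the embedded newlines.
PROP_RE = re.compile(r'^(?P<key>[.\w]+)=(?P<value>.*)$', re.DOTALL)


def logical_lines(lines):
    """Join physical lines that end in a backslash into one logical line."""
    buffer = []
    for raw in lines:
        line = raw.rstrip('\r\n')
        if line.endswith('\\'):
            buffer.append(line[:-1])  # drop the continuation backslash
            continue
        buffer.append(line)
        yield '\n'.join(buffer)
        buffer = []
    if buffer:  # file ended in the middle of a continuation
        yield '\n'.join(buffer)


def parse_properties(lines):
    """Parse an iterable of lines into a dict, one entry per logical line."""
    result = {}
    for item in logical_lines(lines):
        m = PROP_RE.match(item)
        if m:
            result[m.group('key')] = m.group('value')
    return result


sample = ['test.m.a=Smith\\\n', 'Johnson\\\n', 'Tanaka\n', 'other.key=plain\n']
options = parse_properties(sample)
```

Because the backslashes are removed while joining, the regexp never sees them, and the number of continuation lines no longer matters. An open file object can be passed in place of the sample list.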
