Best way to parse complex configuration file

I need to parse a complex configuration file using Python. I should note that I cannot change the format of these files; I have to live with it.

The basic structure of the file is this:

Keyword1
"value1"
thisisirrelevantforkeyword1
Keyword2
"first", "second", "third"
1, 2, 3

Keyword3
2, "whatever"
firstparam, 1
secondparam, 2
again_not_relevant

Ultimately, the output of this should be a JSON string.

Let me explain:

  • Each keyword has its own rules.
  • The values are in the line(s) following the keyword.
  • For example, Keyword1 has one value, which is the string value1. The line following value1 is irrelevant.
  • For example, Keyword2 has two parameters, the first one being a list of strings, the second one a list of integers.
  • For example, Keyword3 has a variable number of parameters, indicated by the first integer in the first line after Keyword3. So the parameters relevant for Keyword3 are the list 2, "whatever", and the two lists in the two following lines.
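To make the value lines above concrete, here is a minimal sketch of how one such line might be turned into a Python list before it is dumped to JSON. The helper name parse_values is hypothetical, and the sketch assumes that quoted strings never contain commas, which the question does not confirm.

```python
import json


def parse_values(line):
    """Split one value line on commas and convert each item.

    Quoted items become strings, integer-looking items become ints,
    and anything else is kept as a bare string.
    Assumption: no commas appear inside quoted strings.
    """
    values = []
    for item in line.split(","):
        item = item.strip()
        if len(item) >= 2 and item[0] == '"' and item[-1] == '"':
            values.append(item[1:-1])  # drop the surrounding quotes
        else:
            try:
                values.append(int(item))
            except ValueError:
                values.append(item)  # bare word such as firstparam
    return values


print(parse_values('"first", "second", "third"'))
print(json.dumps(parse_values('2, "whatever"')))
```

If quoted values can contain commas, a real tokenizer would be needed instead of a plain split.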

There is a fixed set of keywords, each with its own rules. Of course, I could in principle hard-code the whole thing, but that would lead to a lot of code duplication. It would also be quite inflexible when new keywords are added or the rules for a single keyword change.

I'd rather prepare a CSV file that lists all keywords together with the rule that defines each one, and then use this as input for a more generic parser function.
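One way such a CSV rule table could look is sketched below. The column names and the first_int marker are made up for illustration, not an existing standard: a rule is either a fixed number of value lines, or first_int, meaning one line whose first integer says how many more lines follow. The sketch also assumes that no value line is identical to a keyword name.

```python
import csv
import io
import json

# Hypothetical rule table: one row per keyword.
RULES_CSV = '''keyword,lines
Keyword1,1
Keyword2,2
Keyword3,first_int
'''

SAMPLE = '''Keyword1
"value1"
thisisirrelevantforkeyword1
Keyword2
"first", "second", "third"
1, 2, 3

Keyword3
2, "whatever"
firstparam, 1
secondparam, 2
again_not_relevant
'''


def load_rules(text):
    """Read the rule table into a dict: keyword -> rule string."""
    return {row["keyword"]: row["lines"] for row in csv.DictReader(io.StringIO(text))}


def parse(text, rules):
    """Collect the relevant raw lines for every keyword found in text."""
    result = {}
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    i = 0
    while i < len(lines):
        keyword = lines[i]
        i += 1
        if keyword not in rules:
            continue  # irrelevant line between keywords
        rule = rules[keyword]
        if rule == "first_int":
            # The header line itself plus the number of lines it announces.
            count = 1 + int(lines[i].split(",")[0])
        else:
            count = int(rule)
        result[keyword] = lines[i:i + count]
        i += count
    return result


print(json.dumps(parse(SAMPLE, load_rules(RULES_CSV)), indent=2))
```

This only groups the raw lines per keyword; converting each line into typed values would be a separate step.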

So my question is:

  • How do I specify the rules in a simple way? I'm sure there are standards for this, but I have absolutely no idea where to start looking.
  • How could I then use this grammar to parse the file and generate my JSON?

I know this is quite a specific, special, and complex thing, so I'd already be thankful for pointers in the right direction, as I feel a bit lost and am unsure where to start looking.

I think you could write a class for each kind of option, with dedicated classes for the options that have really special rules.

Something like this:

class OptionBase(object):
    def __init__(self, name, **options):
        self.name = name
        self.raw_config_lines = []

    def parse_line(self, line):
        line = line.strip()
        if line:
            self.raw_config_lines.append(line)

    def get_config(self):
        raise NotImplementedError('Subclasses must implement get_config')


class SimpleOption(OptionBase):
    def __init__(self, name, **options):
        super(SimpleOption, self).__init__(name, **options)
        self.expected_format = options.get('expected_format', str)

    def parse_line(self, line):
        # The base class ignores blank lines, so only a second non-empty line is an error.
        if line.strip() and self.raw_config_lines:
            raise Exception('SimpleOption can only have one value')
        super(SimpleOption, self).parse_line(line)

    def get_config(self):
        return None if not self.raw_config_lines else self.expected_format(self.raw_config_lines[0])


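class SomeComplexOption(OptionBase):
    def parse_line(self, line):
        # Special code that verifies the number of lines, the argument types, etc.
        # goes here; this placeholder only stores the line.
        super(SomeComplexOption, self).parse_line(line)

    def get_config(self):
        # Code that transforms raw_config_lines into another format goes here;
        # this placeholder returns the raw lines unchanged.
        return self.raw_config_lines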
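The classes above still need something that reads the file and hands each line to the right option object. A rough sketch of such a driver follows. RawOption is a hypothetical stand-in that only mirrors the parse_line / get_config interface so the example runs on its own, and parse_file is likewise an illustrative name.

```python
import json


class RawOption(object):
    """Stand-in with the same two-method interface as the option classes."""

    def __init__(self, name):
        self.name = name
        self.raw_config_lines = []

    def parse_line(self, line):
        line = line.strip()
        if line:
            self.raw_config_lines.append(line)

    def get_config(self):
        return self.raw_config_lines


def parse_file(lines, options):
    """Feed each line to the option whose keyword was seen most recently."""
    current = None
    for line in lines:
        name = line.strip()
        if name in options:
            current = options[name]  # a keyword line starts a new option
        elif current is not None:
            current.parse_line(line)
    return {name: opt.get_config() for name, opt in options.items()}


options = {name: RawOption(name) for name in ("Keyword1", "Keyword2")}
config = parse_file(['Keyword1', '"value1"', 'Keyword2', '1, 2, 3', ''], options)
print(json.dumps(config))
```

In this design the driver knows nothing about individual keywords; deciding which lines are relevant is left to each option class.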

