
Python: parsing a config file to dictionary

I'm trying to parse a config file for a Fortran model into a Python dictionary. The config file is basically a number of lines containing arrays or strings, e.g.:

0, 400, 700, 0.02, 488, 0.00026, 1, 5.3
rootname
filename
0,1,0,0,0,1
2,1,0,2,3
4,4
0.0, 0.0, 2980.9579870417047, 0.01,
...
...
...

The line number and array index tell me which variable a value belongs to. So I decided to make a dictionary where the keys describe the variables and the values are the indices at which each variable can be found in the config file, e.g.:

parameters = {"var1": [3,2],
              "var2": [4,1],
              "var3": [4,2]}

So if I read the config file into a list (starting from .read()), I could create a dictionary with the parameter values as follows:

def get_parameters(config_list):
    dict_out = {}
    for key, value in parameters.items():
        dict_out[key] = config_list[value[0]][value[1]]
    return dict_out
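For reference, here is a minimal sketch of how the raw text could be turned into the nested config_list that get_parameters expects. The helper name parse_config and the type conversion rules (int, then float, else string) are assumptions for illustration, not something the model's file format prescribes.

```python
def parse_config(text):
    """Turn the raw config text into a list of rows (hypothetical helper)."""
    def convert(token):
        # Try int first, then float; anything else stays a string.
        for cast in (int, float):
            try:
                return cast(token)
            except ValueError:
                pass
        return token

    rows = []
    for line in text.splitlines():
        # Split on commas and drop empty tokens left by a trailing comma.
        tokens = [t.strip() for t in line.split(",")]
        rows.append([convert(t) for t in tokens if t])
    return rows


sample = """0, 400, 700, 0.02, 488, 0.00026, 1, 5.3
rootname
filename
0,1,0,0,0,1
2,1,0,2,3
4,4
0.0, 0.0, 2980.9579870417047, 0.01,"""

config_list = parse_config(sample)
print(config_list[3][2])  # third entry of the fourth line
print(config_list[4][4])  # ncomp in the example from the question
```

With a real file, the text would come from something like open(path).read(); the path is left out here because it depends on the setup.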

The problem, however, is that the config file has a dynamic number of lines, depending on the number of components in the model. Fortunately, the number of components (another variable) is also specified in the config file. Let's assume the first 7 lines are static; after the 7th line, the number of lines depends on the number of components. The number of components ncomp is 3 (specified on the 5th line as the 5th entry in the array, i.e. [4][4]). Now I want to retrieve variable var4 on line 7 + ncomp + 1 at array index 2. How do I go about this in an elegant way?

I thought about adding lambda expressions to my parameters dictionary:

parameters = {"var1": [3,2],
              "var2": [4,1],
              "var3": [4,2],
              "ncomp": [4,4],
              "var4": [lambda ncomp: ncomp+7,2]}

But this would mean that I first have to retrieve ncomp and then evaluate the lambda function to get the indices. With the indices I could then retrieve the value of var4. It sounds doable, but I feel like there might be a more elegant way to solve this problem. Suggestions?

You can do it in two steps.

First, all parameters except var4:

parameters = {"var1": [3,2],
              "var2": [4,1],
              "var3": [4,2],
              "ncomp": [4,4]}

out = get_parameters(config_list)

Now var4:

out["var4"] = config_list[int(out['ncomp']) + 7][2]
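To see the two steps end to end, here is a small self-contained sketch. The toy config_list (12 rows of made-up numbers) is purely an assumption so that row ncomp + 7 exists. Note that a nested list is indexed as config_list[row][col].

```python
# Toy data (assumed): 12 rows of 6 numbers, so that row ncomp + 7 exists.
config_list = [[10 * row + col for col in range(6)] for row in range(12)]
config_list[4][4] = 3  # ncomp, as in the question

parameters = {"var1": [3, 2],
              "var2": [4, 1],
              "var3": [4, 2],
              "ncomp": [4, 4]}

def get_parameters(config_list):
    # Step 1: look up every statically indexed parameter.
    return {key: config_list[row][col] for key, (row, col) in parameters.items()}

out = get_parameters(config_list)

# Step 2: ncomp is now known, so the row of var4 can be computed.
out["var4"] = config_list[int(out["ncomp"]) + 7][2]
print(out)
```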

Update:
Since you have well-defined dependencies, you will have to set up the loop accordingly while fetching the parameters into the dictionary. There's no alternative as I see it.
You do not have to define your dependent parameters in the 'parameters' dictionary, since they are read and obtained dynamically.
So, only define the independent ones in 'parameters'.
Suppose there are 2 dependency-defining params, "ncomp" and "ncomp2", and there are 2 dependent params for each, say var10 and var11 for "ncomp", and var12 and var13 for "ncomp2",
while var14 and var15 depend on both "ncomp" and "ncomp2".
You can then group them into their respective batches as follows.

def get_parameters(config_list):
    dict_out = {}
    for key, value in parameters.items():
        # every key listed in 'parameters' is independent, so read it directly
        dict_out[key] = config_list[value[0]][value[1]]

        if key == "ncomp":  # for all the params dependent on ncomp
            ncomp_val = dict_out[key]  # ncomp is now available for its dependent parameters
            dict_out["var10"] = config_list[ncomp_val*2 + 1][4]
            dict_out["var11"] = config_list[ncomp_val*3 + 1][2]

        elif key == "ncomp2":  # for all the params dependent on ncomp2
            ncomp2_val = dict_out[key]
            dict_out["var12"] = config_list[ncomp2_val*2 - 1][3]
            dict_out["var13"] = config_list[ncomp2_val*3 + 3][4]

        # multi-dependency params: fetched once, as soon as both ncomp and ncomp2 are known
        if "ncomp" in dict_out and "ncomp2" in dict_out and "var14" not in dict_out:
            ncomp_val = dict_out["ncomp"]
            ncomp2_val = dict_out["ncomp2"]
            dict_out["var14"] = config_list[ncomp_val*2 + ncomp2_val - 1][3]
            dict_out["var15"] = config_list[ncomp2_val*3 + ncomp2_val*2 + 3][4]

    return dict_out
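If hardcoding the dependent lookups inside the function feels heavy, one possible variation is to describe each dependent parameter as data: the names it needs plus a function that returns its indices. This is only a sketch; the split into independent and dependent dictionaries, the position chosen for ncomp2, and the toy config_list are all assumptions for illustration.

```python
# Independent entries: name -> [row, col]. The position of ncomp2 is made up.
independent = {"ncomp": [4, 4], "ncomp2": [5, 0]}

# Dependent entries (hypothetical layout): name -> (names it needs, index function).
dependent = {
    "var10": (("ncomp",), lambda n: [n * 2 + 1, 4]),
    "var14": (("ncomp", "ncomp2"), lambda n, m: [n * 2 + m - 1, 3]),
}

def get_parameters(config_list):
    # Read the independent values first.
    out = {key: config_list[row][col] for key, (row, col) in independent.items()}
    # Then compute the indices of each dependent value from the ones it needs.
    for key, (needs, index_of) in dependent.items():
        row, col = index_of(*(int(out[name]) for name in needs))
        out[key] = config_list[row][col]
    return out

# Toy data (assumed): 12 rows of 6 numbers.
config_list = [[10 * row + col for col in range(6)] for row in range(12)]
config_list[4][4] = 3  # ncomp
config_list[5][0] = 2  # ncomp2
result = get_parameters(config_list)
print(result)
```

The function body stays the same when new dependent parameters are added; only the dependent dictionary grows.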




Original Response:
I agree with @Mike Muller's answer; however, you can avoid breaking it into 2 steps.
While retrieving the parameters into your dictionary, you can easily check for the "ncomp" key and proceed.
Let's work with your assumption of 7 static lines; by the time those have been read, you will have your "ncomp" value in the dictionary.
You can then capture the dependent parameters based on this value as follows.

def get_parameters(config_list):
    dict_out = {}
    for key, value in parameters.items():
        # independent params: read them directly
        dict_out[key] = config_list[value[0]][value[1]]
        if key == "ncomp":  # ncomp is known now, so fetch the params that depend on it
            dict_out["var4"] = config_list[dict_out["ncomp"] + 7 + 1][2]
    return dict_out

@murphy1310 thanks for your thorough answer. However, I am not a fan of hardcoding all the indices in the function itself. I would rather keep all the indices of the variables in a (single) dictionary so I can keep the function "clean". I came up with the following:

# config_list is a 2D list of parameter values and strings
config_list = [[...],
               [...],
               [...]]

# dictionary containing indices of where variables can be found in config_list
parameters = {
              # independent variables
              "var1": [3,2],
              "var2": [4,1],
              "var3": [4,2],
              "ncomp": [4,4],
              # dependent variables
              "var4": lambda ncomp: [ncomp+1, 1],
              "var5": lambda ncomp: [ncomp*2+1, 1]}

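def get_parameters(dictin, dictout=None):
    # avoid a shared mutable default: create a fresh dict on the first call
    if dictout is None:
        dictout = {}
    dictin_copy = dictin.copy()
    # first pass: entries that are plain [row, col] index pairs
    for key, value in dictin_copy.items():
        if not callable(value) and key not in dictout:
            dictout[key] = config_list[value[0]][value[1]]

    ncomp = dictout["ncomp"]
    # second pass: turn each lambda into its index pair, then recurse to look it up
    for key, value in dictin_copy.items():
        if callable(value):
            dictin_copy[key] = value(ncomp)
            get_parameters(dictin_copy, dictout)
    return dictout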

# now get parameters from config_list
new_dict = get_parameters(parameters)

I'm wondering what you guys think of this approach...
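For comparison, the same idea can probably be written without recursion, as two plain passes over the dictionary. This is a sketch under assumptions: the function name resolve_parameters and the toy config_list are made up, and it takes for granted that every callable depends only on ncomp.

```python
def resolve_parameters(parameters, config_list):
    """Two-pass lookup: plain index pairs first, then callables (sketch)."""
    out = {}
    # Pass 1: entries that are already [row, col] pairs.
    for key, spec in parameters.items():
        if not callable(spec):
            row, col = spec
            out[key] = config_list[row][col]
    # Pass 2: entries whose indices are computed from ncomp.
    ncomp = int(out["ncomp"])
    for key, spec in parameters.items():
        if callable(spec):
            row, col = spec(ncomp)
            out[key] = config_list[row][col]
    return out

# Toy data (assumed): 12 rows of 6 numbers.
config_list = [[10 * row + col for col in range(6)] for row in range(12)]
config_list[4][4] = 3  # ncomp

parameters = {
    "var1": [3, 2],
    "ncomp": [4, 4],
    "var4": lambda ncomp: [ncomp + 1, 1],
    "var5": lambda ncomp: [ncomp * 2 + 1, 1],
}
new_dict = resolve_parameters(parameters, config_list)
print(new_dict)
```

Passing config_list in as an argument, rather than reading a global, also makes the function easier to test with small made-up inputs like the one here.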
