Parsing Erlang data to Python dictionary

I have an Erlang script from which I would like to extract some data and store it in a Python dictionary. It is easy to parse the script to get a string like this:

    {userdata,
     [{tags,
       [#dt{number=111},
        #mp{id='X23.W'}]},
      {log,
       'LG22'},
      {instruction,
       "String that can contain characters like -, _ or numbers"}
     ]
    }.

Desired result:

userdata = {"tags": {"dt": {"number": 111}, "mp": {"id": "X23.W"}},
            "log": "LG22",
            "instruction": "String that can contain characters like -, _ or numbers"}
# The "#" marker on the entries in "tags" is not needed in this structure.
# The value for "tags" can be any iterable structure: tuple, list or dictionary.

But I am not sure how to convert this data into a Python dictionary. My first idea was to use json.loads, but that requires many modifications (putting words in quotation marks, replacing "," with ":", and many more).

Moreover, the keys in userdata are not limited to a fixed set. In this case there are 'tags', 'log' and 'instruction', but there can be many more, e.g. 'slogan', 'ids', etc. I am also not sure about the order; I assume the keys can appear in any order.

My code (it does not work for id='X23.W', so I removed the '.' from the input):

import re
import json
in_ = """{userdata, [{tags, [#dt{number=111}, #mp{id='X23W'}]}, {log, 'LG22'}, {instruction, "String that can contain characters like -, _ or numbers"}]}"""

buff = in_.replace("{userdata, [", "")[:-2]

re_helper = re.compile(r"(#\w+)")
buff = re_helper.sub(r'\1:', buff)

partition = buff.partition("instruction")
section_to_replace = partition[0]
replacer = re.compile(r"(\w+)")
match = replacer.sub(r'"\1"', section_to_replace)
buff = ''.join([match, '"instruction"', partition[2]])
buff = buff.replace("#", "")
buff = buff.replace('",', '":')

buff = buff.replace("}, {", "}, \n{")
buff = buff.replace("=", ":")
buff = buff.replace("'", "")
temp = buff.split("\n")
userdata = {}
buff = temp[0][:-2]
buff = buff.replace("[", "{")
buff = buff.replace("]", "}")

userdata.update(json.loads(buff))
for v in temp[1:]:
    v = v.strip()
    if v.endswith(","):
        v = v[:-1]
    userdata.update(json.loads(v))

print(userdata)

Output:

{'tags': {'dt': {'number': '111'}, 'mp': {'id': 'X23W'}}, 'instruction': 'String that can contain characters like -, _ or numbers', 'log': 'LG22'}

A regex-based answer that rewrites the string into valid JSON before loading it:

import json
import re
in_ = """{userdata, [{tags, [#dt{number=111}, #mp{id='X23.W'}]}, {log, 'LG22'}, {instruction, "String that can contain characters like -, _ or numbers"}]}"""


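# {key, ...  ->  {"key": ...
quoted_headers = re.sub(r"\{(\w+),", r'{"\1":', in_)
# [#rec{...}, #rec{...}]  ->  {#rec{...}, #rec{...}}
hashed_list_to_dict = re.sub(r"\[(#.*?)\]", r'{\1}', quoted_headers)

# #rec  ->  "rec":
hashed_variables = re.sub(r'#(\w+)', r'"\1":', hashed_list_to_dict)
# {field=  ->  {"field":
equals_replaced = re.sub(r'{(\w+)=', r'{"\1":', hashed_variables)
# Quoted atoms become JSON strings. Note: this also rewrites any apostrophe
# inside a double-quoted string, which would make the JSON invalid.
single_quotes_replaced = equals_replaced.replace('\'', '"')

result = json.loads(single_quotes_replaced)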
print(result)

Produces:

{'userdata': [{'tags': {'dt': {'number': 111}, 'mp': {'id': 'X23.W'}}}, {'log': 'LG22'}, {'instruction': 'String that can contain characters like -, _ or numbers'}]}
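
The result printed above keeps userdata as a list of single-key dictionaries, while the desired structure in the question is one flat dictionary. A minimal sketch of the remaining step, assuming every list entry is a dictionary and that later duplicates may overwrite earlier ones (merge_entries is a hypothetical helper name, not part of the original answer):

```python
# Shape of the parsed result shown above: a list of single-key dicts.
result = {
    "userdata": [
        {"tags": {"dt": {"number": 111}, "mp": {"id": "X23.W"}}},
        {"log": "LG22"},
        {"instruction": "String that can contain characters like -, _ or numbers"},
    ]
}


def merge_entries(entries):
    """Merge a list of dicts into one dict; on duplicate keys the last one wins."""
    merged = {}
    for entry in entries:
        merged.update(entry)
    return merged


userdata = merge_entries(result["userdata"])
print(userdata)
```

Because the merge only iterates over whatever keys are present, it does not depend on the key names or on their order.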

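Chained string substitutions like the ones above are sensitive to the content of the strings: an apostrophe or a "{word," sequence inside the instruction text would be rewritten too. A possible alternative is a small tokenizer plus recursive-descent parser. The sketch below is only an illustration under stated assumptions: it handles just the subset of Erlang terms that appears in the question (tuples, lists, records, atoms, quoted atoms, strings, numbers), it does not decode escape sequences, and the names tokenize, parse and to_python are hypothetical. It also assumes that any two-element tuple whose first element is an atom or string should become a key/value pair.

```python
import re

# One token per match: strings, quoted atoms, numbers, bare atoms, punctuation.
TOKEN = re.compile(
    r'\s*(?:(?P<string>"(?:[^"\\]|\\.)*")'
    r"|(?P<qatom>'(?:[^'\\]|\\.)*')"
    r"|(?P<number>-?\d+(?:\.\d+)?)"
    r"|(?P<atom>[A-Za-z_]\w*)"
    r"|(?P<punct>[{}\[\],=#.]))"
)


def tokenize(text):
    text, tokens, pos = text.rstrip(), [], 0
    while pos < len(text):
        m = TOKEN.match(text, pos)
        if not m:
            raise ValueError(f"unexpected input at offset {pos}")
        tokens.append((m.lastgroup, m.group(m.lastgroup)))
        pos = m.end()
    return tokens


def parse(text):
    """Parse a (subset of an) Erlang term into tuples, lists, dicts and scalars."""
    tokens, pos = tokenize(text), 0

    def peek():
        return tokens[pos] if pos < len(tokens) else (None, None)

    def take(expected=None):
        nonlocal pos
        kind, value = peek()
        if kind is None or (expected is not None and value != expected):
            raise ValueError(f"expected {expected!r}, got {value!r}")
        pos += 1
        return kind, value

    def sequence(closer, item):
        # Comma-separated items up to (and including) the closing bracket.
        items = []
        if peek()[1] != closer:
            items.append(item())
            while peek()[1] == ",":
                take(",")
                items.append(item())
        take(closer)
        return items

    def field():
        _, name = take()
        take("=")
        return name, term()

    def term():
        kind, value = take()
        if kind in ("string", "qatom"):
            return value[1:-1]  # strip the quotes; escapes are left as-is
        if kind == "number":
            return float(value) if "." in value else int(value)
        if kind == "atom":
            return value
        if value == "{":
            return tuple(sequence("}", term))
        if value == "[":
            return sequence("]", term)
        if value == "#":  # record: #name{field=value, ...}
            _, name = take()
            take("{")
            return {name: dict(sequence("}", field))}
        raise ValueError(f"unexpected token {value!r}")

    result = term()
    if peek()[1] == ".":  # optional terminating full stop
        take(".")
    if pos != len(tokens):
        raise ValueError("trailing input after term")
    return result


def to_python(value):
    """Turn {Key, Value} pairs and lists of pairs/records into dictionaries."""
    if isinstance(value, tuple):
        if len(value) == 2 and isinstance(value[0], str):
            return {value[0]: to_python(value[1])}
        return tuple(to_python(v) for v in value)
    if isinstance(value, dict):
        return {k: to_python(v) for k, v in value.items()}
    if isinstance(value, list):
        items = [to_python(v) for v in value]
        if items and all(isinstance(v, dict) for v in items):
            merged = {}
            for item in items:
                merged.update(item)
            return merged
        return items
    return value


in_ = """{userdata, [{tags, [#dt{number=111}, #mp{id='X23.W'}]}, {log, 'LG22'}, {instruction, "String that can contain characters like -, _ or numbers"}]}."""
print(to_python(parse(in_)))
```

Working on tokens rather than raw text means the contents of quoted strings are never touched, and the conversion does not rely on specific key names or on their order.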