[英]Parsing Erlang data to Python dictionary
I have an erlang script from which I would like to get some data and store it in python dictionary.我有一个 erlang 脚本,我想从中获取一些数据并将其存储在 python 字典中。 It is easy to parse the script to get string like this:
很容易解析脚本以获得这样的字符串:
{userdata,
[{tags,
[#dt{number=111},
#mp{id='X23.W'}]},
{log,
'LG22'},
{instruction,
"String that can contain characters like -, _ or numbers"}
]
}.
desired result:想要的结果:
userdata = {"tags": {"dt": {"number": 111}, "mp": {"id": "X23.W"}},
"log": "LG22",
"instruction": "String that can contain characters like -, _ or numbers"}
# "#" mark for data in "tags" is not required in this structure.
# Also value for "tags" can be any iterable structure: tuple, list or dictionary.
But I am not sure how to transfer this data into a python dictionary.但我不确定如何将这些数据传输到 python 字典中。 My first idea was to use
json.loads
but it requires many modifications (putting words into quotes marks, replacing "," with ":" and many more).我的第一个想法是使用
json.loads
但它需要很多修改(将单词放入引号中,将“,”替换为“:”等等)。
Moreover, keys in userdata are not limited to some pool.此外,userdata 中的键不限于某个池。 In this case, there are 'tags', 'log' and 'instruction', but there can be many more eg.
在这种情况下,有“标签”、“日志”和“指令”,但可以有更多,例如。 'slogan', 'ids', etc. Also, I am not sure about the order.
'slogan'、'ids' 等。此外,我不确定顺序。 I assume that the keys can appear in random order.
我假设密钥可以以随机顺序出现。
My code (it is not working for id='X23.W'
so I removed '.' from input):我的代码(它不适用于
id='X23.W'
所以我从输入中删除了 '.'):
import re
import json
in_ = """{userdata, [{tags, [#dt{number=111}, #mp{id='X23W'}]}, {log, 'LG22'}, {instruction, "String that can contain characters like -, _ or numbers"}]}"""
buff = in_.replace("{userdata, [", "")[:-2]
re_helper = re.compile(r"(#\w+)")
buff = re_helper.sub(r'\1:', buff)
partition = buff.partition("instruction")
section_to_replace = partition[0]
replacer = re.compile(r"(\w+)")
match = replacer.sub(r'"\1"', section_to_replace)
buff = ''.join([match, '"instruction"', partition[2]])
buff = buff.replace("#", "")
buff = buff.replace('",', '":')
buff = buff.replace("}, {", "}, \n{")
buff = buff.replace("=", ":")
buff = buff.replace("'", "")
temp = buff.split("\n")
userdata = {}
buff = temp[0][:-2]
buff = buff.replace("[", "{")
buff = buff.replace("]", "}")
userdata .update(json.loads(buff))
for i, v in enumerate(temp[1:]):
v = v.strip()
if v.endswith(","):
v = v[:-1]
userdata .update(json.loads(v))
print(userdata)
Output:输出:
{'tags': {'dt': {'number': '111'}, 'mp': {'id': 'X23W'}}, 'instruction': 'String that can contain characters like -, _ or numbers', 'log': 'LG22'}
import json
import re
in_ = """{userdata, [{tags, [#dt{number=111}, #mp{id='X23.W'}]}, {log, 'LG22'}, {instruction, "String that can contain characters like -, _ or numbers"}]}"""
qouted_headers = re.sub(r"\{(\w+),", r'{"\1":', in_)
changed_hashed_list_to_dict = re.sub(r"\[(#.*?)\]", r'{\1}', qouted_headers)
hashed_variables = re.sub(r'#(\w+)', r'"\1":', changed_hashed_list_to_dict)
equality_signes_replaced_and_quoted = re.sub(r'{(\w+)=', r'{"\1":', hashed_variables)
replace_single_qoutes = equality_signes_replaced_and_quoted.replace('\'', '"')
result = json.loads(replace_single_qoutes)
print(result)
Produces:产生:
{'userdata': [{'tags': {'dt': {'number': 111}, 'mp': {'id': 'X23.W'}}}, {'log': 'LG22'}, {'instruction': 'String that can contain characters like -, _ or numbers'}]}
声明:本站的技术帖子网页,遵循CC BY-SA 4.0协议,如果您需要转载,请注明本站网址或者原文地址。任何问题请咨询:yoyou2525@163.com.