Python parse string into Python dictionary of list
There are two parts to this question:
I. I'd like to parse a Python string into a list of dictionaries.
**Here is the Python string:**
../Data.py:92 final computing result as shown below: [historic_list {id: 'A(long) 11A' startdate: 42521 numvaluelist: 0.1065599566767107 datelist: 42521}historic_list {id: 'A(short) 11B' startdate: 42521 numvaluelist: 0.0038113334533441123 datelist: 42521 }historic_list {id: 'B(long) 11C' startdate: 42521 numvaluelist: 20.061623176440904 datelist: 42521}time_statistics {job_id: '' portfolio_id: '112341'} UrlPairList {}]
**Expected Python output:**
{
    "data": [
        {
            "id": "A(long) 11A",
            "startdate": "42521",
            "numvaluelist": "0.1065599566767107"
        },
        {
            "id": "A(short) 11B",
            "startdate": "42521",
            "numvaluelist": "0.0038113334533441123"
        },
        {
            "id": "B(long) 11C",
            "startdate": "42521",
            "numvaluelist": "20.061623176440904"
        }
    ]
}
II. I need to further parse the values of the id and numvaluelist keys. I am not sure whether there is a better way to do it, so I am converting the string to a Python dictionary, looping through it, and parsing further. Please tell me if I am overcomplicating the solution.
Update: Code
text = "[historic_list {id: 'A(long) 11A' startdate: 42521 numvaluelist: 0.1065599566767107 datelist: 42521}historic_list {id: 'A(short) 11B' startdate: 42521 numvaluelist: 0.0038113334533441123 datelist: 42521 }historic_list {id: 'B(long) 11C' startdate: 42521 numvaluelist: 20.061623176440904 datelist: 42521}time_statistics {job_id: '' portfolio_id: '112341'} UrlPairList {}]"
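prefix = "../Data.py:92 final computing result as shown below: "
# str.strip(chars) removes a set of characters from both ends, not a prefix,
# so slice the prefix off explicitly when it is present.
data = text[len(prefix):] if text.startswith(prefix) else text
print(data)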
Your raw input text looks pretty predictable; try this:
>>> import re
>>> raw = "[historic_list {id: 'A(long) 11A' startdate: 42521 numvaluelist: 0.1065599566767107 datelist: 42521}historic_list {id: 'A(short) 11B' startdate: 42521 numvaluelist: 0.0038113334533441123 datelist: 42521 }historic_list {id: 'B(long) 11C' startdate: 42521 numvaluelist: 20.061623176440904 datelist: 42521}time_statistics {job_id: '' portfolio_id: '112341'} UrlPairList {}]"
>>> line_re = re.compile(r'\{[^\}]+\}')
>>> records = line_re.findall(raw)
>>> record_re = re.compile(
... r"""
... id:\s*\'(?P<id>[^']+)\'\s*
... startdate:\s*(?P<startdate>\d+)\s*
... numvaluelist:\s*(?P<numvaluelist>[\d\.]+)\s*
... datelist:\s*(?P<datelist>\d+)\s*
... """,
... re.X
... )
>>> record_parsed = record_re.search(line_re.findall(raw)[0])
>>> record_parsed.groupdict()
{'startdate': '42521', 'numvaluelist': '0.1065599566767107', 'datelist': '42521', 'id': 'A(long) 11A'}
>>> for record in records:
...     record_parsed = record_re.search(record)
...     if record_parsed is None:
...         continue  # skip blocks such as time_statistics that do not match
...     # Here is where you would do whatever you need with the fields.
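Putting the pieces above together, a minimal sketch that produces the structure asked for in part I might look like the following. The helper name `parse_historic_list` is hypothetical, and dropping `datelist` is an assumption based on the expected output in the question; values are left as strings, as in that output.

```python
import json
import re

raw = (
    "[historic_list {id: 'A(long) 11A' startdate: 42521 "
    "numvaluelist: 0.1065599566767107 datelist: 42521}"
    "historic_list {id: 'A(short) 11B' startdate: 42521 "
    "numvaluelist: 0.0038113334533441123 datelist: 42521 }"
    "historic_list {id: 'B(long) 11C' startdate: 42521 "
    "numvaluelist: 20.061623176440904 datelist: 42521}"
    "time_statistics {job_id: '' portfolio_id: '112341'} UrlPairList {}]"
)

# One brace-delimited block per record; the empty "UrlPairList {}" never matches.
line_re = re.compile(r"\{[^}]+\}")

record_re = re.compile(
    r"""
    id:\s*'(?P<id>[^']+)'\s*
    startdate:\s*(?P<startdate>\d+)\s*
    numvaluelist:\s*(?P<numvaluelist>[\d.]+)\s*
    datelist:\s*(?P<datelist>\d+)\s*
    """,
    re.X,
)


def parse_historic_list(text):
    """Return {"data": [...]} with one dict per matching record (hypothetical helper)."""
    data = []
    for block in line_re.findall(text):
        match = record_re.search(block)
        if match is None:
            # Blocks such as time_statistics lack the startdate/numvaluelist fields.
            continue
        fields = match.groupdict()
        # Assumption: the expected output in the question omits datelist.
        fields.pop("datelist")
        data.append(fields)
    return {"data": data}


result = parse_historic_list(raw)
print(json.dumps(result, indent=2))
```

Skipping non-matching blocks explicitly matters here because `record_re.search` returns `None` for the `time_statistics` block, and calling `.groupdict()` on that would raise an `AttributeError`.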
To parse the sub-elements of the id, e.g.:
>>> record_re2 = re.compile(
... r"""
... id:\s*\'
... (?P<id_letter>[A-Z]+)
... \(
... (?P<id_type>[^\)]+)
... \)\s*
... (?P<id_codenum>\d+)
... (?P<id_codeletter>[A-Z]+)
... \'\s*
... startdate:\s*(?P<startdate>\d+)\s*
... numvaluelist:\s*(?P<numvaluelist>[\d\.]+)\s*
... datelist:\s*(?P<datelist>\d+)\s*
... """,
... re.X
... )
>>> record2_parsed = record_re2.search(line_re.findall(raw)[0])
>>> record2_parsed.groupdict()
{'startdate': '42521', 'numvaluelist': '0.1065599566767107', 'id_letter': 'A', 'id_codeletter': 'A', 'datelist': '42521', 'id_type': 'long', 'id_codenum': '11'}
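An alternative to one large pattern is to refine each record in a second step, which may be easier to maintain if the id layout changes. The sketch below is only an illustration: `id_re` and `refine_record` are hypothetical names, the id layout is assumed to be the one seen in the sample ("A(long) 11A"), and converting `startdate` and `numvaluelist` to numbers is an assumption, since the expected output in the question keeps them as strings.

```python
import re

# Assumed id layout, taken from the sample values, e.g. "A(long) 11A".
id_re = re.compile(
    r"""
    (?P<id_letter>[A-Z]+)        # leading letter(s)
    \((?P<id_type>[^)]+)\)\s*    # text inside the parentheses
    (?P<id_codenum>\d+)          # numeric part of the code
    (?P<id_codeletter>[A-Z]+)    # trailing letter(s) of the code
    """,
    re.X,
)


def refine_record(fields):
    """Split the id into parts and convert numeric strings (hypothetical helper)."""
    refined = dict(fields)
    match = id_re.fullmatch(fields["id"])
    if match is not None:
        # Keep the original id and add its parts alongside it.
        refined.update(match.groupdict())
    refined["startdate"] = int(fields["startdate"])
    refined["numvaluelist"] = float(fields["numvaluelist"])
    return refined


record = {
    "id": "A(short) 11B",
    "startdate": "42521",
    "numvaluelist": "0.0038113334533441123",
}
refined = refine_record(record)
print(refined)
```

If an id does not follow the assumed layout, the record is returned with only the numeric conversions applied, so unexpected ids do not abort the loop.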