
Python parse string into Python dictionary of list

There are two parts to this question:

I. I'd like to parse a Python string into a list of dictionaries.

Here is the Python string:

../Data.py:92 final computing result as shown below:  [historic_list {id: 'A(long) 11A' startdate: 42521 numvaluelist: 0.1065599566767107 datelist: 42521}historic_list {id: 'A(short) 11B' startdate: 42521 numvaluelist: 0.0038113334533441123 datelist: 42521 }historic_list {id: 'B(long) 11C' startdate: 42521 numvaluelist: 20.061623176440904 datelist: 42521}time_statistics {job_id: '' portfolio_id: '112341'} UrlPairList {}]

Expected Python output:

{
  "data": [
    {
      "id": "A(long) 11A",
      "startdate": "42521",
      "numvaluelist": "0.1065599566767107"
    },
    {
      "id": "A(short) 11B",
      "startdate": "42521",
      "numvaluelist": "0.0038113334533441123"
    },
    {
      "id": "B(long) 11C",
      "startdate": "42521",
      "numvaluelist": "20.061623176440904"
    }
  ]
}

II. I need to further parse the values of the id and numvaluelist keys. I am not sure if there is a better way to do it, so I am converting the string to a Python dictionary, looping through it, and parsing further. Please guide me if I am overthinking the solution.

Update: Code

text = "[historic_list {id: 'A(long) 11A' startdate: 42521 numvaluelist: 0.1065599566767107 datelist: 42521}historic_list {id: 'A(short) 11B' startdate: 42521 numvaluelist: 0.0038113334533441123 datelist: 42521 }historic_list {id: 'B(long) 11C' startdate: 42521 numvaluelist: 20.061623176440904 datelist: 42521}time_statistics {job_id: '' portfolio_id: '112341'} UrlPairList {}]"
# str.strip() removes a set of characters from both ends, not a literal prefix,
# so drop everything before the first "[" instead.
data = text[text.index("["):]
print(data)

Your raw input text looks pretty predictable, so try this:

>>> import re

>>> raw = "[historic_list {id: 'A(long) 11A' startdate: 42521 numvaluelist: 0.1065599566767107 datelist: 42521}historic_list {id: 'A(short) 11B' startdate: 42521 numvaluelist: 0.0038113334533441123 datelist: 42521 }historic_list {id: 'B(long) 11C' startdate: 42521 numvaluelist: 20.061623176440904 datelist: 42521}time_statistics {job_id: '' portfolio_id: '112341'} UrlPairList {}]"

>>> line_re = re.compile(r'\{[^\}]+\}')
>>> records = line_re.findall(raw)

>>> record_re = re.compile(
...     r"""
...             id:\s*\'(?P<id>[^']+)\'\s*
...             startdate:\s*(?P<startdate>\d+)\s*
...             numvaluelist:\s*(?P<numvaluelist>[\d\.]+)\s*
...             datelist:\s*(?P<datelist>\d+)\s*
...             """,
...     re.X
...     )

>>> record_parsed = record_re.search(records[0])
>>> record_parsed.groupdict()
{'startdate': '42521', 'numvaluelist': '0.1065599566767107', 'datelist': '42521', 'id': 'A(long) 11A'}

>>> for record in records:
...     record_parsed = record_re.search(record)
...     # Here is where you would do whatever you need with the fields.
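
The loop above can be filled in to build the structure the question asks for. What follows is a minimal sketch, assuming Python 3 and that only records matching every field should be kept. The helper name `parse_records`, its `keep` parameter, and the shortened sample string are illustrative and not part of the original answer. Note that `line_re` also matches the `time_statistics {...}` block, for which `record_re.search` returns `None`, so the sketch skips such records.

```python
import re

# Same patterns as in the answer above.
line_re = re.compile(r"\{[^}]+\}")
record_re = re.compile(
    r"""
    id:\s*'(?P<id>[^']+)'\s*
    startdate:\s*(?P<startdate>\d+)\s*
    numvaluelist:\s*(?P<numvaluelist>[\d.]+)\s*
    datelist:\s*(?P<datelist>\d+)\s*
    """,
    re.X,
)

def parse_records(raw, keep=("id", "startdate", "numvaluelist")):
    """Hypothetical helper: return {"data": [...]} with one dict per matching record."""
    data = []
    for record in line_re.findall(raw):
        match = record_re.search(record)
        if match is None:
            # e.g. the time_statistics {...} block lacks the expected fields
            continue
        fields = match.groupdict()
        data.append({key: fields[key] for key in keep})
    return {"data": data}

# Shortened sample in the same format as the question's string.
sample = (
    "[historic_list {id: 'A(long) 11A' startdate: 42521 "
    "numvaluelist: 0.1065599566767107 datelist: 42521}"
    "historic_list {id: 'A(short) 11B' startdate: 42521 "
    "numvaluelist: 0.0038113334533441123 datelist: 42521 }"
    "time_statistics {job_id: '' portfolio_id: '112341'} UrlPairList {}]"
)
result = parse_records(sample)
print(result)
```

All values stay as strings here, which matches the expected output in the question; `json.dumps(result, indent=2)` would print it in that layout.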

To parse the sub-elements of the id, e.g.:

>>> record_re2 = re.compile(
...     r"""
...             id:\s*\'
...                     (?P<id_letter>[A-Z]+)
...                     \(
...                             (?P<id_type>[^\)]+)
...                             \)\s*
...                     (?P<id_codenum>\d+)
...                     (?P<id_codeletter>[A-Z]+)
...                     \'\s*
...             startdate:\s*(?P<startdate>\d+)\s*
...             numvaluelist:\s*(?P<numvaluelist>[\d\.]+)\s*
...             datelist:\s*(?P<datelist>\d+)\s*
...             """,
...     re.X
...     )

>>> record2_parsed = record_re2.search(line_re.findall(raw)[0])
>>> record2_parsed.groupdict()
{'startdate': '42521', 'numvaluelist': '0.1065599566767107', 'id_letter': 'A', 'id_codeletter': 'A', 'datelist': '42521', 'id_type': 'long', 'id_codenum': '11'}
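
Every captured group is a string, so the second part of the question (further parsing `id` and `numvaluelist`) may also call for type conversion. The following is a sketch only, assuming Python 3, that `startdate` is an integer, and that `numvaluelist` is a float. The names `id_re` and `split_record` are hypothetical; the id pattern mirrors the one in `record_re2` but is applied to an already-parsed record.

```python
import re

# Same id layout as in record_re2: letters, "(type)", digits, letters.
id_re = re.compile(
    r"(?P<id_letter>[A-Z]+)\((?P<id_type>[^)]+)\)\s*"
    r"(?P<id_codenum>\d+)(?P<id_codeletter>[A-Z]+)$"
)

def split_record(fields):
    """Hypothetical helper: split the id and convert numeric strings."""
    parsed = dict(fields)
    match = id_re.match(fields["id"])
    if match is not None:
        # Add id_letter, id_type, id_codenum and id_codeletter.
        parsed.update(match.groupdict())
    parsed["startdate"] = int(fields["startdate"])
    parsed["numvaluelist"] = float(fields["numvaluelist"])
    return parsed

record = {
    "id": "A(short) 11B",
    "startdate": "42521",
    "numvaluelist": "0.0038113334533441123",
}
parsed = split_record(record)
print(parsed["id_letter"], parsed["id_type"], parsed["id_codenum"], parsed["id_codeletter"])
print(parsed["startdate"], parsed["numvaluelist"])
```

Keeping the id split separate from the record regex means an id that does not fit the assumed layout leaves the record intact instead of dropping it.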
