
Multiline raw string to JSON using regex

I have a multiline raw string and want to extract some meaningful fields from it using a regular expression in Python.

Input:

raw_input = """this api generates something like this Source: 0
It has been looked into as a random string
event: tomorrow
 Type 0, type: verified
  sector_type: premium sector
  mailing_addr: india
  physical_address_mask: 0x00003fffffffffc0
  serialnum: 3 debit_cards: 1 mod: 0 ranking: 0 devtype: 17 row: 61504 column: 728 
  err_tpe: 2, multiline ECC
  true location: _Nt1_Ch4_Dull0 DOM_K1_"""

The raw string always follows the same format: it starts with "this api" and ends with "true location".

Expected output:

dict = {
    'event': 'tomorrow',
    'sector_type': 'premium sector',
    'mod': '0',
    'ranking': '0',
}

I am struggling to find a way to parse this kind of multiline string with a regex in Python and turn it into something structured, such as JSON. Can someone help me achieve this?

I have written a regex that is specific to the multiline string and the format you shared.

>>> str_val = '''this api generates something like this Source: 0
... It has been looked into as a random string
... event: tomorrow
...  Type 0, type: verified
...   sector_type: premium sector
...   mailing_addr: india
...   physical_address_mask: 0x00003fffffffffc0
...   serialnum: 3 debit_cards: 1 mod: 0 ranking: 0 devtype: 17 row: 61504 column: 728 
...   err_tpe: 2, multiline ECC
...   true location: _Nt1_Ch4_Dull0 DOM_K1_'''
>>> pattern = r'this api generates something like this Source: 0\nIt has been looked into as a random string[\s\S]*(event: (\w*)).*[\s\S]*(sector_type: (.*)).*[\s\S]*(mod: (\w*)).*(ranking: (\w*)).*[\s\S]*true location'
>>> import re
>>> re.findall(pattern, str_val)
[('event: tomorrow', 'tomorrow', 'sector_type: premium sector', 'premium sector', 'mod: 0', '0', 'ranking: 0', '0')]
>>> result = re.findall(pattern, str_val)
>>> result_dict = {'event': result[0][1], 'sector_type': result[0][3], 'mod': result[0][5], 'ranking':  result[0][7]}
>>> result_dict
{'event': 'tomorrow', 'sector_type': 'premium sector', 'mod': '0', 'ranking': '0'}

You can try it out at the corresponding RegExr link: https://regexr.com/67tft
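Since the pattern above hard-codes the first two lines of the record and produces a dict rather than JSON, a looser variant may be easier to maintain. The following is only a sketch under a few assumptions: that each record runs from "this api" to the end of its "true location" line, that every field is written as `key: value`, and that a value ends either at the next `key:` token or at the end of the line. The names `WANTED_KEYS`, `RECORD_RE`, `extract_fields`, and `parse_records` are hypothetical helpers, not part of the original answer.

```python
import json
import re

# Keys to pull out of each record, taken from the expected output in the question.
WANTED_KEYS = ("event", "sector_type", "mod", "ranking")

# Assumption: one record spans from "this api" to the end of the "true location" line.
RECORD_RE = re.compile(r"this api.*?true location[^\n]*", re.DOTALL)


def extract_fields(record, keys=WANTED_KEYS):
    """Return {key: value} for every wanted key found in a single record."""
    fields = {}
    for key in keys:
        # (?<!\S) requires the key to start a token, so "type" cannot match
        # inside "sector_type". The value is captured lazily up to the next
        # "name:" token on the same line, or up to the end of the line.
        pattern = rf"(?<!\S){re.escape(key)}:[ \t]*(.*?)(?=[ \t]+\w+:|[ \t]*$)"
        match = re.search(pattern, record, re.MULTILINE)
        if match:
            fields[key] = match.group(1)
    return fields


def parse_records(text, keys=WANTED_KEYS):
    """Split the text into records and extract the wanted fields from each."""
    return [extract_fields(m.group(0), keys) for m in RECORD_RE.finditer(text)]


sample = """this api generates something like this Source: 0
It has been looked into as a random string
event: tomorrow
 Type 0, type: verified
  sector_type: premium sector
  mailing_addr: india
  physical_address_mask: 0x00003fffffffffc0
  serialnum: 3 debit_cards: 1 mod: 0 ranking: 0 devtype: 17 row: 61504 column: 728
  err_tpe: 2, multiline ECC
  true location: _Nt1_Ch4_Dull0 DOM_K1_"""

records = parse_records(sample)
print(json.dumps(records[0]))
# {"event": "tomorrow", "sector_type": "premium sector", "mod": "0", "ranking": "0"}
```

Because the record boundaries are matched separately from the fields, the same function should also handle text that contains several records back to back, and adding another field only means adding its name to `WANTED_KEYS`. Values stay strings, as in the expected output; convert them with `int()` where needed.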
