Multiline raw string to JSON using regex
I have a multiline raw string and want to extract a few fields from it using regular expressions in Python.
Input:
raw_input = """this api generates something like this Source: 0
It has been looked into as a random string
event: tomorrow
Type 0, type: verified
sector_type: premium sector
mailing_addr: india
physical_address_mask: 0x00003fffffffffc0
serialnum: 3 debit_cards: 1 mod: 0 ranking: 0 devtype: 17 row: 61504 column: 728
err_tpe: 2, multiline ECC
true location: _Nt1_Ch4_Dull0 DOM_K1_"""
Every raw string follows the same format as the one above: it starts with "this api" and ends with "true location".
Expected output:
dict = {
    'event': 'tomorrow',
    'sector_type': 'premium sector',
    'mod': '0',
    'ranking': '0',
}
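I am struggling to find a way to parse this kind of multiline string and convert it into something structured, such as JSON, using regular expressions in Python. Can someone help me with how to achieve this?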
I have written a regular expression that is specific to the multiline string you shared and the format you described.
>>> str_val = '''this api generates something like this Source: 0
... It has been looked into as a random string
... event: tomorrow
... Type 0, type: verified
... sector_type: premium sector
... mailing_addr: india
... physical_address_mask: 0x00003fffffffffc0
... serialnum: 3 debit_cards: 1 mod: 0 ranking: 0 devtype: 17 row: 61504 column: 728
... err_tpe: 2, multiline ECC
... true location: _Nt1_Ch4_Dull0 DOM_K1_'''
>>> pattern = r'this api generates something like this Source: 0\nIt has been looked into as a random string[\s\S]*(event: (\w*)).*[\s\S]*(sector_type: (.*)).*[\s\S]*(mod: (\w*)).*(ranking: (\w*)).*[\s\S]*true location'
>>> import re
>>> re.findall(pattern, str_val)
[('event: tomorrow', 'tomorrow', 'sector_type: premium sector', 'premium sector', 'mod: 0', '0', 'ranking: 0', '0')]
>>> result = re.findall(pattern, str_val)
>>> result_dict = {'event': result[0][1], 'sector_type': result[0][3], 'mod': result[0][5], 'ranking': result[0][7]}
>>> result_dict
{'event': 'tomorrow', 'sector_type': 'premium sector', 'mod': '0', 'ranking': '0'}
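The single pattern above hard-codes the first two lines and depends on the fields appearing in a fixed order. As a possible alternative, here is a minimal sketch that searches for each field independently and then serializes the result with `json.dumps`, since the question asks for JSON. The names `FIELD_PATTERNS` and `parse_record` are hypothetical helpers introduced for illustration, and the sketch assumes `event` and `sector_type` each occupy a whole line while `mod` and `ranking` are single tokens.

```python
import json
import re

raw_input = """this api generates something like this Source: 0
It has been looked into as a random string
event: tomorrow
Type 0, type: verified
sector_type: premium sector
mailing_addr: india
physical_address_mask: 0x00003fffffffffc0
serialnum: 3 debit_cards: 1 mod: 0 ranking: 0 devtype: 17 row: 61504 column: 728
err_tpe: 2, multiline ECC
true location: _Nt1_Ch4_Dull0 DOM_K1_"""

# One pattern per wanted key (assumed layout, see the note above).
FIELD_PATTERNS = {
    "event": r"^event:[ \t]*(.+)$",              # value is the rest of the line
    "sector_type": r"^sector_type:[ \t]*(.+)$",  # value may contain spaces
    "mod": r"\bmod:[ \t]*(\w+)",                 # single token in a shared line
    "ranking": r"\branking:[ \t]*(\w+)",
}


def parse_record(text):
    """Return a dict holding every field from FIELD_PATTERNS found in text."""
    result = {}
    for key, pattern in FIELD_PATTERNS.items():
        # re.MULTILINE lets ^ and $ match at each line boundary.
        match = re.search(pattern, text, flags=re.MULTILINE)
        if match:
            result[key] = match.group(1).strip()
    return result


result_dict = parse_record(raw_input)
json_text = json.dumps(result_dict)
print(json_text)
```

With this layout, a field that is missing from the input is simply left out of the dictionary instead of making the whole match fail.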
You can try the pattern out at the related Regexr link: https://regexr.com/67tft
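Since the question says every record starts with "this api" and ends with "true location", the same idea might be extended to text that holds several records. The following is only a sketch under that assumption: the two-record `log_text` is made-up sample data built from the question's input, and `RECORD_RE` and `FIELD_RE` are hypothetical names.

```python
import json
import re

# Template based on the input from the question; {event} and {mod} are
# placeholders used only to build two sample records.
RECORD = """this api generates something like this Source: 0
It has been looked into as a random string
event: {event}
Type 0, type: verified
sector_type: premium sector
mailing_addr: india
physical_address_mask: 0x00003fffffffffc0
serialnum: 3 debit_cards: 1 mod: {mod} ranking: 0 devtype: 17 row: 61504 column: 728
err_tpe: 2, multiline ECC
true location: _Nt1_Ch4_Dull0 DOM_K1_
"""
log_text = RECORD.format(event="tomorrow", mod="0") + RECORD.format(event="today", mod="1")

# Non-greedy match from "this api" through the end of the "true location" line.
RECORD_RE = re.compile(r"this api.*?true location[^\n]*", re.DOTALL)

# A wanted key, then its value: the value stops before the next "key:" on the
# same line, or at the end of the line if no further key follows.
FIELD_RE = re.compile(
    r"\b(event|sector_type|mod|ranking):[ \t]*(.*?)(?=[ \t]+\w+:|[ \t]*$)",
    re.MULTILINE,
)

# findall returns (key, value) tuples, which dict() accepts directly.
records = [dict(FIELD_RE.findall(block)) for block in RECORD_RE.findall(log_text)]
json_text = json.dumps(records, indent=2)
print(json_text)
```

The non-greedy `.*?` keeps each match inside a single record; a greedy `.*` with `re.DOTALL` would instead run from the first "this api" to the last "true location".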