
Selecting the records with corresponding string separately within JSON

I want to select the records for BBBBB@4##, AAAAA@5##, and AAAAA@6## separately.

JSON

x = {'d': 'BBBBB@4##{"pp-0": "1000", "pp-1": "1001", "pp-2": "1002", "pp-3": "1003", "pp-4": "1004", "pp-5": "1005", "pp-6": "1006", "pp-7": "1007", "pp-8": "1008", "pp-9": "1009", "pp-10": "1010", "pp-11": "1011", "pp-12": "1012", "pp-13": "1013", "pp-14": "1014", "pp-17": "1015", "pp-27": "1016"}AAAAA@5##{"pp-0": "1000", "pp-1": "1001", "pp-2": "1002", "pp-3": "1003", "pp-4": "1004", "pp-5": "1005", "pp-6": "1006", "pp-7": "1007", "pp-8": "1008", "pp-9": "1009", "pp-10": "1010", "pp-11": "1011", "pp-12": "1012", "pp-13": "1013", "pp-14": "1014", "pp-17": "1015", "pp-27": "1016"}AAAAA@6##{"pp-0": "1000", "pp-1": "1001", "pp-2": "1002", "pp-3": "1003", "pp-4": "1004", "pp-5": "1005", "pp-6": "1006", "pp-7": "1007", "pp-8": "1008", "pp-9": "1009", "pp-10": "1010", "pp-11": "1011", "pp-12": "1012", "pp-13": "1013", "pp-14": "1014", "pp-17": "1015", "pp-27": "1016"}'}

Python code

x['d']['AAAAA@5##']

Expected result

{"pp-0": "1000", "pp-1": "1001", "pp-2": "1002", "pp-3": "1003", "pp-4": "1004", "pp-5": "1005", "pp-6": "1006", "pp-7": "1007", "pp-8": "1008", "pp-9": "1009", "pp-10": "1010", "pp-11": "1011", "pp-12": "1012", "pp-13": "1013", "pp-14": "1014", "pp-17": "1015", "pp-27": "1016"}

I'd suggest a simple regex to extract each AAAAA@X##{...} block, then use the matches to build a new dictionary:

import json
import re

x = {'d': 'AAAAA@4##{"pp-0": "1000", "pp-1": "1001", "pp-2": "1002", "pp-3": "1003", "pp-4": "1004", '
          '"pp-5": "1005", "pp-6": "1006", "pp-7": "1007", "pp-8": "1008", "pp-9": "1009", "pp-10": "1010", '
          '"pp-11": "1011", "pp-12": "1012", "pp-13": "1013", "pp-14": "1014", "pp-17": "1015", "pp-27": "1016"}'
          'AAAAA@5##{"pp-0": "1000", "pp-1": "1001", "pp-2": "1002", "pp-3": "1003", "pp-4": "1004", '
          '"pp-5": "1005", "pp-6": "1006", "pp-7": "1007", "pp-8": "1008", "pp-9": "1009", "pp-10": "1010", '
          '"pp-11": "1011", "pp-12": "1012", "pp-13": "1013", "pp-14": "1014", "pp-17": "1015", "pp-27": "1016"}'
          'AAAAA@6##{"pp-0": "1000", "pp-1": "1001", "pp-2": "1002", "pp-3": "1003", "pp-4": "1004", '
          '"pp-5": "1005", "pp-6": "1006", "pp-7": "1007", "pp-8": "1008", "pp-9": "1009", "pp-10": "1010", '
          '"pp-11": "1011", "pp-12": "1012", "pp-13": "1013", "pp-14": "1014", "pp-17": "1015", "pp-27": "1016"}'}

result = {}
for key, val in re.findall(r"(AAAAA@\d+##)({.*?})", x['d']):
    result[key] = json.loads(val)

print(result.keys())  # dict_keys(['AAAAA@4##', 'AAAAA@5##', 'AAAAA@6##'])
print(result['AAAAA@4##'])  # {'pp-0': '1000', ...  'pp-27': '1016'}

If the identifier can take another form, change the regex. Here are a few examples:

  • ([A-Z]{5}@\d+##) : 5 uppercase letters instead of 5 A's
  • ([A-Za-z]{2,5}@\d+##) : 2 to 5 letters instead of 5 A's
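As a minimal sketch of the first variant, the snippet below applies the five-uppercase-letter pattern to a string shaped like the one in the question, so that the BBBBB@4## record is picked up too. The string is shortened to two "pp-*" keys per record for brevity (an assumption for illustration), and the names `raw` and `pattern` are hypothetical. It still assumes each record is a flat JSON object with no nested braces.

```python
import json
import re

# Shortened stand-in for x['d'] from the question (assumption: only two keys per record).
raw = (
    'BBBBB@4##{"pp-0": "1000", "pp-1": "1001"}'
    'AAAAA@5##{"pp-0": "1000", "pp-1": "1001"}'
    'AAAAA@6##{"pp-0": "1000", "pp-1": "1001"}'
)

# Five uppercase letters, "@", digits, "##", then a flat (non-nested) JSON object.
pattern = re.compile(r"([A-Z]{5}@\d+##)(\{.*?\})")

# findall returns (identifier, json_text) tuples because the pattern has two groups.
result = {key: json.loads(val) for key, val in pattern.findall(raw)}

print(list(result))         # ['BBBBB@4##', 'AAAAA@5##', 'AAAAA@6##']
print(result['BBBBB@4##'])  # {'pp-0': '1000', 'pp-1': '1001'}
```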

The value under the key 'd' is a string, and it is not even valid JSON.

The right answer is to fix the data format before trying to use it.

... but if you can't, you'll have to try to convert it to valid JSON like this (which is far from being bulletproof):

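>>> import json
>>> # Turn 'ID##{...}ID##{...}' into '{"ID##": {...}, "ID##": {...}}'; only works when every later ID starts with 'A'
>>> x['d'] = json.loads('{"' + x['d'].replace('##{', '##": {').replace('}A', '}, "A') + '}')
>>> x['d']['AAAAA@5##']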
{'pp-0': '1000', 'pp-1': '1001', 'pp-2': '1002', 'pp-3': '1003', 'pp-4': '1004', 'pp-5': '1005', 'pp-6': '1006', 'pp-7': '1007', 'pp-8': '1008', 'pp-9': '1009', 'pp-10': '1010', 'pp-11': '1011', 'pp-12': '1012', 'pp-13': '1013', 'pp-14': '1014', 'pp-17': '1015', 'pp-27': '1016'}
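If the data format really can't be fixed, a somewhat less fragile alternative to string replacement is to let the JSON parser find the end of each object itself. The sketch below uses the standard library's `json.JSONDecoder.raw_decode`, which parses one JSON value starting at a given index and returns the value together with the index where it ended. The function name `split_records` and the identifier pattern `[A-Za-z]+@\d+##` are assumptions for illustration, not part of the original question.

```python
import json
import re


def split_records(raw):
    """Map each 'NAME@N##' identifier to the JSON object that follows it."""
    decoder = json.JSONDecoder()
    marker = re.compile(r"[A-Za-z]+@\d+##")  # assumed identifier shape
    records = {}
    pos = 0
    while pos < len(raw):
        match = marker.match(raw, pos)
        if match is None:
            raise ValueError(f"expected an identifier at offset {pos}")
        # Parse exactly one JSON value right after the identifier;
        # raw_decode returns (value, index just past the value).
        value, pos = decoder.raw_decode(raw, match.end())
        records[match.group()] = value
    return records


# Shortened sample (assumption); the second record contains a nested object
# and a "}" inside a string, which a non-greedy regex would cut short.
sample = 'BBBBB@4##{"pp-0": "1000"}AAAAA@5##{"pp-0": {"nested": "}"}}'
records = split_records(sample)
print(records['AAAAA@5##'])  # {'pp-0': {'nested': '}'}}
```

Because the parser decides where each object ends, this does not depend on the identifiers starting with a particular letter, but it still assumes the string is strictly identifier, object, identifier, object with nothing in between.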
