Flatten Nested JSON in Python
I'm new to Python and I'm quite stuck: I've gone through multiple other Stack Overflow answers and other sites and still can't get this to work.
I have the below JSON coming out of an API connection:
{
  "results": [
    {
      "group": {
        "mediaType": "chat",
        "queueId": "67d9fb5e-26b2-4db5-b062-bbcfa8d2ca0d"
      },
      "data": [
        {
          "interval": "2021-01-14T13:12:19.000Z/2022-01-14T13:12:19.000Z",
          "metrics": [
            {
              "metric": "nOffered",
              "qualifier": null,
              "stats": {
                "max": null,
                "min": null,
                "count": 14,
                "count_negative": null,
                "count_positive": null,
                "sum": null,
                "current": null,
                "ratio": null,
                "numerator": null,
                "denominator": null,
                "target": null
              }
            }
          ],
          "views": null
        }
      ]
    }
  ]
}
and what I'm mainly looking to get out of it is (or at least something close to):
| MediaType | QueueId | NOffered |
| --- | --- | --- |
| Chat | 67d9fb5e-26b2-4db5-b062-bbcfa8d2ca0d | 14 |
Is something like that possible? I've tried multiple things and I either get the whole of this out in one line or just get different errors.
The error you got indicates that you missed that some of your values are actually dictionaries nested inside arrays.
Assuming you want to flatten your JSON file to retrieve the following keys: `mediaType`, `queueId`, `count`.
These can be retrieved with the following sample code:
import json

# path_to_json_file should point at the saved API response
with open(path_to_json_file, 'r') as f:
    json_dict = json.load(f)

for result in json_dict.get("results"):
    media_type = result.get("group").get("mediaType")
    queue_id = result.get("group").get("queueId")
    # "count" sits inside the "stats" dictionary, not directly on the metric
    n_offered = result.get("data")[0].get("metrics")[0].get("stats").get("count")
If your `data` and `metrics` keys contain multiple entries, you will have to use a `for` loop to retrieve every `count` value accordingly.
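A minimal sketch of that loop, using a trimmed, hypothetical copy of the JSON above in place of the loaded file:

```python
# Hypothetical response trimmed to the fields used below; in practice
# json_dict comes from json.load as shown earlier.
json_dict = {
    "results": [
        {
            "group": {
                "mediaType": "chat",
                "queueId": "67d9fb5e-26b2-4db5-b062-bbcfa8d2ca0d",
            },
            "data": [
                {"metrics": [{"metric": "nOffered", "stats": {"count": 14}}]}
            ],
        }
    ]
}

rows = []
for result in json_dict.get("results"):
    group = result.get("group")
    for data in result.get("data"):          # one entry per interval
        for metric in data.get("metrics"):   # one entry per metric
            rows.append({
                "mediaType": group.get("mediaType"),
                "queueId": group.get("queueId"),
                metric.get("metric"): metric.get("stats").get("count"),
            })

print(rows)
```

Each metric becomes one flat row, so multiple intervals or metrics simply produce more entries in `rows`.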
Assuming that the format of the API response is always the same, have you considered hardcoding the extraction of the data you want?
This should work, with `response` defined as the API output:
response = {
    "results": [
        {
            "group": {
                "mediaType": "chat",
                "queueId": "67d9fb5e-26b2-4db5-b062-bbcfa8d2ca0d"
            },
            "data": [
                {
                    "interval": "2021-01-14T13:12:19.000Z/2022-01-14T13:12:19.000Z",
                    "metrics": [
                        {
                            "metric": "nOffered",
                            "qualifier": 'null',
                            "stats": {
                                "max": 'null',
                                "min": 'null',
                                "count": 14,
                                "count_negative": 'null',
                                "count_positive": 'null',
                                "sum": 'null',
                                "current": 'null',
                                "ratio": 'null',
                                "numerator": 'null',
                                "denominator": 'null',
                                "target": 'null'
                            }
                        }
                    ],
                    "views": 'null'
                }
            ]
        }
    ]
}
You can extract the results as follows:
results = response["results"][0]
{
    "mediaType": results["group"]["mediaType"],
    "queueId": results["group"]["queueId"],
    "nOffered": results["data"][0]["metrics"][0]["stats"]["count"]
}
which gives
{
    'mediaType': 'chat',
    'queueId': '67d9fb5e-26b2-4db5-b062-bbcfa8d2ca0d',
    'nOffered': 14
}
import pandas as pd

tree = {
    "results": [
        {
            "group": {
                "mediaType": "chat",
                "queueId": "67d9fb5e-26b2-4db5-b062-bbcfa8d2ca0d"
            },
            "data": [
                {
                    "interval": "2021-01-14T13:12:19.000Z/2022-01-14T13:12:19.000Z",
                    "metrics": [
                        {
                            "metric": "nOffered",
                            "qualifier": "null",
                            "stats": {
                                "max": "null",
                                "min": "null",
                                "count": 14,
                                "count_negative": "null",
                                "count_positive": "null",
                                "sum": "null",
                                "current": "null",
                                "ratio": "null",
                                "numerator": "null",
                                "denominator": "null",
                                "target": "null"
                            }
                        }
                    ],
                    "views": "null"
                }
            ]
        }
    ]
}

def traverse_parser_dfs(master_tree):
    """Depth-first traversal that flattens every leaf into a prefixed column."""
    flatten_tree_node = []

    def _process_leaves(tree, prefix="node", tree_node=None, update=True):
        if tree_node is None:  # avoid a shared mutable default argument
            tree_node = {}
        is_nested = False
        if isinstance(tree, dict):
            # First pass: scalar leaves and nested dicts.
            for k in tree.keys():
                if isinstance(tree[k], dict):
                    _process_leaves(tree[k], prefix=prefix + "_" + k,
                                    tree_node=tree_node, update=False)
                elif not isinstance(tree[k], list):
                    # Capture every scalar leaf (str, int, float, None), not
                    # just strings -- otherwise the int "count" is dropped.
                    tree_node[prefix + "_" + k] = tree[k]
            # Second pass: lists fan out into one row per element.
            for k in tree.keys():
                if isinstance(tree[k], list):
                    is_nested = True
                    for leave in tree[k]:
                        _process_leaves(leave, prefix=prefix + "_" + k,
                                        tree_node=tree_node.copy())
        if not is_nested and update:
            flatten_tree_node.append(tree_node)

    _process_leaves(master_tree)
    df = pd.DataFrame(flatten_tree_node)
    df.columns = df.columns.str.replace("@", "_")
    df.columns = df.columns.str.replace("#", "_")
    return df

print(traverse_parser_dfs(tree))

  node_results_group_mediaType  ...  node_results_data_metrics_stats_target
0                         chat  ...                                    null

[1 rows x 17 columns]