
Convert a nested JSON response from an API to a pandas DataFrame using json_normalize

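I have been trying to convert a JSON response from an API into a complete pandas DataFrame. I tried json_normalize to do this, but unfortunately I could only get it to split one level.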

import pandas as pd

response = {
    "data": 
    {
        "result": [
            {
                "agent_info": {
                        "agent_id": "q321", 
                        "instances": [
                            {
                                "last_run_end": "2023-01-19T15:15:55.491Z", 
                                "mode": "Advanced", 
                                "is_enabled": "True", 
                                "run_duration": "00:00:00:031", 
                                "name": "john", 
                                "status": "Running", 
                                "node_id": "wq"
                            }, 
                            {
                                "last_run_end": "2023-01-19T15:15:55.491Z", 
                                "mode": "Advanced", 
                                "is_enabled": "True", 
                                "run_duration": "00:00:00:031", 
                                "name": "chris", 
                                "status": "Running", 
                                "node_id": "wq"
                            }
                        ]
                    }
                }, 
                {
                "agent_info": {
                        "agent_id": "q123", 
                        "instances": [
                            {
                                "last_run_end": "2023-01-19T15:15:55.491Z", 
                                "mode": "Advanced", 
                                "is_enabled": "True", 
                                "run_duration": "00:00:00:031", 
                                "name": "john", 
                                "status": "Running", 
                                "node_id": "wq"
                            }
                        ]
                    }
                }
            ]
        },
    "status": 200, 
    "servedBy": "ABC"
}
df = pd.json_normalize(response, record_path=["data", "result"], meta=["status", "servedBy"])
df

Result:

agent_info.agent_id                               agent_info.instances  \
0                q321  [{'last_run_end': '2023-01-19T15:15:55.491Z', ...   
1                q123  [{'last_run_end': '2023-01-19T15:15:55.491Z', ...   

  status servedBy  
0    200      ABC  
1    200      ABC  

What I want is for each key to become its own column. Any help or pointers?

You can first explode "agent_info.instances", then build a DataFrame from the exploded values and concatenate it with the other columns:

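df = pd.json_normalize(response, record_path=["data", "result"], meta=["status", "servedBy"])
# explode() gives one row per instance; each cell then holds a single dict
df = df.explode('agent_info.instances').reset_index(drop=True)
# Turn the list of instance dicts into its own DataFrame, one column per key
nested_val = pd.DataFrame(df['agent_info.instances'].values.tolist())
print(pd.concat([df.drop('agent_info.instances', axis=1), nested_val], axis=1))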

Output:

  agent_info.agent_id status servedBy              last_run_end      mode is_enabled  run_duration   name   status node_id
0                q321    200      ABC  2023-01-19T15:15:55.491Z  Advanced       True  00:00:00:031   john  Running      wq
1                q321    200      ABC  2023-01-19T15:15:55.491Z  Advanced       True  00:00:00:031  chris  Running      wq
2                q123    200      ABC  2023-01-19T15:15:55.491Z  Advanced       True  00:00:00:031   john  Running      wq
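
Note that the combined frame above contains two columns both named status: the top-level response status and the per-instance status. One way to keep them apart is to prefix the expanded instance columns before concatenating. The following is only a minimal sketch: it uses a trimmed-down copy of the response from the question, and the "instance." prefix is a name chosen here for illustration, not something from the original answer.

```python
import pandas as pd

# Trimmed-down copy of the response from the question (fewer fields per instance).
response = {
    "data": {
        "result": [
            {"agent_info": {"agent_id": "q321", "instances": [
                {"name": "john", "status": "Running", "node_id": "wq"},
                {"name": "chris", "status": "Running", "node_id": "wq"},
            ]}},
            {"agent_info": {"agent_id": "q123", "instances": [
                {"name": "john", "status": "Running", "node_id": "wq"},
            ]}},
        ]
    },
    "status": 200,
    "servedBy": "ABC",
}

df = pd.json_normalize(response, record_path=["data", "result"], meta=["status", "servedBy"])
df = df.explode("agent_info.instances").reset_index(drop=True)

# Expand each instance dict into columns. The "instance." prefix is an
# illustrative choice that keeps the nested "status" apart from the top-level one.
nested = pd.DataFrame(df["agent_info.instances"].tolist()).add_prefix("instance.")

flat = pd.concat([df.drop(columns="agent_info.instances"), nested], axis=1)
print(flat.columns.tolist())
```

With unique column names, selecting flat["status"] returns a single Series rather than a two-column frame, which tends to be easier to work with downstream.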

Does this work for you?

df = pd.json_normalize(
    data=response,
    record_path=["data", "result", "agent_info", "instances"],
    meta=["status", "servedBy", ["data", "result", "agent_info", "agent_id"]],
    record_prefix="agent.instance.",
)
print(df.T)

Output (transposed to fit the screen better):

[image: transposed DataFrame output]
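
As a rough sketch of what this call yields, the snippet below runs it on a trimmed-down copy of the response and prints the resulting column names. Nested meta paths are joined with "." to form the column name, so the agent id ends up in a long column that can be renamed afterwards; the shorter name "agent_id" is an illustrative choice, not part of the original answer. The record_prefix appears to be needed here because each instance carries its own status key, and json_normalize raises a ValueError when a meta name collides with a record column.

```python
import pandas as pd

# Trimmed-down copy of the response from the question (fewer fields per instance).
response = {
    "data": {
        "result": [
            {"agent_info": {"agent_id": "q321", "instances": [
                {"name": "john", "status": "Running"},
                {"name": "chris", "status": "Running"},
            ]}},
            {"agent_info": {"agent_id": "q123", "instances": [
                {"name": "john", "status": "Running"},
            ]}},
        ]
    },
    "status": 200,
    "servedBy": "ABC",
}

df = pd.json_normalize(
    data=response,
    record_path=["data", "result", "agent_info", "instances"],
    meta=["status", "servedBy", ["data", "result", "agent_info", "agent_id"]],
    record_prefix="agent.instance.",
)

# Shorten the dotted meta column name (the new name is an illustrative choice).
df = df.rename(columns={"data.result.agent_info.agent_id": "agent_id"})
print(df.columns.tolist())

# Without record_prefix, the per-instance "status" collides with the
# top-level "status" meta field.
try:
    pd.json_normalize(
        data=response,
        record_path=["data", "result", "agent_info", "instances"],
        meta=["status"],
    )
    collided = False
except ValueError:
    collided = True
print(collided)
```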

