
Import nested JSON into a pandas DataFrame

JSON string:

{
"PurchaseId": "Pur-001",
"Orders": [{
    "id": "154",
    "isOnline": false,
    "Store_location": {
        "Order-Date": "2019-06-04T07:35:00"
    },
    "Store_Network": [{
        "Network_Domain": "Food_Processing"
    }]
}],
"Sales": [{
    "id": "1856",
    "SalesLoads": [
        1000,
        3000,
        5000
    ],
    "Network": [{
        "id": "London_Store",
        "history": [
            0,
            1,
            2,
            0,
            0,
            0,
            0,
            0
        ],
        "Leads": {
            "From": "Mgmt-Dept",
            "time": "34hrs"
        }
    }]
}]

}

Expected DataFrame: [image not included]

My code so far:

import pandas.io.json as pd_json
data = pd_json.loads(json_str)
df=pd_json.json_normalize(data, record_path='loads')

I've tried json_normalize but was unable to load this JSON string into a DataFrame. Is it possible to do this with json_normalize, or is there another, better-optimized solution?

This is pretty long, but it gets the job done. Hopefully someone answers with a better, less verbose solution.

a = {
"PurchaseId": "Pur-001",
"Orders": [{
    "id": "154",
    "isOnline": False,
    "Store_location": {
        "Order-Date": "2019-06-04T07:35:00"
    },
    "Store_Network": [{
    "Network_Domain": "Food_Processing"
}]
}],
"Sales": [{
    "id": "1856",
    "SalesLoads": [
    1000,
    3000,
    5000
],
"Network": [{
    "id": "London_Store",
    "history": [
        0,
        1,
        2,
        0,
        0,
        0,
        0,
        0
    ],
    "Leads": {
        "From": "Mgmt-Dept",
        "time": "34hrs"
    }
}]
}]}

import pandas as pd

# One row: the scalar PurchaseId is broadcast against the one-element lists
b = pd.DataFrame.from_dict(a)


b = (b.assign(Orders_id = b.Orders[0]['id'],
              Orders_isOnline = b.Orders[0]['isOnline'],
              Orders_Store_Location_Number = pd.to_datetime(b.Orders[0]['Store_location']['Order-Date'].split('T')[0])
                                               .strftime('%m/%d/%Y'),
              Orders_Store_Network_Domain = b.Orders[0]['Store_Network'][0]['Network_Domain'],
              Sales_id = b.Sales[0]['id'],
              Sales_Load = [b.Sales[0]['SalesLoads']],
              Sales_Network_id = b.Sales[0]['Network'][0]['id'],
              Sales_Network_history = [b.Sales[0]['Network'][0]['history']],
              Sales_Leads_from = b.Sales[0]['Network'][0]['Leads']['From'],
              Sales_Lead_Time = b.Sales[0]['Network'][0]['Leads']['time']                                                    
            )
      .drop(['Orders','Sales'],axis=1)
     )

b    
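A less verbose variant of the same idea is a small recursive helper that builds the column names from the key path instead of spelling out every field. This is only a sketch: `flatten` and `purchase` are names made up for illustration, it assumes (as in the sample) that every list of objects holds exactly one element, and the generated column names (for example `Sales_Network_Leads_From`) differ from the hand-written ones above.

```python
# Sample data from the question, as a Python dict
purchase = {
    "PurchaseId": "Pur-001",
    "Orders": [{
        "id": "154",
        "isOnline": False,
        "Store_location": {"Order-Date": "2019-06-04T07:35:00"},
        "Store_Network": [{"Network_Domain": "Food_Processing"}],
    }],
    "Sales": [{
        "id": "1856",
        "SalesLoads": [1000, 3000, 5000],
        "Network": [{
            "id": "London_Store",
            "history": [0, 1, 2, 0, 0, 0, 0, 0],
            "Leads": {"From": "Mgmt-Dept", "time": "34hrs"},
        }],
    }],
}


def flatten(obj, prefix=""):
    """Flatten nested dicts into one level, joining key names with '_'."""
    flat = {}
    for key, value in obj.items():
        name = prefix + key
        if isinstance(value, dict):
            # Nested object: recurse with an extended prefix
            flat.update(flatten(value, name + "_"))
        elif isinstance(value, list) and value and all(isinstance(v, dict) for v in value):
            # Assumption: a list of objects has exactly one element
            if len(value) != 1:
                raise ValueError(f"{name} has {len(value)} elements, expected 1")
            flat.update(flatten(value[0], name + "_"))
        else:
            # Scalars and lists of scalars are kept as a single cell value
            flat[name] = value
    return flat


flat = flatten(purchase)
```

Passing `[flat]` to `pd.DataFrame` then gives a one-row frame with one column per flattened key; lists such as `SalesLoads` stay in a single cell, as in the answer above.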

You can import the string directly into a DataFrame; to do that, you first have to convert the string to a dictionary. Simply import json and convert:

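import json
from pandas import json_normalize  # pandas 1.0+

json_str = json.dumps(json_data)    # json_data: the nested dict
json1_data = json.loads(json_str)   # parse the string back into a dict
df = json_normalize(json1_data)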
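Note that calling json_normalize on the top-level dictionary alone leaves `Orders` and `Sales` as single columns holding lists. One way to expand them is with the `record_path` and `meta` parameters, sketched below. It assumes pandas 1.0 or later (where `pd.json_normalize` is available) and one element per nested list; the prefixes and the names `orders`, `network`, and `df` are illustrative choices, and `SalesLoads` is left out to keep the sketch short.

```python
import json

import pandas as pd

json_str = """
{
  "PurchaseId": "Pur-001",
  "Orders": [{
    "id": "154",
    "isOnline": false,
    "Store_location": {"Order-Date": "2019-06-04T07:35:00"},
    "Store_Network": [{"Network_Domain": "Food_Processing"}]
  }],
  "Sales": [{
    "id": "1856",
    "SalesLoads": [1000, 3000, 5000],
    "Network": [{
      "id": "London_Store",
      "history": [0, 1, 2, 0, 0, 0, 0, 0],
      "Leads": {"From": "Mgmt-Dept", "time": "34hrs"}
    }]
  }]
}
"""
data = json.loads(json_str)

# Top level only: Orders and Sales remain list-valued columns
top = pd.json_normalize(data)

# One row per Store_Network entry, carrying selected order fields as metadata
orders = pd.json_normalize(
    data["Orders"],
    record_path="Store_Network",
    meta=["id", "isOnline", ["Store_location", "Order-Date"]],
    meta_prefix="Orders_",
)

# One row per Network entry; prefixes keep the two "id" fields apart
network = pd.json_normalize(
    data["Sales"],
    record_path="Network",
    meta=["id"],
    record_prefix="Network_",
    meta_prefix="Sales_",
)

# Place the pieces side by side and add the top-level scalar
df = pd.concat([orders, network], axis=1)
df.insert(0, "PurchaseId", data["PurchaseId"])
```

Nested dictionaries inside the records are flattened with a dot separator, so the lead fields end up in columns named `Network_Leads.From` and `Network_Leads.time`.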

