
Import nested JSON into a pandas DataFrame

JSON STR:

{
"PurchaseId": "Pur-001",
"Orders": [{
    "id": "154",
    "isOnline": false,
    "Store_location": {
        "Order-Date": "2019-06-04T07:35:00"
    },
    "Store_Network": [{
        "Network_Domain": "Food_Processing"
    }]
}],
"Sales": [{
    "id": "1856",
    "SalesLoads": [
        1000,
        3000,
        5000
    ],
    "Network": [{
        "id": "London_Store",
        "history": [
            0,
            1,
            2,
            0,
            0,
            0,
            0,
            0
        ],
        "Leads": {
            "From": "Mgmt-Dept",
            "time": "34hrs"
        }
    }]
}]

}

Expected DataFrame: [image: enter image description here]

My code so far:

import pandas.io.json as pd_json
data = pd_json.loads(json_str)
df=pd_json.json_normalize(data, record_path='loads')

I've tried json_normalize but I'm unable to load this JSON string into a DataFrame. Is it possible to do this with json_normalize, or is there a more optimized solution available?

This is pretty long, but it gets the job done. Hopefully someone answers with a better, less verbose solution.

import pandas as pd

# The JSON from the question, written as a Python dict (false -> False)
a = {
    "PurchaseId": "Pur-001",
    "Orders": [{
        "id": "154",
        "isOnline": False,
        "Store_location": {
            "Order-Date": "2019-06-04T07:35:00"
        },
        "Store_Network": [{
            "Network_Domain": "Food_Processing"
        }]
    }],
    "Sales": [{
        "id": "1856",
        "SalesLoads": [
            1000,
            3000,
            5000
        ],
        "Network": [{
            "id": "London_Store",
            "history": [
                0,
                1,
                2,
                0,
                0,
                0,
                0,
                0
            ],
            "Leads": {
                "From": "Mgmt-Dept",
                "time": "34hrs"
            }
        }]
    }]
}

# One row: PurchaseId is broadcast, Orders and Sales each hold one nested dict
b = pd.DataFrame.from_dict(a)


b = (b.assign(Orders_id = b.Orders[0]['id'],
              Orders_isOnline = b.Orders[0]['isOnline'],
              Orders_Store_Location_Number = pd.to_datetime(b.Orders[0]['Store_location']['Order-Date'].split('T')[0])
                                               .strftime('%m/%d/%Y'),
              Orders_Store_Network_Domain = b.Orders[0]['Store_Network'][0]['Network_Domain'],
              Sales_id = b.Sales[0]['id'],
              Sales_Load = [b.Sales[0]['SalesLoads']],
              Sales_Network_id = b.Sales[0]['Network'][0]['id'],
              Sales_Network_history = [b.Sales[0]['Network'][0]['history']],
              Sales_Leads_from = b.Sales[0]['Network'][0]['Leads']['From'],
              Sales_Lead_Time = b.Sales[0]['Network'][0]['Leads']['time']                                                    
            )
      .drop(['Orders','Sales'],axis=1)
     )

b    
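For comparison, here is a sketch of a shorter route to a similar one-row result using json_normalize. It assumes pandas 1.0 or later (where pd.json_normalize is available) and that every list of objects (Orders, Sales, Store_Network, Network) holds exactly one element, as in the question's data. The column names it produces are whatever the prefixes below generate, not necessarily the ones in the expected image.

```python
import json
import pandas as pd

# The JSON string from the question, compacted.
json_str = """
{"PurchaseId": "Pur-001",
 "Orders": [{"id": "154", "isOnline": false,
             "Store_location": {"Order-Date": "2019-06-04T07:35:00"},
             "Store_Network": [{"Network_Domain": "Food_Processing"}]}],
 "Sales": [{"id": "1856", "SalesLoads": [1000, 3000, 5000],
            "Network": [{"id": "London_Store",
                         "history": [0, 1, 2, 0, 0, 0, 0, 0],
                         "Leads": {"From": "Mgmt-Dept", "time": "34hrs"}}]}]}
"""
data = json.loads(json_str)

# Flatten each top-level list; nested dicts become underscore-joined columns,
# while lists (SalesLoads, Store_Network, Network) stay in a single cell.
orders = pd.json_normalize(data["Orders"], sep="_").add_prefix("Orders_")
sales = pd.json_normalize(data["Sales"], sep="_").add_prefix("Sales_")

# Lists of dicts need record_path to be expanded into their own columns.
store_network = pd.json_normalize(
    data["Orders"], record_path="Store_Network"
).add_prefix("Orders_Store_Network_")
network = pd.json_normalize(
    data["Sales"], record_path="Network", sep="_"
).add_prefix("Sales_Network_")

# Every piece has exactly one row (index 0), so a column-wise concat lines up.
df = pd.concat(
    [
        pd.DataFrame({"PurchaseId": [data["PurchaseId"]]}),
        orders.drop(columns="Orders_Store_Network"),
        store_network,
        sales.drop(columns="Sales_Network"),
        network,
    ],
    axis=1,
)
print(df.T)
```

If any of those lists held more than one element, the pieces would have different row counts and the concat would no longer align; a merge on an explicit key would be needed instead.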

You can import the string directly into a DataFrame; to do that, you first have to convert the string to a dictionary. Simply import json and convert:

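import json
from pandas import json_normalize  # pandas 1.0+; older versions: from pandas.io.json import json_normalize

json1_data = json.loads(json_str)  # json_str is the JSON string from the question
df = json_normalize(json1_data)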
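One caveat worth checking: called without extra arguments, json_normalize flattens nested dicts but leaves lists untouched, so for this data the Orders and Sales columns still hold lists. The sketch below (assuming pandas 1.0 or later) shows that behaviour and then uses record_path and meta to drill into Sales -> Network while carrying PurchaseId along.

```python
import json
import pandas as pd

# The JSON string from the question, compacted.
json_str = """
{"PurchaseId": "Pur-001",
 "Orders": [{"id": "154", "isOnline": false,
             "Store_location": {"Order-Date": "2019-06-04T07:35:00"},
             "Store_Network": [{"Network_Domain": "Food_Processing"}]}],
 "Sales": [{"id": "1856", "SalesLoads": [1000, 3000, 5000],
            "Network": [{"id": "London_Store",
                         "history": [0, 1, 2, 0, 0, 0, 0, 0],
                         "Leads": {"From": "Mgmt-Dept", "time": "34hrs"}}]}]}
"""
data = json.loads(json_str)

# Plain call: one row, but Orders and Sales remain lists inside single cells.
flat = pd.json_normalize(data)
print(flat.columns.tolist())

# record_path walks into the nested lists; meta copies top-level fields
# onto every extracted record.
network = pd.json_normalize(
    data,
    record_path=["Sales", "Network"],
    meta=["PurchaseId"],
    sep="_",
)
print(network.T)
```

Each list of objects needs its own record_path call like this, and the results then have to be joined back together.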
