
Deserialize data from nested JSON using Python and Pandas

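I have time series data in nested JSON that I am struggling to get into a flat dataframe.

Input data

The data is here: https://corona.lmao.ninja/v2/historical

Expected output

A flat Pandas dataframe: country | date | cases | deaths | recovered

What I have tried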

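import pandas as pd
import requests

# Fetch the historical data; r.json() parses the response body into Python objects
r = requests.get('https://corona.lmao.ninja/v2/historical')
r.raise_for_status()  # stop early on an HTTP error instead of parsing an error page
json_data = r.json()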

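Now, I can do df = pd.json_normalize(json_data, max_level=1), but that leaves me with embedded dicts (one per metric). I can also do df = pd.json_normalize(json_data), but that just creates a new column for every date, which is not sustainable as the time series grows.

There must be an elegant way to do this. The last resort is to write a Python loop.

Here is a subset of the data for Afghanistan (the first entry in the JSON data):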

content = [{"country":"Afghanistan","province":None,"timeline":{"cases":{"3/13/20":7,"3/14/20":11,"3/15/20":16,"3/16/20":21,"3/17/20":22,"3/18/20":22,"3/19/20":22,"3/20/20":24,"3/21/20":24,"3/22/20":40,"3/23/20":40,"3/24/20":74,"3/25/20":84,"3/26/20":94,"3/27/20":110,"3/28/20":110,"3/29/20":120,"3/30/20":170,"3/31/20":174,"4/1/20":237,"4/2/20":273,"4/3/20":281,"4/4/20":299,"4/5/20":349,"4/6/20":367,"4/7/20":423,"4/8/20":444,"4/9/20":484,"4/10/20":521,"4/11/20":555},"deaths":{"3/13/20":0,"3/14/20":0,"3/15/20":0,"3/16/20":0,"3/17/20":0,"3/18/20":0,"3/19/20":0,"3/20/20":0,"3/21/20":0,"3/22/20":1,"3/23/20":1,"3/24/20":1,"3/25/20":2,"3/26/20":4,"3/27/20":4,"3/28/20":4,"3/29/20":4,"3/30/20":4,"3/31/20":4,"4/1/20":4,"4/2/20":6,"4/3/20":6,"4/4/20":7,"4/5/20":7,"4/6/20":11,"4/7/20":14,"4/8/20":14,"4/9/20":15,"4/10/20":15,"4/11/20":18},"recovered":{"3/13/20":0,"3/14/20":0,"3/15/20":0,"3/16/20":1,"3/17/20":1,"3/18/20":1,"3/19/20":1,"3/20/20":1,"3/21/20":1,"3/22/20":1,"3/23/20":1,"3/24/20":1,"3/25/20":2,"3/26/20":2,"3/27/20":2,"3/28/20":2,"3/29/20":2,"3/30/20":2,"3/31/20":5,"4/1/20":5,"4/2/20":10,"4/3/20":10,"4/4/20":10,"4/5/20":15,"4/6/20":18,"4/7/20":18,"4/8/20":29,"4/9/20":32,"4/10/20":32,"4/11/20":32}}}]

One approach is to read the timeline data into a dataframe and then assign the country and province to it:

res = pd.DataFrame(content[0]['timeline']).assign(country = content[0]['country'],
                                                  province = content[0]['province']
                                                  )

res.head()


         cases    deaths    recovered   country    province
3/13/20   7          0        0        Afghanistan  None
3/14/20   11         0        0        Afghanistan  None
3/15/20   16         0        0        Afghanistan  None
3/16/20   21         0        1        Afghanistan  None
3/17/20   22         0        1        Afghanistan  None

Note that the whole dataset is wrapped in a list, hence the index 0.
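To cover every entry rather than just the first, the same idea can be repeated per entry and the pieces concatenated. The following is only a minimal sketch: the helper name flatten_historical is made up for illustration, the second sample entry ("Examplestan") is invented, and it assumes every entry has the same country / province / timeline layout as the Afghanistan subset above.

```python
import pandas as pd

def flatten_historical(entries):
    """Hypothetical helper: build one long dataframe from a list of entries."""
    frames = []
    for entry in entries:
        frame = (
            pd.DataFrame(entry["timeline"])  # rows: dates, columns: cases/deaths/recovered
            .rename_axis("date")             # give the date index a name
            .reset_index()                   # turn the dates into a regular column
            .assign(country=entry["country"], province=entry["province"])
        )
        frames.append(frame)
    flat = pd.concat(frames, ignore_index=True)
    # Dates arrive as strings such as "3/13/20"; parse them as month/day/year
    flat["date"] = pd.to_datetime(flat["date"], format="%m/%d/%y")
    return flat[["country", "province", "date", "cases", "deaths", "recovered"]]

# Small sample: the first two Afghanistan dates from above, plus an invented entry
sample = [
    {"country": "Afghanistan", "province": None,
     "timeline": {"cases": {"3/13/20": 7, "3/14/20": 11},
                  "deaths": {"3/13/20": 0, "3/14/20": 0},
                  "recovered": {"3/13/20": 0, "3/14/20": 0}}},
    {"country": "Examplestan", "province": "North",  # hypothetical entry
     "timeline": {"cases": {"3/13/20": 1, "3/14/20": 2},
                  "deaths": {"3/13/20": 0, "3/14/20": 1},
                  "recovered": {"3/13/20": 0, "3/14/20": 0}}},
]

flat = flatten_historical(sample)
print(flat)
```

With the real data this would be called as flatten_historical(json_data). It is still a loop, but a short one, and the result grows by rows rather than by columns as new dates are added.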
