Deserialize data from nested JSON using Python and Pandas
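I have time series data in nested JSON that I am struggling to get into a flat dataframe.
The data is here: https://corona.lmao.ninja/v2/historical
Desired flat Pandas dataframe: country | date | cases | deaths | recovered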
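import pandas as pd
import requests

# Fetch the historical time series for all countries.
# (The original call passed an undefined `headers` variable as the second positional argument.)
r = requests.get('https://corona.lmao.ninja/v2/historical')
json_data = r.json()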
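Now, I can do df = pd.json_normalize(json_data, max_level=1)
but this still leaves nested objects embedded in the columns. I can also do df = pd.json_normalize(json_data)
but that just creates a new column for every date, which is not sustainable as the time series grows.
There must be an elegant way to do this. A last resort would be to write a Python loop.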
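To make the two behaviours concrete, here is a minimal sketch on a made-up two-date record shaped like one entry of the response. The `sample` variable and the names `shallow` and `deep` are hypothetical, introduced only for illustration.

```python
import pandas as pd

# Hypothetical, shortened record with the same shape as one API entry.
sample = [{
    "country": "Afghanistan",
    "province": None,
    "timeline": {
        "cases": {"3/13/20": 7, "3/14/20": 11},
        "deaths": {"3/13/20": 0, "3/14/20": 0},
        "recovered": {"3/13/20": 0, "3/14/20": 0},
    },
}]

# max_level=1 stops one level down: each timeline.* column holds a whole dict.
shallow = pd.json_normalize(sample, max_level=1)
shallow_cols = list(shallow.columns)

# Full normalization flattens everything: one column per metric per date.
deep = pd.json_normalize(sample)
deep_cols = list(deep.columns)

print(shallow_cols)
print(deep_cols)
```

With only two dates the fully normalized frame already has eight columns (two identifiers plus three metrics times two dates), which is why the wide layout does not scale.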
Here is a subset of the data for Afghanistan (the first entry in the JSON data):
content = [{"country":"Afghanistan","province":None,"timeline":{"cases":{"3/13/20":7,"3/14/20":11,"3/15/20":16,"3/16/20":21,"3/17/20":22,"3/18/20":22,"3/19/20":22,"3/20/20":24,"3/21/20":24,"3/22/20":40,"3/23/20":40,"3/24/20":74,"3/25/20":84,"3/26/20":94,"3/27/20":110,"3/28/20":110,"3/29/20":120,"3/30/20":170,"3/31/20":174,"4/1/20":237,"4/2/20":273,"4/3/20":281,"4/4/20":299,"4/5/20":349,"4/6/20":367,"4/7/20":423,"4/8/20":444,"4/9/20":484,"4/10/20":521,"4/11/20":555},"deaths":{"3/13/20":0,"3/14/20":0,"3/15/20":0,"3/16/20":0,"3/17/20":0,"3/18/20":0,"3/19/20":0,"3/20/20":0,"3/21/20":0,"3/22/20":1,"3/23/20":1,"3/24/20":1,"3/25/20":2,"3/26/20":4,"3/27/20":4,"3/28/20":4,"3/29/20":4,"3/30/20":4,"3/31/20":4,"4/1/20":4,"4/2/20":6,"4/3/20":6,"4/4/20":7,"4/5/20":7,"4/6/20":11,"4/7/20":14,"4/8/20":14,"4/9/20":15,"4/10/20":15,"4/11/20":18},"recovered":{"3/13/20":0,"3/14/20":0,"3/15/20":0,"3/16/20":1,"3/17/20":1,"3/18/20":1,"3/19/20":1,"3/20/20":1,"3/21/20":1,"3/22/20":1,"3/23/20":1,"3/24/20":1,"3/25/20":2,"3/26/20":2,"3/27/20":2,"3/28/20":2,"3/29/20":2,"3/30/20":2,"3/31/20":5,"4/1/20":5,"4/2/20":10,"4/3/20":10,"4/4/20":10,"4/5/20":15,"4/6/20":18,"4/7/20":18,"4/8/20":29,"4/9/20":32,"4/10/20":32,"4/11/20":32}}}]
One approach is to read in the timeline data, then assign the country and province values to the dataframe:
res = pd.DataFrame(content[0]['timeline']).assign(
    country=content[0]['country'],
    province=content[0]['province'],
)
res.head()
         cases  deaths  recovered      country province
3/13/20      7       0          0  Afghanistan     None
3/14/20     11       0          0  Afghanistan     None
3/15/20     16       0          0  Afghanistan     None
3/16/20     21       0          1  Afghanistan     None
3/17/20     22       0          1  Afghanistan     None
Note that the entire data is wrapped in a list, hence the index 0.
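The same idea might be extended to every entry in the list, not just index 0. The sketch below is one possible way to do that, assuming each entry has the `country`, `province` and `timeline` keys shown above. The helper name `flatten_history` and the second sample entry ("Exampleland") are hypothetical and the numbers in it are made up. It uses a plain loop with `pd.concat` rather than a pure `json_normalize` call.

```python
import pandas as pd

def flatten_history(entries):
    """Build one row per (country, province, date) from API-style entries."""
    frames = []
    for entry in entries:
        frame = (
            pd.DataFrame(entry["timeline"])  # index: date strings; columns: metrics
            .rename_axis("date")
            .reset_index()                   # turn the date index into a column
            .assign(country=entry["country"], province=entry["province"])
        )
        frames.append(frame)
    flat = pd.concat(frames, ignore_index=True)
    # Dates arrive as strings such as "3/13/20" (month/day/two-digit year).
    flat["date"] = pd.to_datetime(flat["date"], format="%m/%d/%y")
    return flat[["country", "province", "date", "cases", "deaths", "recovered"]]

# Hypothetical shortened input: the first entry reuses two dates from the
# Afghanistan subset, the second is invented purely for illustration.
sample = [
    {"country": "Afghanistan", "province": None,
     "timeline": {"cases": {"3/13/20": 7, "3/14/20": 11},
                  "deaths": {"3/13/20": 0, "3/14/20": 0},
                  "recovered": {"3/13/20": 0, "3/14/20": 0}}},
    {"country": "Exampleland", "province": "North",
     "timeline": {"cases": {"3/13/20": 1, "3/14/20": 2},
                  "deaths": {"3/13/20": 0, "3/14/20": 0},
                  "recovered": {"3/13/20": 0, "3/14/20": 1}}},
]

flat = flatten_history(sample)
print(flat)
```

On the real response this would be called as `flatten_history(json_data)`, provided the payload keeps the structure shown in the subset above. The result is in long format, so new dates add rows rather than columns.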