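How do I convert JSON from a GET request into a pandas DataFrame
I am using the GoToWebinar API to fetch data from webinars. Everything else is done; the only thing missing from my script is converting the JSON into a pandas DataFrame so that I can analyze it.
The JSON I get back has the following structure (I masked the data):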
{
  "_embedded": {
    "webinars": [
      {
        "webinarKey": "GGGGGGGGGGGGGGGG",
        "webinarId": "BBBBBBBBBBB",
        "organizerKey": "RRRRRRRRRRRRR",
        "omid": "RRRRRRRRRRR",
        "accountKey": "WWWWWWWWWWW",
        "recurrenceKey": "EEEEEEEEEEEEEEEEE21",
        "subject": "LEEEEEEEEEESEon",
        "description": "EEEEEEEEEEEEE",
        "times": [
          {
            "startTime": "2019-07-01T13:00:00Z",
            "endTime": "2019-07-01T13:30:00Z"
          }
        ],
        "timeZone": "America/New_York",
        "locale": "en_US",
        "status": "UPDATED",
        "approvalType": "AUTOMATIC",
        "registrationUrl": "https://attendee.gotowebinar.com/rt/XXXXXXXXXXXXXXXX",
        "impromptu": false,
        "isPasswordProtected": false,
        "recurrenceType": "series",
        "experienceType": "broadcast",
        "registrationSettingsKey": "DDDDDDDD"
      },
      {
        "webinarKey": "GGGGGGGGGGGGGGGG",
        "webinarId": "BBBBBBBBBBB",
        "organizerKey": "RRRRRRRRRRRRR",
        "omid": "RRRRRRRRRRR",
        "accountKey": "WWWWWWWWWWW",
        "recurrenceKey": "EEEEEEEEEEEEEEEEE21",
        "subject": "LEEEEEEEEEESEon",
        "description": "EEEEEEEEEEEEE",
        "times": [
          {
            "startTime": "2019-07-01T13:00:00Z",
            "endTime": "2019-07-01T13:30:00Z"
          }
        ],
        "timeZone": "America/New_York",
        "locale": "en_US",
        "status": "UPDATED",
        "approvalType": "AUTOMATIC",
        "registrationUrl": "https://attendee.gotowebinar.com/rt/XXXXXXXXXXXXXXXX",
        "impromptu": false,
        "isPasswordProtected": false,
        "recurrenceType": "series",
        "experienceType": "broadcast",
        "registrationSettingsKey": "DDDDDDDD"
      },
      ..other webinars.....
    ]
  },
  "page": {
    "size": 10,
    "totalElements": 26,
    "totalPages": 3,
    "number": 0
  }
}
This is my code; I basically don't know how to proceed. I tried DataFrame.from_dict, read_json, and the solution offered here: Convert JSON data from Request into Pandas DataFrame
'''Getting the webinar lists'''
base_url = 'https://api.getgo.com/G2W/rest/v2'
## setting up parameters (URL-encoded ISO 8601 timestamps)
param_1 = '2019-07-01T10%3A00%3A00Z'
param_2 = '2019-09-01T10%3A00%3A00Z'
## building the path
path = base_url + '/accounts/' + account_key + '/webinars?fromTime=' + param_1 + '&toTime=' + param_2
print(path)
headers = {'accept': 'application/json', 'Authorization': access_token}
webinars_req = session.get(path, headers=headers)
webinars_json = webinars_req.json()
I would like a DataFrame in which all the inner tags (e.g. webinarKey, webinarId, etc.) become columns holding the corresponding values...
Hope you can help!
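A side note on the request code in the question: the timestamps there are percent-encoded by hand. A minimal sketch of letting the standard library do the encoding is shown below. The helper name build_webinars_url and the "ACCOUNT_KEY" placeholder are made up for illustration; only the base URL and the fromTime/toTime parameter names come from the question.

```python
from urllib.parse import urlencode

BASE_URL = "https://api.getgo.com/G2W/rest/v2"  # base URL taken from the question


def build_webinars_url(account_key, from_time, to_time):
    """Return the webinars URL with the time window percent-encoded."""
    # urlencode escapes the colons in the ISO 8601 timestamps as %3A
    query = urlencode({"fromTime": from_time, "toTime": to_time})
    return f"{BASE_URL}/accounts/{account_key}/webinars?{query}"


# "ACCOUNT_KEY" is a placeholder, not a real account key
url = build_webinars_url(
    "ACCOUNT_KEY", "2019-07-01T10:00:00Z", "2019-09-01T10:00:00Z"
)
print(url)
```

With requests, passing the same dictionary through the params argument of get() should have a similar effect, so the manual string concatenation is not strictly needed.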
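You can try the requests module:
from io import StringIO
import pandas as pd
import requests
webinars_req = requests.get(path, headers=headers)
# read_json has no ignore_index parameter, so it is dropped here
df = pd.read_json(StringIO(webinars_req.text))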
OK, I did it! Basically, I just needed to pull the list at the webinars level out of the dictionary and put it into a DataFrame:
webinars_json = webinars_req.json()
##put all webinars data in a dataframe
webinars_list = webinars_json.get('_embedded').get('webinars')
df_webinars = pd.DataFrame(webinars_list)
It works well :-) Hope this helps someone.
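One thing to be aware of with this approach: the nested "times" field ends up as a column of lists. Below is a rough sketch of pulling the first start and end time into their own columns. The sample payload is a shortened, made-up stand-in for the real response, and the code assumes every "times" list has at least one entry.

```python
import pandas as pd

# Shortened, made-up stand-in for the API response shown in the question
webinars_json = {
    "_embedded": {
        "webinars": [
            {"webinarKey": "K1", "subject": "A",
             "times": [{"startTime": "2019-07-01T13:00:00Z",
                        "endTime": "2019-07-01T13:30:00Z"}]},
            {"webinarKey": "K2", "subject": "B",
             "times": [{"startTime": "2019-07-02T13:00:00Z",
                        "endTime": "2019-07-02T14:00:00Z"}]},
        ]
    },
    "page": {"size": 10, "totalElements": 2, "totalPages": 1, "number": 0},
}

# Same extraction as above, with defaults so a missing key yields an empty frame
webinars_list = webinars_json.get("_embedded", {}).get("webinars", [])
df_webinars = pd.DataFrame(webinars_list)

# Assumption: each "times" list is non-empty; only its first entry is used
df_webinars["startTime"] = pd.to_datetime(
    df_webinars["times"].map(lambda t: t[0]["startTime"])
)
df_webinars["endTime"] = pd.to_datetime(
    df_webinars["times"].map(lambda t: t[0]["endTime"])
)

# Session length in minutes
df_webinars["duration_min"] = (
    (df_webinars["endTime"] - df_webinars["startTime"]).dt.total_seconds() / 60
)
print(df_webinars[["webinarKey", "startTime", "endTime", "duration_min"]])
```

If a webinar can have several sessions, taking only the first entry loses data; in that case one row per session is probably the better shape.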
js = { "_embedded": { "webinars": [ { "webinarKey": "GGGGGGGGGGGGGGGG", "webinarId": "BBBBBBBBBBB", "organizerKey": "RRRRRRRRRRRRR", "omid": "RRRRRRRRRRR", "accountKey": "WWWWWWWWWWW", "recurrenceKey": "EEEEEEEEEEEEEEEEE21", "subject": "LEEEEEEEEEESEon", "description": "EEEEEEEEEEEEE", "times": [ { "startTime": "2019-07-01T13:00:00Z", "endTime": "2019-07-01T13:30:00Z" } ], "timeZone": "America/New_York", "locale": "en_US", "status": "UPDATED", "approvalType": "AUTOMATIC", "registrationUrl": "https://attendee.gotowebinar.com/rt/XXXXXXXXXXXXXXXX", "impromptu": "false", "isPasswordProtected": "false", "recurrenceType": "series", "experienceType": "broadcast", "registrationSettingsKey": "DDDDDDDD" }, { "webinarKey": "GGGGGGGGGGGGGGGG", "webinarId": "BBBBBBBBBBB", "organizerKey": "RRRRRRRRRRRRR", "omid": "RRRRRRRRRRR", "accountKey": "WWWWWWWWWWW", "recurrenceKey": "EEEEEEEEEEEEEEEEE21", "subject": "LEEEEEEEEEESEon", "description": "EEEEEEEEEEEEE", "times": [ { "startTime": "2019-07-01T13:00:00Z", "endTime": "2019-07-01T13:30:00Z" } ], "timeZone": "America/New_York", "locale": "en_US", "status": "UPDATED", "approvalType": "AUTOMATIC", "registrationUrl": "https://attendee.gotowebinar.com/rt/XXXXXXXXXXXXXXXX", "impromptu": "false", "isPasswordProtected": "false", "recurrenceType": "series", "experienceType": "broadcast", "registrationSettingsKey": "DDDDDDDD" } ] } }
import json
import pandas as pd
s = json.dumps(js)  # serialize the dict to a JSON string
data = json.loads(s)  # parse the string back into a dict
# also look at the meta argument of json_normalize
# pd.json_normalize (pandas >= 1.0) replaces the older pandas.io.json.json_normalize import
df = pd.json_normalize(data=data['_embedded'], record_path=['webinars'])
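To illustrate the meta argument mentioned in the comment above, here is a small sketch that produces one row per entry in "times" while carrying webinar-level fields along. The sample data is invented and much shorter than the real payload, and the choice of webinarKey and subject as meta fields is only an example.

```python
import pandas as pd

# Invented sample: the first webinar has two sessions, the second has one
data = {
    "_embedded": {
        "webinars": [
            {"webinarKey": "K1", "subject": "A",
             "times": [
                 {"startTime": "2019-07-01T13:00:00Z", "endTime": "2019-07-01T13:30:00Z"},
                 {"startTime": "2019-07-08T13:00:00Z", "endTime": "2019-07-08T13:30:00Z"},
             ]},
            {"webinarKey": "K2", "subject": "B",
             "times": [
                 {"startTime": "2019-07-02T13:00:00Z", "endTime": "2019-07-02T14:00:00Z"},
             ]},
        ]
    }
}

# record_path selects the nested list to expand into rows;
# meta lists parent-level fields to repeat on each of those rows
df_times = pd.json_normalize(
    data["_embedded"]["webinars"],
    record_path="times",
    meta=["webinarKey", "subject"],
)
print(df_times)
```

Each session becomes its own row, so a webinar with two sessions appears twice with the same webinarKey and subject.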