How to convert list of nested dicts (json) into a custom dataframe using pandas?
I am combining survey results retrieved through an API and want to convert them into a dataframe that looks like this:
PersonId | Question | Answer | Department
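To achieve this, each row must be one question/answer pair for one person, together with the department taken from the first question. So in this case it should look like this: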
PersonId | Question                                   | Answer | Department
1        | I can focus on clear targets               | 3      | Department A
1        | I am satisfied with my working environment | 4      | Department A
2        | I can focus on clear targets               | 1      | Department B
2        | I am satisfied with my working environment | 3      | Department B
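Below is the data as retrieved from the API and combined. I do not need the 'answers' and 'id' keys, because 'results' contains the answer given by the participant. Answers are always between 1 and 5.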
[
    {
        '0': {
            'title': 'What department do you work at?',
            'id': '2571050',
            'results': {
                '0': 'Department A',
                '1': '',
            },
            'answers': {
                '0': 'Department A',
                '1': 'Department B',
            }
        },
        '1': {
            'title': 'I can focus on clear targets',
            'id': '5275962',
            'results': {
                '0': '3'
            },
            'answers': {
                '0': 'Strongly disagree',
                '1': 'Strongly Agree'
            }
        },
        '2': {
            'title': 'I am satisfied with my working environment',
            'id': '5276045',
            'results': {
                '0': '4'
            },
            'answers': {
                '0': 'Strongly Disagree',
                '1': 'Strongly Agree'
            }
        },
    },
    {
        '0': {
            'title': 'What department do you work at?',
            'id': '2571050',
            'results': {
                '0': '',
                '1': 'Department B',
            },
            'answers': {
                '0': 'Department A',
                '1': 'Department B',
            }
        },
        '1': {
            'title': 'I can focus on clear targets',
            'id': '5275962',
            'results': {
                '0': '1'
            },
            'answers': {
                '0': 'Strongly disagree',
                '1': 'Strongly Agree'
            }
        },
        '2': {
            'title': 'I am satisfied with my working environment',
            'id': '5276048',
            'results': {
                '0': '3'
            },
            'answers': {
                '0': 'Strongly Disagree',
                '1': 'Strongly Agree'
            }
        }
    }
]
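Note that your JSON file contains some errors. The last value in a dictionary must not be followed by a comma, and the keys and values must be wrapped in double quotes rather than single quotes. The corrected JSON file is included at the end of this answer.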
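To illustrate the two errors, here is a small sketch using a shortened fragment written in the same style as the data in the question. The variable names (`raw`, `is_valid_json`, `fixed`) are only for illustration. It assumes the text is otherwise a valid Python literal, in which case `ast.literal_eval` can read it and `json.dumps` can write it back out as valid JSON.

```python
import ast
import json

# A shortened fragment in the same style as the question's data:
# single-quoted strings and a trailing comma after the last value.
raw = """
[
    {
        '0': {
            'title': 'What department do you work at?',
            'results': {
                '0': 'Department A',
                '1': '',
            },
        },
    },
]
"""

# json.loads rejects this text because it is not valid JSON.
try:
    json.loads(raw)
    is_valid_json = True
except json.JSONDecodeError:
    is_valid_json = False

# ast.literal_eval parses the same text as a Python literal instead.
data = ast.literal_eval(raw.strip())

# Writing it back with json.dumps yields double quotes and no trailing commas.
fixed = json.dumps(data, indent=2)

print(is_valid_json)
print(fixed)
```

If the data already lives in your program as Python objects (for example, the parsed response of an API call), none of this is needed and you can loop over it directly.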
Back to your question: you can use the json and pandas libraries to parse your file. Here is what it looks like:
import json
import pandas as pd

# Load the corrected JSON shown at the end of this answer.
# 'survey.json' is a placeholder file name; use your own path.
with open('survey.json') as f:
    data = json.load(f)

rows = []
# We assign an id to each answerer, starting at 1
for person_id, people in enumerate(data, start=1):
    # We retrieve the department of the answerer
    if people['0']['results']['0'] != '':
        department = people['0']['results']['0']
    else:
        department = people['0']['results']['1']
    for answer in people:
        # Skip the department question itself
        if answer != '0':
            # We collect the question and the answer
            rows.append({
                'PersonId': person_id,
                'Question': people[answer]['title'],
                'Answer': people[answer]['results']['0'],
                'Department': department,
            })

# DataFrame.append was removed in pandas 2.0, so build the frame from the list of rows
df = pd.DataFrame(rows, columns=['PersonId', 'Question', 'Answer', 'Department'])
print(df)

Output:
   PersonId                                    Question Answer    Department
0         1                I can focus on clear targets      3  Department A
1         1  I am satisfied with my working environment      4  Department A
2         2                I can focus on clear targets      1  Department B
3         2  I am satisfied with my working environment      3  Department B
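The loop above only checks the department options '0' and '1', and it keeps the answers as strings. As a possible variation, the sketch below wraps the same logic in a function, picks the first non-empty value among any number of department options, and converts each answer to an integer. The function name `survey_to_frame` and the trimmed `sample` data are made up for this illustration; it assumes every respondent has a question '0' for the department and that each other question stores its answer under results['0'].

```python
import pandas as pd


def survey_to_frame(data):
    """Flatten a list of per-respondent dicts into one row per question/answer pair."""
    rows = []
    for person_id, person in enumerate(data, start=1):
        # Assumption: the department is the first non-empty value of question '0'.
        department = next(
            (value for value in person['0']['results'].values() if value != ''),
            None,
        )
        for key, question in person.items():
            if key == '0':
                continue  # the department question does not get a row of its own
            rows.append({
                'PersonId': person_id,
                'Question': question['title'],
                # Answers are always between 1 and 5, so store them as integers.
                'Answer': int(question['results']['0']),
                'Department': department,
            })
    return pd.DataFrame(rows, columns=['PersonId', 'Question', 'Answer', 'Department'])


# Two respondents in the same shape as the survey data, trimmed to the keys used here.
sample = [
    {
        '0': {'title': 'What department do you work at?',
              'results': {'0': 'Department A', '1': ''}},
        '1': {'title': 'I can focus on clear targets',
              'results': {'0': '3'}},
        '2': {'title': 'I am satisfied with my working environment',
              'results': {'0': '4'}},
    },
    {
        '0': {'title': 'What department do you work at?',
              'results': {'0': '', '1': 'Department B'}},
        '1': {'title': 'I can focus on clear targets',
              'results': {'0': '1'}},
        '2': {'title': 'I am satisfied with my working environment',
              'results': {'0': '3'}},
    },
]

df = survey_to_frame(sample)
print(df)
```

With integer answers you can aggregate directly, for example df.groupby('Department')['Answer'].mean().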
JSON file:
[
    {
        "0": {
            "title": "What department do you work at?",
            "id": "2571050",
            "results": {
                "0": "Department A",
                "1": ""
            },
            "answers": {
                "0": "Department A",
                "1": "Department B"
            }
        },
        "1": {
            "title": "I can focus on clear targets",
            "id": "5275962",
            "results": {
                "0": "3"
            },
            "answers": {
                "0": "Strongly disagree",
                "1": "Strongly Agree"
            }
        },
        "2": {
            "title": "I am satisfied with my working environment",
            "id": "5276045",
            "results": {
                "0": "4"
            },
            "answers": {
                "0": "Strongly Disagree",
                "1": "Strongly Agree"
            }
        }
    },
    {
        "0": {
            "title": "What department do you work at?",
            "id": "2571050",
            "results": {
                "0": "",
                "1": "Department B"
            },
            "answers": {
                "0": "Department A",
                "1": "Department B"
            }
        },
        "1": {
            "title": "I can focus on clear targets",
            "id": "5275962",
            "results": {
                "0": "1"
            },
            "answers": {
                "0": "Strongly disagree",
                "1": "Strongly Agree"
            }
        },
        "2": {
            "title": "I am satisfied with my working environment",
            "id": "5276048",
            "results": {
                "0": "3"
            },
            "answers": {
                "0": "Strongly Disagree",
                "1": "Strongly Agree"
            }
        }
    }
]