How to convert a list of nested dictionaries (JSON) into a custom DataFrame using Pandas?

Question

I am combining survey results retrieved via an API and would like to convert them into a DataFrame that looks like this:

PersonId | Question | Answer | Department

To achieve this, each row would have to be one question-and-answer pair for one person, together with the department given in the first question. So in this case it should look like this:

PersonId | Question                                  | Answer | Department
       1 | I can focus on clear targets              |      3 | Department A
       1 | I am satisfied with my working environment|      4 | Department A
       2 | I can focus on clear targets              |      1 | Department B
       2 | I am satisfied with my working environment|      3 | Department B

Here is what the data looks like after retrieving it from the API and combining it. I don't need the 'answers' and 'id' keys, as 'results' contains the answers given by the participant. The answers are always in the range 1 to 5.

[
      {
        '0': {
          'title': 'What department do you work at?',
          'id': '2571050',
          'results': {
            '0': 'Department A',
            '1': '',
          },
          'answers': {
            '0': 'Department A',
            '1': 'Department B',
          }
        },
        '1': {
          'title': 'I can focus on clear targets',
          'id': '5275962',
          'results': {
            '0': '3'
          },
          'answers': {
            '0': 'Strongly disagree',
            '1': 'Strongly Agree'
          }
        },
        '2': {
          'title': 'I am satisfied with my working environment',
          'id': '5276045',
          'results': {
            '0': '4'
          },
          'answers': {
            '0': 'Strongly Disagree',
            '1': 'Strongly Agree'
          }
        },
      },
      {
        '0': {
          'title': 'What department do you work at?',
          'id': '2571050',
          'results': {
            '0': '',
            '1': 'Department B',
          },
          'answers': {
            '0': 'Department A',
            '1': 'Department B',
          }
        },
        '1': {
          'title': 'I can focus on clear targets',
          'id': '5275962',
          'results': {
            '0': '1'
          },
          'answers': {
            '0': 'Strongly disagree',
            '1': 'Strongly Agree'
          }
        },
        '2': {
          'title': 'I am satisfied with my working environment',
          'id': '5276048',
          'results': {
            '0': '3'
          },
          'answers': {
            '0': 'Strongly Disagree',
            '1': 'Strongly Agree'
          }
        }
      }
 ]

Answer 1

Be careful: your JSON file contains some errors. JSON does not allow a trailing comma after the last value of an object, and the keys and string values must use double quotes, not single quotes. The corrected JSON is included at the end of this answer.
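As a quick illustration of the point above, here is a minimal sketch that feeds a shortened fragment in the style of the question's data to `json.loads`, which rejects it, and to `ast.literal_eval`, which accepts it because single quotes and trailing commas are valid in Python literals. The `raw` string is made up for this example. If the data is already a Python list after calling the API, no parsing step is needed at all.

```python
import ast
import json

# Shortened fragment in the style of the question's data (illustrative only).
raw = "[{'0': {'title': 'What department do you work at?', 'results': {'0': 'Department A', '1': '',},},}]"

try:
    json.loads(raw)
    parsed_as_json = True
except json.JSONDecodeError:
    # Single-quoted strings and trailing commas are not valid JSON.
    parsed_as_json = False

# The same text is a valid Python literal, so ast.literal_eval can read it.
data = ast.literal_eval(raw)

print(parsed_as_json)
print(data[0]['0']['results']['0'])
```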

To get back to your question, you can use the json and pandas libraries to parse your file. Here is what it could look like:

import json
import pandas as pd

# Load the corrected JSON; 'survey.json' is a placeholder file name.
# Skip this step if `data` is already a Python list.
with open('survey.json') as f:
    data = json.load(f)

rows = []
# Assign an id to each respondent, starting at 1
for person_id, people in enumerate(data, start=1):
    # Retrieve the respondent's department (the non-empty result of question '0')
    if people['0']['results']['0'] != '':
        department = people['0']['results']['0']
    else:
        department = people['0']['results']['1']
    for answer in people:
        # Skip the department question itself
        if answer != '0':
            # Collect the question and the answer
            rows.append({
                'PersonId': person_id,
                'Question': people[answer]['title'],
                'Answer': people[answer]['results']['0'],
                'Department': department,
            })

# Build the DataFrame once; DataFrame.append was removed in pandas 2.0
df = pd.DataFrame(rows, columns=['PersonId', 'Question', 'Answer', 'Department'])
print(df)

Output:

   PersonId                                    Question Answer    Department
0         1                I can focus on clear targets      3  Department A
1         1  I am satisfied with my working environment      4  Department A
2         2                I can focus on clear targets      1  Department B
3         2  I am satisfied with my working environment      3  Department B
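The loop above checks only result keys '0' and '1' for the department and leaves the answers as strings. If the survey has more than two departments, or the answers are needed as numbers, a variant along the following lines might help. This is only a sketch: the function name `survey_to_frame`, the `department_key` parameter, and the sample data (including "Department C") are made up for illustration, and it assumes every non-department question stores its answer under `results['0']`, as in the question.

```python
import pandas as pd


def survey_to_frame(data, department_key='0'):
    """Flatten a list of per-person survey dicts into one row per question."""
    rows = []
    for person_id, person in enumerate(data, start=1):
        # Take the first non-empty value of the department question,
        # whichever option index it is stored under.
        dept_results = person[department_key]['results'].values()
        department = next((value for value in dept_results if value != ''), None)
        for key, question in person.items():
            if key == department_key:
                continue  # the department question is not a row of its own
            rows.append({
                'PersonId': person_id,
                'Question': question['title'],
                'Answer': int(question['results']['0']),  # '3' -> 3
                'Department': department,
            })
    return pd.DataFrame(rows, columns=['PersonId', 'Question', 'Answer', 'Department'])


# Minimal made-up sample in the same shape as the question's data.
sample = [
    {
        '0': {'title': 'What department do you work at?',
              'results': {'0': '', '1': '', '2': 'Department C'}},
        '1': {'title': 'I can focus on clear targets', 'results': {'0': '5'}},
    },
]

df = survey_to_frame(sample)
print(df)
```

With integer answers, a check such as `df['Answer'].between(1, 5).all()` can confirm that every value falls in the expected 1 to 5 range.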

Corrected JSON file:

[
      {
        "0": {
          "title": "What department do you work at?",
          "id": "2571050",
          "results": {
            "0": "Department A",
            "1": ""
          },
          "answers": {
            "0": "Department A",
            "1": "Department B"
          }
        },
        "1": {
          "title": "I can focus on clear targets",
          "id": "5275962",
          "results": {
            "0": "3"
          },
          "answers": {
            "0": "Strongly disagree",
            "1": "Strongly Agree"
          }
        },
        "2": {
          "title": "I am satisfied with my working environment",
          "id": "5276045",
          "results": {
            "0": "4"
          },
          "answers": {
            "0": "Strongly Disagree",
            "1": "Strongly Agree"
          }
        }
      },
      {
        "0": {
          "title": "What department do you work at?",
          "id": "2571050",
          "results": {
            "0": "",
            "1": "Department B"
          },
          "answers": {
            "0": "Department A",
            "1": "Department B"
          }
        },
        "1": {
          "title": "I can focus on clear targets",
          "id": "5275962",
          "results": {
            "0": "1"
          },
          "answers": {
            "0": "Strongly disagree",
            "1": "Strongly Agree"
          }
        },
        "2": {
          "title": "I am satisfied with my working environment",
          "id": "5276048",
          "results": {
            "0": "3"
          },
          "answers": {
            "0": "Strongly Disagree",
            "1": "Strongly Agree"
          }
        }
      }
 ]
