
How to convert list of nested dicts (json) into a custom dataframe using pandas?

I am combining survey results via an API and would like to convert these results into a dataframe that looks like this:

PersonId | Question | Answer | Department

To achieve this, each row has to hold one question-and-answer pair for one person, together with the department that person gave in the first question. For the data below, the result should look like this:

PersonId | Question                                  | Answer | Department
       1 | I can focus on clear targets              |      3 | Department A
       1 | I am satisfied with my working environment|      4 | Department A
       2 | I can focus on clear targets              |      1 | Department B
       2 | I am satisfied with my working environment|      3 | Department B

This is what the data looks like after retrieving it from the API and combining it. I don't need the 'answers' and 'id' keys, because 'results' contains the answers given by the participant. The answers always range from 1 to 5.

[
      {
        '0': {
          'title': 'What department do you work at?',
          'id': '2571050',
          'results': {
            '0': 'Department A',
            '1': '',
          },
          'answers': {
            '0': 'Department A',
            '1': 'Department B',
          }
        },
        '1': {
          'title': 'I can focus on clear targets',
          'id': '5275962',
          'results': {
            '0': '3'
          },
          'answers': {
            '0': 'Strongly disagree',
            '1': 'Strongly Agree'
          }
        },
        '2': {
          'title': 'I am satisfied with my working environment',
          'id': '5276045',
          'results': {
            '0': '4'
          },
          'answers': {
            '0': 'Strongly Disagree',
            '1': 'Strongly Agree'
          }
        },
      },
      {
        '0': {
          'title': 'What department do you work at?',
          'id': '2571050',
          'results': {
            '0': '',
            '1': 'Department B',
          },
          'answers': {
            '0': 'Department A',
            '1': 'Department B',
          }
        },
        '1': {
          'title': 'I can focus on clear targets',
          'id': '5275962',
          'results': {
            '0': '1'
          },
          'answers': {
            '0': 'Strongly disagree',
            '1': 'Strongly Agree'
          }
        },
        '2': {
          'title': 'I am satisfied with my working environment',
          'id': '5276048',
          'results': {
            '0': '3'
          },
          'answers': {
            '0': 'Strongly Disagree',
            '1': 'Strongly Agree'
          }
        }
      }
 ]

Be careful: your JSON contains some errors. A dictionary must not have a comma after its last value, and JSON requires double quotes, not single quotes, around keys and string values. The corrected JSON is included at the end of this answer.
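To illustrate the point about quotes and trailing commas, here is a minimal sketch. It assumes the data is available as text in the form shown in the question; the shortened sample string and the variable names are made up for the example. If your API client has already decoded the response into Python lists and dicts, no parsing step is needed at all.

```python
import ast
import json

# Shortened sample in the style of the question: single quotes, trailing commas
python_style = (
    "[{'0': {'title': 'What department do you work at?', "
    "'results': {'0': 'Department A', '1': '',},},}]"
)

try:
    json.loads(python_style)
    parsed_as_json = True
except json.JSONDecodeError:
    # Strict JSON rejects single-quoted strings and trailing commas
    parsed_as_json = False

# The same text is a valid Python literal, so ast.literal_eval can read it
data = ast.literal_eval(python_style)

# Serialising again produces strict JSON: double quotes, no trailing commas
strict = json.dumps(data, indent=2)
round_trip = json.loads(strict)
print(strict)
```

ast.literal_eval only evaluates literals (strings, numbers, lists, dicts and so on), so it is a safer choice than eval for text like this.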

To come back to your question, you can use the json and pandas libraries to parse your file. Here is what it could look like:

import json
import pandas as pd

# Load the corrected JSON shown at the end of this answer
# ('survey.json' is a placeholder file name)
with open('survey.json') as f:
    data = json.load(f)

rows = []
# Number the respondents from 1, in the order they appear
for person_id, person in enumerate(data, start=1):
    # Question '0' is the department question. Only the chosen option is
    # non-empty in its 'results', so take the first non-empty value
    department = next((v for v in person['0']['results'].values() if v != ''), None)
    for key, question in person.items():
        # Skip the department question itself
        if key == '0':
            continue
        # Collect the question and the answer
        rows.append({
            'PersonId': person_id,
            'Question': question['title'],
            'Answer': question['results']['0'],
            'Department': department,
        })

# Build the frame once at the end (DataFrame.append was removed in pandas 2.0)
df = pd.DataFrame(rows, columns=['PersonId', 'Question', 'Answer', 'Department'])
print(df)

Output:

   PersonId                                    Question Answer    Department
0         1                I can focus on clear targets      3  Department A
1         1  I am satisfied with my working environment      4  Department A
2         2                I can focus on clear targets      1  Department B
3         2  I am satisfied with my working environment      3  Department B
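Note that the Answer values are taken straight from the JSON, so they are strings. Since the answers always range from 1 to 5, you may want them as numbers. A small sketch of one way to do that, assuming a frame shaped like the output above (rebuilt by hand here so the example runs on its own; the mean per department is only an illustration of what becomes possible):

```python
import pandas as pd

# Frame shaped like the output above; answers are still strings
df = pd.DataFrame({
    'PersonId': [1, 1, 2, 2],
    'Question': [
        'I can focus on clear targets',
        'I am satisfied with my working environment',
        'I can focus on clear targets',
        'I am satisfied with my working environment',
    ],
    'Answer': ['3', '4', '1', '3'],
    'Department': ['Department A', 'Department A', 'Department B', 'Department B'],
})

# errors='coerce' turns anything non-numeric (e.g. an empty string) into NaN
df['Answer'] = pd.to_numeric(df['Answer'], errors='coerce')

# Example aggregation: mean answer per department
mean_by_department = df.groupby('Department')['Answer'].mean()
print(mean_by_department)
```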

Corrected JSON file:

[
      {
        "0": {
          "title": "What department do you work at?",
          "id": "2571050",
          "results": {
            "0": "Department A",
            "1": ""
          },
          "answers": {
            "0": "Department A",
            "1": "Department B"
          }
        },
        "1": {
          "title": "I can focus on clear targets",
          "id": "5275962",
          "results": {
            "0": "3"
          },
          "answers": {
            "0": "Strongly disagree",
            "1": "Strongly Agree"
          }
        },
        "2": {
          "title": "I am satisfied with my working environment",
          "id": "5276045",
          "results": {
            "0": "4"
          },
          "answers": {
            "0": "Strongly Disagree",
            "1": "Strongly Agree"
          }
        }
      },
      {
        "0": {
          "title": "What department do you work at?",
          "id": "2571050",
          "results": {
            "0": "",
            "1": "Department B"
          },
          "answers": {
            "0": "Department A",
            "1": "Department B"
          }
        },
        "1": {
          "title": "I can focus on clear targets",
          "id": "5275962",
          "results": {
            "0": "1"
          },
          "answers": {
            "0": "Strongly disagree",
            "1": "Strongly Agree"
          }
        },
        "2": {
          "title": "I am satisfied with my working environment",
          "id": "5276048",
          "results": {
            "0": "3"
          },
          "answers": {
            "0": "Strongly Disagree",
            "1": "Strongly Agree"
          }
        }
      }
 ]
