How to convert a nested JSON API response into a DataFrame

Question

{
    "reviews": [
        {
            "reviewId": "12a3",
            "authorName": "Muhammad Arifin",
            "comments": [
                {
                    "userComment": {
                        "text": "\tsangat terbantu👍",
                        "lastModified": {
                            "seconds": "1606819245",
                            "nanos": 835000000
                        },
                        "starRating": 5,
                        "reviewerLanguage": "id",
                        "device": "1601",
                        "androidOsVersion": 23,
                        "appVersionCode": 20365,
                        "appVersionName": "5.2.73",
                        "deviceMetadata": {
                            "productName": "1601 (1601)",
                            "manufacturer": "Vivo",
                            "deviceClass": "FORM_FACTOR_PHONE",
                            "nativePlatform": "ABI_ARM64_V8,ABI_ARM_V7,ABI_ARM",
                            "cpuModel": "MT6750",
                            "cpuMake": "Mediatek"
                        }
                    }
                },
                {
                    "developerComment": {
                        "text": "Terima kasih sudah berbagi, kami sangat senang menjadi bagian dalam pejalanan travel anda!",
                        "lastModified": {
                            "seconds": "1606818598",
                            "nanos": 722000000
                        }
                    }
                }
            ]
        }
    ],
    "tokenPagination": {
        "nextPageToken": "abc"
    }
}

I want the column names to be reviewId, authorName, userComment_text, userComment_lastModified, starRating, deviceMetadata.manufacturer and developerComment.text.

I have tried this:

df = pd.json_normalize(fetch_reviews_response, record_path="reviews")

but it creates only the reviewId, authorName and comments columns; everything nested under comments stays packed in that single column.

Answer 1

Please try this repo and see if it works for you.

It uses recursive functions to flatten the JSON. The function in 'json_to_csv.py' can be ported for your use: load the flattened JSON result into a dataframe with 'pandas.read_json'.
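
The repository itself is not shown here, so the following is only a minimal sketch of the recursive-flattening idea this answer describes, not the code from 'json_to_csv.py'. The `flatten` helper and the trimmed `response` sample are hypothetical names introduced for illustration; list items are keyed by their position, and the rows are built with `pd.DataFrame` instead of `pandas.read_json` to avoid a round trip through a file.

```python
import pandas as pd


def flatten(obj, parent_key="", sep="."):
    """Recursively flatten nested dicts and lists into a single-level dict."""
    items = {}
    if isinstance(obj, dict):
        for key, value in obj.items():
            new_key = f"{parent_key}{sep}{key}" if parent_key else key
            items.update(flatten(value, new_key, sep))
    elif isinstance(obj, list):
        # List entries get their position as the key segment.
        for index, value in enumerate(obj):
            new_key = f"{parent_key}{sep}{index}" if parent_key else str(index)
            items.update(flatten(value, new_key, sep))
    else:
        items[parent_key] = obj
    return items


# Trimmed, hypothetical sample in the same shape as the response in the question.
response = {
    "reviews": [
        {
            "reviewId": "12a3",
            "authorName": "Muhammad Arifin",
            "comments": [
                {"userComment": {"text": "\tsangat terbantu👍", "starRating": 5}},
                {"developerComment": {"text": "Terima kasih"}},
            ],
        }
    ],
    "tokenPagination": {"nextPageToken": "abc"},
}

# One flat dict per review becomes one DataFrame row.
flat_df = pd.DataFrame([flatten(review) for review in response["reviews"]])
print(flat_df.columns.tolist())
```

With this approach the column names carry the list position (for example `comments.0.userComment.text`), so they would still need renaming to match the names asked for in the question.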

Answer 2

First, I reorganized the JSON file as below:

{
    "reviews": {
        "reviewId": "12a3",
        "authorName": "Muhammad Arifin",
        "comments": {
            "userComment": {
                "text": "\tsangat terbantu👍",
                "lastModified": {
                    "seconds": "1606819245",
                    "nanos": 835000000
                },
                "starRating": 5,
                "reviewerLanguage": "id",
                "device": "1601",
                "androidOsVersion": 23,
                "appVersionCode": 20365,
                "appVersionName": "5.2.73",
                "deviceMetadata": {
                    "productName": "1601 (1601)",
                    "manufacturer": "Vivo",
                    "deviceClass": "FORM_FACTOR_PHONE",
                    "nativePlatform": "ABI_ARM64_V8,ABI_ARM_V7,ABI_ARM",
                    "cpuModel": "MT6750",
                    "cpuMake": "Mediatek"
                }
            },
            "developerComment": {
                "text": "Terima kasih sudah berbagi, kami sangat senang menjadi bagian dalam pejalanan travel anda!",
                "lastModified": {
                    "seconds": "1606818598",
                    "nanos": 722000000
                }
            }
        },
        "tokenPagination": {
            "nextPageToken": "abc"
        }
    }
}

Then, in a Python file, I used pandas to pull the nested values out into their own columns.

import pandas as pd

# read_json turns the top-level "reviews" object into a single column whose
# index holds its keys (reviewId, authorName, comments, tokenPagination).
df = pd.read_json("data.json")

# Each lookup below returns one scalar, which pandas broadcasts to every row.
df['reviewId'] = df['reviews']['reviewId']
df['authorName'] = df['reviews']['authorName']
df['userComment_text'] = df['reviews']['comments']['userComment']['text']
df['userComment_lastModified'] = df['reviews']['comments']['userComment']['lastModified']['seconds']
df['starRating'] = df['reviews']['comments']['userComment']['starRating']
df['deviceMetadata.manufacturer'] = df['reviews']['comments']['userComment']['deviceMetadata']['manufacturer']
df['developerComment.text'] = df['reviews']['comments']['developerComment']['text']

print(df.head())

And here is my output:

                                                           reviews  ...                              developerComment.text
authorName                                         Muhammad Arifin  ...  Terima kasih sudah berbagi, kami sangat senang...
comments         {'userComment': {'text': ' sangat terbantu👍', ...  ...  Terima kasih sudah berbagi, kami sangat senang...
reviewId                                                      12a3  ...  Terima kasih sudah berbagi, kami sangat senang...
tokenPagination                           {'nextPageToken': 'abc'}  ...  Terima kasih sudah berbagi, kami sangat senang...

You can change the rows as you wish. I did not edit them because you did not say what the rows should contain.

I hope it works for you
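
If reorganizing the JSON by hand is not practical, one possible alternative is to merge each review's comments list into a single dict and let `pd.json_normalize` flatten the result, which gives one row per review. This is only a sketch under assumptions: `response` is a trimmed stand-in for `fetch_reviews_response`, each review is assumed to hold at most one userComment and one developerComment (a second one would overwrite the first), and the `wanted` mapping is a hypothetical helper for the column names requested in the question.

```python
import pandas as pd

# Trimmed stand-in for the API response shown in the question.
response = {
    "reviews": [
        {
            "reviewId": "12a3",
            "authorName": "Muhammad Arifin",
            "comments": [
                {
                    "userComment": {
                        "text": "\tsangat terbantu👍",
                        "lastModified": {"seconds": "1606819245", "nanos": 835000000},
                        "starRating": 5,
                        "deviceMetadata": {"manufacturer": "Vivo"},
                    }
                },
                {
                    "developerComment": {
                        "text": "Terima kasih sudah berbagi",
                        "lastModified": {"seconds": "1606818598", "nanos": 722000000},
                    }
                },
            ],
        }
    ],
    "tokenPagination": {"nextPageToken": "abc"},
}

rows = []
for review in response["reviews"]:
    row = {"reviewId": review["reviewId"], "authorName": review.get("authorName")}
    # Each entry in "comments" holds a single key (userComment or
    # developerComment), so merging the entries yields one dict per review.
    for comment in review.get("comments", []):
        row.update(comment)
    rows.append(row)

# json_normalize flattens the nested dicts into dot-separated column names.
df = pd.json_normalize(rows)

# Map the flattened names to the column names requested in the question.
wanted = {
    "reviewId": "reviewId",
    "authorName": "authorName",
    "userComment.text": "userComment_text",
    "userComment.lastModified.seconds": "userComment_lastModified",
    "userComment.starRating": "starRating",
    "userComment.deviceMetadata.manufacturer": "deviceMetadata.manufacturer",
    "developerComment.text": "developerComment.text",
}

# reindex keeps a column (filled with NaN) even when no review on the page
# has that field, for example when there is no developer reply.
result = df.reindex(columns=list(wanted)).rename(columns=wanted)
print(result.columns.tolist())
```

Merging the comments before normalizing avoids `record_path="comments"`, which would produce a separate row for the user comment and the developer reply instead of one row per review.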

2 answers

solution1
0 2020-12-01 12:58:17

solution2
0 2020-12-01 13:34:03
