How to convert a nested JSON response from an API into a DataFrame

Question

{
    "reviews": [
        {
            "reviewId": "12a3",
            "authorName": "Muhammad Arifin",
            "comments": [
                {
                    "userComment": {
                        "text": "\tsangat terbantu👍",
                        "lastModified": {
                            "seconds": "1606819245",
                            "nanos": 835000000
                        },
                        "starRating": 5,
                        "reviewerLanguage": "id",
                        "device": "1601",
                        "androidOsVersion": 23,
                        "appVersionCode": 20365,
                        "appVersionName": "5.2.73",
                        "deviceMetadata": {
                            "productName": "1601 (1601)",
                            "manufacturer": "Vivo",
                            "deviceClass": "FORM_FACTOR_PHONE",
                            "nativePlatform": "ABI_ARM64_V8,ABI_ARM_V7,ABI_ARM",
                            "cpuModel": "MT6750",
                            "cpuMake": "Mediatek"
                        }
                    }
                },
                {
                    "developerComment": {
                        "text": "Terima kasih sudah berbagi, kami sangat senang menjadi bagian dalam pejalanan travel anda!",
                        "lastModified": {
                            "seconds": "1606818598",
                            "nanos": 722000000
                        }
                    }
                }
            ]
        }
    ],
    "tokenPagination": {
        "nextPageToken": "abc"
    }
}

I want the columns to be named reviewId, authorName, userComment_text, userComment_lastModified, starRating, deviceMetadata.manufacturer, and developerComment.text.

I have tried this:

df=pd.json_normalize(fetch_reviews_response, record_path="reviews")

but it creates only the reviewId, authorName, and comments columns.

Answer 1

Please try this repo and see if it works for you.

It uses recursive functions to achieve this. The function in 'json_to_csv.py' can easily be ported for your use: convert the flat JSON result into a DataFrame by loading it with 'pandas.read_json'.
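The recursive idea can be sketched as follows. This is not the code from the repo mentioned above; the `flatten` helper, its `sep` parameter, and the trimmed `response` sample are assumptions made for illustration. Nested dictionaries contribute their keys to the column name and list elements contribute their position, so each review becomes one flat row.

```python
import pandas as pd


def flatten(obj, parent_key="", sep="."):
    """Recursively flatten nested dicts and lists into a single-level dict.

    Hypothetical helper: dict keys and list positions are joined with `sep`.
    """
    items = {}
    if isinstance(obj, dict):
        for key, value in obj.items():
            new_key = f"{parent_key}{sep}{key}" if parent_key else key
            items.update(flatten(value, new_key, sep))
    elif isinstance(obj, list):
        for index, value in enumerate(obj):
            new_key = f"{parent_key}{sep}{index}" if parent_key else str(index)
            items.update(flatten(value, new_key, sep))
    else:
        # Leaf value (string, number, bool, None): store it under the full path.
        items[parent_key] = obj
    return items


# Trimmed copy of the sample response from the question (assumed shape).
response = {
    "reviews": [
        {
            "reviewId": "12a3",
            "authorName": "Muhammad Arifin",
            "comments": [
                {
                    "userComment": {
                        "text": "sangat terbantu",
                        "starRating": 5,
                        "deviceMetadata": {"manufacturer": "Vivo"},
                    }
                },
                {"developerComment": {"text": "Terima kasih"}},
            ],
        }
    ]
}

# One flat dict per review, then one DataFrame row per flat dict.
rows = [flatten(review) for review in response["reviews"]]
flat_df = pd.DataFrame(rows)
print(flat_df.columns.tolist())
```

Because list positions end up in the column names (for example `comments.0.userComment.text`), the columns usually need renaming afterwards. Here the flat dictionaries are passed straight to `pd.DataFrame`, which avoids writing them to a file first.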

Answer 2

First, I reorganized the JSON file as shown below:

{
    "reviews": {
        "reviewId": "12a3",
        "authorName": "Muhammad Arifin",
        "comments": {
            "userComment": {
                "text": "\tsangat terbantu👍",
                "lastModified": {
                    "seconds": "1606819245",
                    "nanos": 835000000
                },
                "starRating": 5,
                "reviewerLanguage": "id",
                "device": "1601",
                "androidOsVersion": 23,
                "appVersionCode": 20365,
                "appVersionName": "5.2.73",
                "deviceMetadata": {
                    "productName": "1601 (1601)",
                    "manufacturer": "Vivo",
                    "deviceClass": "FORM_FACTOR_PHONE",
                    "nativePlatform": "ABI_ARM64_V8,ABI_ARM_V7,ABI_ARM",
                    "cpuModel": "MT6750",
                    "cpuMake": "Mediatek"
                }
            },
            "developerComment": {
                "text": "Terima kasih sudah berbagi, kami sangat senang menjadi bagian dalam pejalanan travel anda!",
                "lastModified": {
                    "seconds": "1606818598",
                    "nanos": 722000000
                }
            }
        },
        "tokenPagination": {
            "nextPageToken": "abc"
        }
    }
}

Then, in a Python file, I used some pandas functionality to manipulate the DataFrame.

import pandas as pd

# Load the reorganized JSON; the keys under "reviews" become the row index.
df = pd.read_json("data.json")

# Each nested value is a scalar, so it is broadcast to every row of the new column.
df['reviewId'] = df['reviews']['reviewId']
df['authorName'] = df['reviews']['authorName']
df['userComment_text'] = df['reviews']['comments']['userComment']['text']
df['userComment_lastModified'] = df['reviews']['comments']['userComment']['lastModified']['seconds']
df['starRating'] = df['reviews']['comments']['userComment']['starRating']
df['deviceMetadata.manufacturer'] = df['reviews']['comments']['userComment']['deviceMetadata']['manufacturer']
df['developerComment.text'] = df['reviews']['comments']['developerComment']['text']

print(df.head())

And here is my output:

                                                           reviews  ...                              developerComment.text
authorName                                         Muhammad Arifin  ...  Terima kasih sudah berbagi, kami sangat senang...
comments         {'userComment': {'text': ' sangat terbantu👍', ...  ...  Terima kasih sudah berbagi, kami sangat senang...
reviewId                                                      12a3  ...  Terima kasih sudah berbagi, kami sangat senang...
tokenPagination                           {'nextPageToken': 'abc'}  ...  Terima kasih sudah berbagi, kami sangat senang...

You can rearrange the rows as you wish. I did not edit them, since you did not give any information about the rows.
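If one row per review is what you are after, a possible alternative that leaves the API response in its original list form is sketched below. It assumes every element of "comments" is a dictionary with a single key (userComment or developerComment) and that a review has at most one of each; the `records` and `wanted` names are illustrative, and the sample is a trimmed copy of the response from the question. The comments are merged into the review's own dictionary so that 'pd.json_normalize' can flatten everything, and the columns are then selected and renamed to the names requested in the question.

```python
import pandas as pd

# Trimmed copy of the sample response from the question (assumed shape).
fetch_reviews_response = {
    "reviews": [
        {
            "reviewId": "12a3",
            "authorName": "Muhammad Arifin",
            "comments": [
                {
                    "userComment": {
                        "text": "\tsangat terbantu👍",
                        "lastModified": {"seconds": "1606819245", "nanos": 835000000},
                        "starRating": 5,
                        "deviceMetadata": {"manufacturer": "Vivo"},
                    }
                },
                {
                    "developerComment": {
                        "text": "Terima kasih sudah berbagi",
                        "lastModified": {"seconds": "1606818598", "nanos": 722000000},
                    }
                },
            ],
        }
    ],
    "tokenPagination": {"nextPageToken": "abc"},
}

records = []
for review in fetch_reviews_response["reviews"]:
    # Copy the top-level fields, leaving out the list of comments.
    record = {key: value for key, value in review.items() if key != "comments"}
    # Merge each one-key comment dict into the record
    # (a second userComment would overwrite the first).
    for comment in review.get("comments", []):
        record.update(comment)
    records.append(record)

# Nested dicts become dotted column names such as "userComment.text".
df = pd.json_normalize(records)

# Map the flattened names to the column names requested in the question.
wanted = {
    "reviewId": "reviewId",
    "authorName": "authorName",
    "userComment.text": "userComment_text",
    "userComment.lastModified.seconds": "userComment_lastModified",
    "userComment.starRating": "starRating",
    "userComment.deviceMetadata.manufacturer": "deviceMetadata.manufacturer",
    "developerComment.text": "developerComment.text",
}
# reindex keeps the code working when a review lacks one of the columns.
df = df.reindex(columns=list(wanted)).rename(columns=wanted)

# "seconds" arrives as a string of epoch seconds; convert it to a timestamp.
df["userComment_lastModified"] = pd.to_datetime(
    pd.to_numeric(df["userComment_lastModified"]), unit="s"
)
print(df)
```

Reviews without a developer reply simply get a missing value in developerComment.text, because 'reindex' creates any column that is absent.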

I hope it works for you.

Answer 1
0 2020-12-01 12:58:17

Answer 2
0 2020-12-01 13:34:03
