
Parts of JSON data getting lost when read into Pandas DataFrame?

I want to do some analysis on data I have collected from a game study. We store a timestamp, the input type, and then the metadata for the respective rounds that were played. We store it as JSON, and I wanted to load it into a Python script to generate some nice graphics with matplotlib. To use Pandas, I wanted to convert it into a .CSV for a DataFrame; however, some data seems to go missing when printing the DF.

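    import json

    import pandas as pd

    # Load the raw JSON export.
    with open('/Users/me/Downloads/databackup.json', 'r') as f:
        data = json.load(f)

    # Flatten one level: each entry of "gameList" becomes one row.
    multiple_level_data = pd.json_normalize(data, record_path=['gameList'],
                                            meta=[], meta_prefix='config_params_',
                                            record_prefix='dbscan_')

    # Round-trip through CSV to get the DataFrame.
    multiple_level_data.to_csv('GameData.csv', index=False)
    df = pd.read_csv("GameData.csv")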

This is what I use to convert the JSON into a CSV. Now, we create a new timestamp every time the player reached a score of 750 in his last x rounds. When there's only one round, the data for that timestamp shows up, but when there are two or more rounds per timestamp, the respective data of those rounds does not show up in my df. Did I choose the wrong record_path, or am I using the wrong method to convert this?

{
    "gameList": [
        {
            "startingTime": "20230125204032",
            "inputType": "joyStick",
            "Rounds": [
                {
                    "durationSeconds": 128,
                    "score": 492,
                    "platformCount": {
                        "normalPlatforms": 60,
                        "movingPlatforms": 41,            #this loads in
                        "powPlatforms": 5,
                        "normalEnemies": 5,
                        "movingEnemies": 8
                    }
                },
                {
                    "durationSeconds": 62,
                    "score": 258,
                    "platformCount": {
                        "normalPlatforms": 35,
                        "movingPlatforms": 23,             #this doesn't
                        "powPlatforms": 3,
                        "normalEnemies": 2,
                        "movingEnemies": 5
                    }
                }
            ]
        },

To follow a path down through the nesting levels, pass record_path as a list of keys. To pull in several fields from a parent level, as with meta here, give each field its own path, so that meta becomes a list of lists.

df = pd.json_normalize(
    data=data,
    record_path=["gameList", "Rounds"],
    meta_prefix="config_params_",
    record_prefix="dbscan_",
    meta=[["gameList", "startingTime"], ["gameList", "inputType"]]
)

Output:

   dbscan_durationSeconds  dbscan_score  dbscan_platformCount.normalPlatforms  dbscan_platformCount.movingPlatforms  dbscan_platformCount.powPlatforms  dbscan_platformCount.normalEnemies  dbscan_platformCount.movingEnemies config_params_gameList.startingTime config_params_gameList.inputType
0                     128           492                                    60                                    41                                  5                                   5                                   8                      20230125204032                         joyStick
1                      62           258                                    35                                    23                                  3                                   2                                   5                      20230125204032                         joyStick
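To see the difference between the two calls side by side, here is a minimal sketch. It assumes a trimmed, inline copy of the sample JSON from the question; the `data` literal and the names `per_game` and `per_round` are illustrative and not from the original post. With record_path=['gameList'], each game becomes one row and its rounds stay packed as a list in a single cell, which is likely why they appear to be missing after the CSV round trip.

```python
import pandas as pd

# Trimmed copy of the sample from the question (one timestamp, two rounds).
# Assumption: only two platformCount fields are kept to shorten the example.
data = {
    "gameList": [
        {
            "startingTime": "20230125204032",
            "inputType": "joyStick",
            "Rounds": [
                {"durationSeconds": 128, "score": 492,
                 "platformCount": {"normalPlatforms": 60, "movingPlatforms": 41}},
                {"durationSeconds": 62, "score": 258,
                 "platformCount": {"normalPlatforms": 35, "movingPlatforms": 23}},
            ],
        }
    ]
}

# Call from the question: one row per gameList entry.
# The "Rounds" value is not expanded; it stays a Python list in one cell.
per_game = pd.json_normalize(data, record_path=["gameList"],
                             record_prefix="dbscan_")

# Call from the answer: one row per round, parent fields repeated as meta.
per_round = pd.json_normalize(
    data,
    record_path=["gameList", "Rounds"],
    meta=[["gameList", "startingTime"], ["gameList", "inputType"]],
    meta_prefix="config_params_",
    record_prefix="dbscan_",
)

print(per_game.columns.tolist())
print(type(per_game.loc[0, "dbscan_Rounds"]))
print(per_round[["dbscan_score", "config_params_gameList.startingTime"]])
```

Comparing the shapes of `per_game` and `per_round` on the real file is a quick check that every round ended up in its own row.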
