
How to convert a list of multi-nested dictionaries to a pandas DataFrame and transform the DataFrame?

I need to convert a list of nested dictionaries (game_stats) to a pandas DataFrame. I tried to do so with "game_stats_df", but I get a DataFrame with an "id" column and a "teams" column that still holds a list of dictionaries.

print(game_stats)

[{'id': 401282099,
 'teams': [{'conference': 'SEC',
            'homeAway': 'away',
            'points': 21,
            'school': 'LSU',
            'stats': [{'category': 'rushingTDs', 'stat': '2'},
                      {'category': 'passingTDs', 'stat': '1'},
                      {'category': 'kickingPoints', 'stat': '3'},
                      {'category': 'fumblesRecovered', 'stat': '0'},
                      {'category': 'firstDowns', 'stat': '22'}]},
           {'conference': 'SEC',
            'homeAway': 'home',
            'points': 42,
            'school': 'Kentucky',
            'stats': [{'category': 'rushingTDs', 'stat': '3'},
                      {'category': 'passingTDs', 'stat': '4'},
                      {'category': 'kickingPoints', 'stat': '0'},
                      {'category': 'fumblesRecovered', 'stat': '1'},
                      {'category': 'firstDowns', 'stat': '24'}]}]}]
game_stats_df = pd.DataFrame.from_records([game.to_dict() for game in game_stats])
print(game_stats_df.head())

          id                                              teams
0  401282099  [{'school': 'LSU', 'conference': 'SEC', 'homeA...

Ideally, I am trying to get a DataFrame with the following format:

game_id    school   conference  homeAway    points  rushingTDs  passingTDs  etc
401282099   LSU      SEC          away       21        2            1

Use json_normalize:

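# One row per team, carrying the game id; then one row per stat entry
df = pd.json_normalize(game_stats, 'teams', 'id').explode('stats').reset_index()
# Expand each {'category', 'stat'} dict into its own columns
df = pd.concat([df, pd.json_normalize(df.pop('stats'))], axis=1)
# Pivot categories into columns, keyed by the team-level fields
df = df.pivot_table('stat', df.columns[1:-2].tolist(), 'category').reset_index()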

Output:

>>> df
category conference homeAway  points    school         id  firstDowns  fumblesRecovered  kickingPoints  passingTDs  rushingTDs
0               SEC     away      21       LSU  401282099        22.0               0.0            3.0         1.0         2.0
1               SEC     home      42  Kentucky  401282099        24.0               1.0            0.0         4.0         3.0
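
The 'stat' values in the question are strings, and pivot_table aggregates with a mean by default, so the result depends on how the installed pandas version handles non-numeric data. The following is a minimal, self-contained sketch of the same json_normalize idea that converts the values explicitly and uses pivot instead. It assumes pandas 1.1 or later and exactly one stat per (team, category) pair. The sample data is trimmed from the question, and the names teams, long_df, id_cols and wide are illustrative only.

```python
import pandas as pd

# Sample data trimmed from the question (two stat categories per team).
game_stats = [
    {
        "id": 401282099,
        "teams": [
            {"conference": "SEC", "homeAway": "away", "points": 21, "school": "LSU",
             "stats": [{"category": "rushingTDs", "stat": "2"},
                       {"category": "passingTDs", "stat": "1"}]},
            {"conference": "SEC", "homeAway": "home", "points": 42, "school": "Kentucky",
             "stats": [{"category": "rushingTDs", "stat": "3"},
                       {"category": "passingTDs", "stat": "4"}]},
        ],
    }
]

# One row per team; the game-level 'id' is carried along as metadata.
teams = pd.json_normalize(game_stats, record_path="teams", meta="id")

# One row per (team, stat entry); ignore_index gives a clean 0..n-1 index.
long_df = teams.explode("stats", ignore_index=True)

# Expand each {'category': ..., 'stat': ...} dict into two columns.
stats = pd.json_normalize(long_df.pop("stats").tolist())
long_df = pd.concat([long_df, stats], axis=1)

# The stats arrive as strings; convert them before reshaping.
long_df["stat"] = pd.to_numeric(long_df["stat"])

# Reshape so that each stat category becomes its own column.
id_cols = ["id", "school", "conference", "homeAway", "points"]
wide = (
    long_df.pivot(index=id_cols, columns="category", values="stat")
    .reset_index()
    .rename_axis(columns=None)          # drop the leftover 'category' axis label
    .rename(columns={"id": "game_id"})  # match the column name asked for
)

print(wide)
```

Because pivot does not aggregate, it raises a ValueError if a team has the same category twice. That makes bad input visible instead of silently averaging it.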
