
Flattening a dataframe with nested dictionary

     [
      {
        "match_hometeam_score": "2 ",
        "match_awayteam_score": " 0",    
        "statistics": [
              {
                "type": "Ball Possession",
                "home": "70%",
                "away": "30%"
              },
              {
                "type": "Goal Attempts",
                "home": "6",
                "away": "3"
              },
              {
                "type": "Shots on Goal",
                "home": "4",
                "away": "1"
              },
              {
                "type": "Shots off Goal",
                "home": "1",
                "away": "2"
              },
              {
                "type": "Blocked Shots",
                "home": "1",
                "away": "0"
              },
              {
                "type": "Free Kicks",
                "home": "10",
                "away": "12"
              },
              {
                "type": "Corner Kicks",
                "home": "5",
                "away": "2"
              },
              {
                "type": "Offsides",
                "home": "2",
                "away": "1"
              },
              {
                "type": "Goalkeeper Saves",
                "home": "1",
                "away": "2"
              },
              {
                "type": "Fouls",
                "home": "11",
                "away": "9"
              },
              {
                "type": "Yellow Cards",
                "home": "2",
                "away": "0"
              },
              {
                "type": "Total Passes",
                "home": "657",
                "away": "272"
              },
              {
                "type": "Tackles",
                "home": "11",
                "away": "18"
              }
            ]
          },
          .....
        ]

This is a small sample of the JSON file I have. What I want is to flatten it by extracting the values in the statistics column.

I tried

flat_matches = pd.concat([all_matches.drop(['statistics'],axis=1),all_matches['statistics'].apply(pd.Series)], axis=1)

It works to some extent, but not the way I hoped. I want to build my new df with these columns:

  1. index
  2. match_hometeam_score
  3. match_awayteam_score
  4. GoalAttempts_home
  5. GoalAttempts_away
  6. Shots_on_Goal_home
  7. Shots_on_Goal_away
  8. DangerousAttacks_home
  9. DangerousAttacks_away

The CSV data is as follows:

,match_hometeam_score,match_awayteam_score,statistics
0,3,1,"[{'type': 'Ball Possession', 'home': '44%', 'away': '56%'}, {'type': 'Goal Attempts', 'home': '15', 'away': '6'}, {'type': 'Shots on Goal', 'home': '5', 'away': '5'}, {'type': 'Shots off Goal', 'home': '9', 'away': '1'}, {'type': 'Blocked Shots', 'home': '1', 'away': '0'}, {'type': 'Corner Kicks', 'home': '3', 'away': '3'}, {'type': 'Offsides', 'home': '4', 'away': '2'}, {'type': 'Goalkeeper Saves', 'home': '4', 'away': '2'}, {'type': 'Fouls', 'home': '11', 'away': '10'}, {'type': 'Yellow Cards', 'home': '2', 'away': '4'}, {'type': 'Total Passes', 'home': '382', 'away': '503'}, {'type': 'Tackles', 'home': '13', 'away': '16'}, {'type': 'Attacks', 'home': '97', 'away': '136'}, {'type': 'Dangerous Attacks', 'home': '45', 'away': '63'}]"
1,1,2,"[{'type': 'Ball Possession', 'home': '61%', 'away': '39%'}, {'type': 'Goal Attempts', 'home': '22', 'away': '12'}, {'type': 'Shots on Goal', 'home': '10', 'away': '7'}, {'type': 'Shots off Goal', 'home': '6', 'away': '3'}, {'type': 'Blocked Shots', 'home': '6', 'away': '2'}, {'type': 'Corner Kicks', 'home': '7', 'away': '2'}, {'type': 'Offsides', 'home': '0', 'away': '2'}, {'type': 'Goalkeeper Saves', 'home': '5', 'away': '9'}, {'type': 'Fouls', 'home': '12', 'away': '13'}, {'type': 'Yellow Cards', 'home': '4', 'away': '4'}, {'type': 'Total Passes', 'home': '421', 'away': '271'}, {'type': 'Tackles', 'home': '14', 'away': '24'}, {'type': 'Attacks', 'home': '97', 'away': '86'}, {'type': 'Dangerous Attacks', 'home': '43', 'away': '46'}]"
2,1,2,"[{'type': 'Ball Possession', 'home': '48%', 'away': '52%'}, {'type': 'Goal Attempts', 'home': '16', 'away': '14'}, {'type': 'Shots on Goal', 'home': '4', 'away': '6'}, {'type': 'Shots off Goal', 'home': '6', 'away': '5'}, {'type': 'Blocked Shots', 'home': '6', 'away': '3'}, {'type': 'Corner Kicks', 'home': '4', 'away': '4'}, {'type': 'Offsides', 'home': '2', 'away': '6'}, {'type': 'Goalkeeper Saves', 'home': '4', 'away': '3'}, {'type': 'Fouls', 'home': '11', 'away': '14'}, {'type': 'Yellow Cards', 'home': '2', 'away': '7'}, {'type': 'Total Passes', 'home': '594', 'away': '643'}, {'type': 'Tackles', 'home': '24', 'away': '16'}, {'type': 'Attacks', 'home': '144', 'away': '130'}, {'type': 'Dangerous Attacks', 'home': '77', 'away': '36'}]"
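Note that in this CSV the statistics column holds the string form of a Python list (single-quoted keys and values), so it is not valid JSON and `json.loads` would reject it. The following is a minimal sketch of one way to turn those strings back into lists of dicts, assuming the file has the layout shown above. The `csv_text` sample is a shortened stand-in with two statistics per match, and `all_matches` is reused as the frame name only for illustration.

```python
import ast
import io

import pandas as pd

# Shortened stand-in for the CSV above (two statistics per match instead of fourteen).
csv_text = ''',match_hometeam_score,match_awayteam_score,statistics
0,3,1,"[{'type': 'Goal Attempts', 'home': '15', 'away': '6'}, {'type': 'Shots on Goal', 'home': '5', 'away': '5'}]"
1,1,2,"[{'type': 'Goal Attempts', 'home': '22', 'away': '12'}, {'type': 'Shots on Goal', 'home': '10', 'away': '7'}]"
'''

# The unnamed first column is the saved index.
all_matches = pd.read_csv(io.StringIO(csv_text), index_col=0)

# At this point each statistics cell is still a plain string.
print(type(all_matches.loc[0, 'statistics']))

# ast.literal_eval only evaluates Python literals (lists, dicts, strings, numbers),
# so it parses the single-quoted text without executing arbitrary code.
all_matches['statistics'] = all_matches['statistics'].apply(ast.literal_eval)

# Now each cell is a list of dicts again.
print(type(all_matches.loc[0, 'statistics']))
```

With a real file, `pd.read_csv(path, index_col=0)` would take the place of the `io.StringIO` stand-in.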

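Any kind of help is much appreciated. Please tell me how to flatten this JSON dataset to a single level. I am a beginner hobbyist, so if I can improve the quality of my question, please don't hesitate to give me tips.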

The result I want looks like this:

The following shows how to convert a single row of the dataframe. You need to iterate over the rows and build the dataframe as shown below.

import json
import pandas as pd

sample_row = [{'type': 'Ball Possession', 'home': '44%', 'away': '56%'}, {'type': 'Goal Attempts', 'home': '15', 'away': '6'}, {'type': 'Shots on Goal', 'home': '5', 'away': '5'}, {'type': 'Shots off Goal', 'home': '9', 'away': '1'}, {'type': 'Blocked Shots', 'home': '1', 'away': '0'}, {'type': 'Corner Kicks', 'home': '3', 'away': '3'}, {'type': 'Offsides', 'home': '4', 'away': '2'}, {'type': 'Goalkeeper Saves', 'home': '4', 'away': '2'}, {'type': 'Fouls', 'home': '11', 'away': '10'}, {'type': 'Yellow Cards', 'home': '2', 'away': '4'}, {'type': 'Total Passes', 'home': '382', 'away': '503'}, {'type': 'Tackles', 'home': '13', 'away': '16'}, {'type': 'Attacks', 'home': '97', 'away': '136'}, {'type': 'Dangerous Attacks', 'home': '45', 'away': '63'}]

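# Round-trip through JSON, then normalize the list of dicts:
# the result has one row per statistic with columns type, home, away.
js = json.dumps(sample_row)
df = pd.json_normalize(json.loads(js))

# Attach this match's scores to every statistic row (values from CSV row 0).
df['match_hometeam_score'] = [3] * len(df)
df['match_awayteam_score'] = [1] * len(df)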
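The snippet above yields one row per statistic, whereas the question asks for one row per match with a home and an away column for each statistic. The following is a minimal sketch of that wide layout, assuming each record holds a list of dicts under statistics. The helper name `flatten_match` is hypothetical, the sample records are shortened to three statistics each, and spaces in the statistic names are replaced with underscores, which is only one possible naming choice.

```python
import pandas as pd

# Two records shaped like the question's data, shortened to three statistics each.
matches = [
    {
        'match_hometeam_score': 3,
        'match_awayteam_score': 1,
        'statistics': [
            {'type': 'Goal Attempts', 'home': '15', 'away': '6'},
            {'type': 'Shots on Goal', 'home': '5', 'away': '5'},
            {'type': 'Dangerous Attacks', 'home': '45', 'away': '63'},
        ],
    },
    {
        'match_hometeam_score': 1,
        'match_awayteam_score': 2,
        'statistics': [
            {'type': 'Goal Attempts', 'home': '22', 'away': '12'},
            {'type': 'Shots on Goal', 'home': '10', 'away': '7'},
            {'type': 'Dangerous Attacks', 'home': '43', 'away': '46'},
        ],
    },
]


def flatten_match(match):
    """Hypothetical helper: turn one match record into a single flat dict."""
    # Keep every top-level field except the nested list.
    row = {key: value for key, value in match.items() if key != 'statistics'}
    for stat in match['statistics']:
        name = stat['type'].replace(' ', '_')  # 'Shots on Goal' -> 'Shots_on_Goal'
        row[f'{name}_home'] = stat['home']
        row[f'{name}_away'] = stat['away']
    return row


# One flat dict per match becomes one row per match.
flat_matches = pd.DataFrame([flatten_match(m) for m in matches])
print(flat_matches.columns.tolist())
```

The statistic values stay as strings here, including percentages such as '44%', so they would still need converting before any arithmetic. If a match lacks a statistic that other matches have, pandas fills the corresponding cells with NaN. For an existing dataframe whose statistics column already contains lists, `all_matches.to_dict('records')` could supply the records in place of the `matches` list.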

