Flattening a dataframe with nested dictionary
[
{
"match_hometeam_score": "2 ",
"match_awayteam_score": " 0",
"statistics": [
{
"type": "Ball Possession",
"home": "70%",
"away": "30%"
},
{
"type": "Goal Attempts",
"home": "6",
"away": "3"
},
{
"type": "Shots on Goal",
"home": "4",
"away": "1"
},
{
"type": "Shots off Goal",
"home": "1",
"away": "2"
},
{
"type": "Blocked Shots",
"home": "1",
"away": "0"
},
{
"type": "Free Kicks",
"home": "10",
"away": "12"
},
{
"type": "Corner Kicks",
"home": "5",
"away": "2"
},
{
"type": "Offsides",
"home": "2",
"away": "1"
},
{
"type": "Goalkeeper Saves",
"home": "1",
"away": "2"
},
{
"type": "Fouls",
"home": "11",
"away": "9"
},
{
"type": "Yellow Cards",
"home": "2",
"away": "0"
},
{
"type": "Total Passes",
"home": "657",
"away": "272"
},
{
"type": "Tackles",
"home": "11",
"away": "18"
}
]
},
.....
]
This is a small sample of the json file I have. What I want is to flatten it by extracting the values in the statistics column.
I tried
flat_matches = pd.concat([all_matches.drop(['statistics'],axis=1),all_matches['statistics'].apply(pd.Series)], axis=1)
It works after a fashion, but not the way I would like. I want to build my new df with columns;
The CSV is as follows;
,match_hometeam_score,match_awayteam_score,statistics
0,3,1,"[{'type': 'Ball Possession', 'home': '44%', 'away': '56%'}, {'type': 'Goal Attempts', 'home': '15', 'away': '6'}, {'type': 'Shots on Goal', 'home': '5', 'away': '5'}, {'type': 'Shots off Goal', 'home': '9', 'away': '1'}, {'type': 'Blocked Shots', 'home': '1', 'away': '0'}, {'type': 'Corner Kicks', 'home': '3', 'away': '3'}, {'type': 'Offsides', 'home': '4', 'away': '2'}, {'type': 'Goalkeeper Saves', 'home': '4', 'away': '2'}, {'type': 'Fouls', 'home': '11', 'away': '10'}, {'type': 'Yellow Cards', 'home': '2', 'away': '4'}, {'type': 'Total Passes', 'home': '382', 'away': '503'}, {'type': 'Tackles', 'home': '13', 'away': '16'}, {'type': 'Attacks', 'home': '97', 'away': '136'}, {'type': 'Dangerous Attacks', 'home': '45', 'away': '63'}]"
1,1,2,"[{'type': 'Ball Possession', 'home': '61%', 'away': '39%'}, {'type': 'Goal Attempts', 'home': '22', 'away': '12'}, {'type': 'Shots on Goal', 'home': '10', 'away': '7'}, {'type': 'Shots off Goal', 'home': '6', 'away': '3'}, {'type': 'Blocked Shots', 'home': '6', 'away': '2'}, {'type': 'Corner Kicks', 'home': '7', 'away': '2'}, {'type': 'Offsides', 'home': '0', 'away': '2'}, {'type': 'Goalkeeper Saves', 'home': '5', 'away': '9'}, {'type': 'Fouls', 'home': '12', 'away': '13'}, {'type': 'Yellow Cards', 'home': '4', 'away': '4'}, {'type': 'Total Passes', 'home': '421', 'away': '271'}, {'type': 'Tackles', 'home': '14', 'away': '24'}, {'type': 'Attacks', 'home': '97', 'away': '86'}, {'type': 'Dangerous Attacks', 'home': '43', 'away': '46'}]"
2,1,2,"[{'type': 'Ball Possession', 'home': '48%', 'away': '52%'}, {'type': 'Goal Attempts', 'home': '16', 'away': '14'}, {'type': 'Shots on Goal', 'home': '4', 'away': '6'}, {'type': 'Shots off Goal', 'home': '6', 'away': '5'}, {'type': 'Blocked Shots', 'home': '6', 'away': '3'}, {'type': 'Corner Kicks', 'home': '4', 'away': '4'}, {'type': 'Offsides', 'home': '2', 'away': '6'}, {'type': 'Goalkeeper Saves', 'home': '4', 'away': '3'}, {'type': 'Fouls', 'home': '11', 'away': '14'}, {'type': 'Yellow Cards', 'home': '2', 'away': '7'}, {'type': 'Total Passes', 'home': '594', 'away': '643'}, {'type': 'Tackles', 'home': '24', 'away': '16'}, {'type': 'Attacks', 'home': '144', 'away': '130'}, {'type': 'Dangerous Attacks', 'home': '77', 'away': '36'}]"
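Thank you very much for any help. Please tell me how to flatten this json dataset to a single level. I am a hobbyist and new to this, so if I can improve the quality of my question, please don't hesitate to give me tips.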
The following shows how to convert a single row of the dataframe. You need to iterate over the rows and build a dataframe for each one, as shown below.
import json
import pandas as pd
sample_row = [{'type': 'Ball Possession', 'home': '44%', 'away': '56%'}, {'type': 'Goal Attempts', 'home': '15', 'away': '6'}, {'type': 'Shots on Goal', 'home': '5', 'away': '5'}, {'type': 'Shots off Goal', 'home': '9', 'away': '1'}, {'type': 'Blocked Shots', 'home': '1', 'away': '0'}, {'type': 'Corner Kicks', 'home': '3', 'away': '3'}, {'type': 'Offsides', 'home': '4', 'away': '2'}, {'type': 'Goalkeeper Saves', 'home': '4', 'away': '2'}, {'type': 'Fouls', 'home': '11', 'away': '10'}, {'type': 'Yellow Cards', 'home': '2', 'away': '4'}, {'type': 'Total Passes', 'home': '382', 'away': '503'}, {'type': 'Tackles', 'home': '13', 'away': '16'}, {'type': 'Attacks', 'home': '97', 'away': '136'}, {'type': 'Dangerous Attacks', 'home': '45', 'away': '63'}]
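# Round-trip through JSON text; json_normalize also accepts the list of dicts directly.
js = json.dumps(sample_row)
df = pd.json_normalize(json.loads(js))
# Attach the match-level scores to every statistics row (values from row 0 of the CSV).
df['match_hometeam_score'] = [3] * len(df)
df['match_awayteam_score'] = [1] * len(df)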
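The iteration itself is not spelled out above, so here is a minimal sketch of one way it might look, producing one row per match instead of one row per statistic. The question does not list the desired column names, so the `<type>_home` / `<type>_away` naming is an assumption, as are the helper names `parse_statistics` and `flatten_matches`. The sample data is a trimmed copy of the first two CSV rows. The `ast.literal_eval` step is only needed when the frame has been read back from the CSV, where the statistics list arrives as a string.

```python
import ast

import pandas as pd

# Two shortened match records shaped like the question's data
# (values copied from the CSV sample; the statistics list is trimmed).
matches = [
    {
        "match_hometeam_score": "3",
        "match_awayteam_score": "1",
        "statistics": [
            {"type": "Ball Possession", "home": "44%", "away": "56%"},
            {"type": "Goal Attempts", "home": "15", "away": "6"},
        ],
    },
    {
        "match_hometeam_score": "1",
        "match_awayteam_score": "2",
        "statistics": [
            {"type": "Ball Possession", "home": "61%", "away": "39%"},
            {"type": "Goal Attempts", "home": "22", "away": "12"},
        ],
    },
]


def parse_statistics(value):
    """Hypothetical helper: return the statistics as a list of dicts.

    After a CSV round trip the list is stored as a Python-literal string
    (single quotes), which json.loads rejects but ast.literal_eval accepts.
    """
    if isinstance(value, str):
        return ast.literal_eval(value)
    return value


def flatten_matches(all_matches):
    """Hypothetical helper: build one wide row per match."""
    rows = []
    for record in all_matches.to_dict("records"):
        stats = parse_statistics(record.pop("statistics"))
        for stat in stats:
            # Assumed naming scheme: "<type>_home" and "<type>_away".
            record[f"{stat['type']}_home"] = stat["home"]
            record[f"{stat['type']}_away"] = stat["away"]
        rows.append(record)
    return pd.DataFrame(rows)


all_matches = pd.DataFrame(matches)
flat_matches = flatten_matches(all_matches)
print(flat_matches.columns.tolist())
```

Matches that lack a given statistic simply get NaN in that column, because `pd.DataFrame` aligns the per-match dicts by key. All values stay strings here; converting them to numbers (and stripping the `%` sign) would be a separate step.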