
Issue flattening a JSON file in Python

I have a JSON file that records the goals scored in each game, along with the minute of each goal. I tried to flatten it using the following code:

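import json
import pandas as pd

data_Loc = 'Season Fixtures.json'
with open(data_Loc) as data_file:
    d = json.load(data_file)
# In pandas >= 1.0 this function is also available as pd.json_normalize
df_Fixtures = pd.io.json.json_normalize(d, 'matches')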

The output is as follows:

[image: see here]

Then I converted the goals column to Series using:

df_goal = df_Fixtures.goals.apply(pd.Series)

and the output is as follows:

[image: enter image description here]

The resulting columns still contain nested dictionaries.

How can I flatten the goals column directly down to the period level?

The input data file can be downloaded from here

Can anyone advise me how to flatten down to the last level of the goals column? That is, the goals column should be converted into multiple columns such as period, minute, playerId, teamId, and type.

To include matchId, I created a new data frame as follows and combined it with the data frame produced by the code Jez suggested:

df_MatchID = pd.io.json.json_normalize(d,'matches')
df_MatchID = df_MatchID[['matchId']]
df_Fixtures_details = pd.concat([df_MatchID,df_Fixtures],axis =1)

The output is as follows (the other columns show NaN):

[image: enter image description here]

Thanks, Zep

I believe you need to pass the nested path ['matches', 'goals'] as the record path:

df_Fixtures = pd.io.json.json_normalize(d, ['matches','goals'])

print (df_Fixtures.head())
   minute      period  playerId  teamId  type
0      14   FirstHalf    206314    3161  goal
1      72  SecondHalf     20661    3204  goal
2      78  SecondHalf    206314    3161  goal
3       3   FirstHalf    300830    3187  goal
4      72  SecondHalf     21385    3187  goal
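
For readers without the data file, here is a minimal sketch of the same call on a small inline sample. The sample values and the exact nesting are assumptions based on the output above, not the real file, and pd.json_normalize is the top-level name of this function in pandas 1.0 and later:

```python
import pandas as pd

# Hypothetical sample mimicking the assumed layout of 'Season Fixtures.json'
d = {
    "matches": [
        {"matchId": 1, "goals": [
            {"minute": 14, "period": "FirstHalf", "playerId": 10, "teamId": 100, "type": "goal"},
            {"minute": 72, "period": "SecondHalf", "playerId": 11, "teamId": 200, "type": "goal"},
        ]},
        {"matchId": 2, "goals": [
            {"minute": 3, "period": "FirstHalf", "playerId": 12, "teamId": 300, "type": "goal"},
        ]},
    ]
}

# record_path walks d['matches'], then each match's 'goals' list,
# producing one row per goal with the goal's keys as columns
df_goals = pd.json_normalize(d, record_path=["matches", "goals"])
print(df_goals)
```

Each element of the record path names one level of nesting, so the list form reaches the goal dictionaries without a separate apply(pd.Series) step.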

EDIT: To keep matchId as well, load the 'matches' list and pass 'matchId' as meta:

data_Loc ='Season Fixtures.json'
with open(data_Loc) as data_file:    
    d= json.load(data_file)['matches'] 

df = pd.io.json.json_normalize(d, ['goals'],'matchId')

print (df.head())
   minute      period  playerId  teamId  type  matchId
0      14   FirstHalf    206314    3161  goal  2759508
1      72  SecondHalf     20661    3204  goal  2759508
2      78  SecondHalf    206314    3161  goal  2759508
3       3   FirstHalf    300830    3187  goal  2759507
4      72  SecondHalf     21385    3187  goal  2759507
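
The following sketch, on hypothetical inline data, contrasts the meta approach with the pd.concat attempt from the question. The NaN values there most likely come from concat(axis=1) aligning rows by index, so a frame with one row per match cannot line up with a frame that has one row per goal:

```python
import pandas as pd

# Hypothetical list of matches (the assumed content of d['matches'])
matches = [
    {"matchId": 1, "goals": [
        {"minute": 14, "period": "FirstHalf"},
        {"minute": 72, "period": "SecondHalf"},
    ]},
    {"matchId": 2, "goals": [
        {"minute": 3, "period": "FirstHalf"},
    ]},
]

# meta repeats the parent's matchId on every goal row of that match
df = pd.json_normalize(matches, record_path="goals", meta=["matchId"])

# For comparison: a 2-row match-level frame concatenated column-wise
# with a 3-row goal-level frame is aligned on the index, leaving NaN
df_ids = pd.DataFrame(matches)[["matchId"]]
df_goals = pd.json_normalize(matches, record_path="goals")
df_concat = pd.concat([df_ids, df_goals], axis=1)

print(df)
print(df_concat)
```

With meta, the link between a goal and its match is carried through the flattening itself, which avoids relying on row positions.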
