
Python Extract Nested JSON Values

I am trying to extract some nested values from a JSON file loaded from a URL. My code works, but it feels inefficient, and I suspect there is a better way to do it.

The JSON file looks like this after loading:

{
  "name": "English Premier League 2014/15",
  "rounds": [
    {
      "name": "Matchday 1",
      "matches": [
        {
          "date": "2014-08-16",
          "team1": {
            "key": "manutd",
            "name": "Manchester United",
            "code": "MUN"
          },
          "team2": {
            "key": "swansea",
            "name": "Swansea",
            "code": "SWA"
          },
          "score1": 1,
          "score2": 2
        } ...
        
}

My goal is to get the key (e.g. "manutd") and the score for both teams in every match. This is how I did it:

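import pandas as pd

# df holds the loaded JSON (a dict with a 'rounds' list)
team_recs = []
team1 = []
team2 = []
score1 = []
score2 = []
ind = range(38)  # one index per round

# Collect the list of matches for every round
for rec in df['rounds']:
    trec = rec['matches']
    team_recs.append(trec)

# Pull the team keys and scores out of every match
for i in ind:
    for j in team_recs[i]:
        team1.append(j['team1']['key'])
        team2.append(j['team2']['key'])
        score1.append(j['score1'])
        score2.append(j['score2'])

# Build a DataFrame with one row per match
data = pd.DataFrame([team1, score1, team2, score2]).T
data = data.rename(columns={0: 'team1', 1: 'score1', 2: 'team2', 3: 'score2'})
data = data.astype({'score1': 'int32', 'score2': 'int32'})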

Output (data):

|   | team1     | score1 | team2      | score2 |
|---|-----------|--------|------------|--------|
| 0 | manutd    | 1      | swansea    | 2      |
| 1 | leicester | 2      | everton    | 2      |
| 2 | qpr       | 0      | hull       | 1      |
| 3 | stoke     | 0      | astonvilla | 1      |

This is data on football matches in the EPL. Each team plays 38 matches in a season, so there are 38 rounds.
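The parallel lists and the hardcoded round count can also be avoided in plain Python. Below is a minimal sketch, assuming the loaded JSON is a dict shaped like the sample above; the name `league` and the one-round sample are stand-ins for illustration, not the real file.

```python
import pandas as pd

# Hypothetical stand-in for the loaded JSON: a single round with one match.
league = {
    "name": "English Premier League 2014/15",
    "rounds": [
        {
            "name": "Matchday 1",
            "matches": [
                {
                    "date": "2014-08-16",
                    "team1": {"key": "manutd", "name": "Manchester United", "code": "MUN"},
                    "team2": {"key": "swansea", "name": "Swansea", "code": "SWA"},
                    "score1": 1,
                    "score2": 2,
                }
            ],
        }
    ],
}

# One dict per match, walking every round without assuming how many there are.
rows = [
    {
        "team1": match["team1"]["key"],
        "score1": match["score1"],
        "team2": match["team2"]["key"],
        "score2": match["score2"],
    }
    for rnd in league["rounds"]
    for match in rnd["matches"]
]

data = pd.DataFrame(rows, columns=["team1", "score1", "team2", "score2"])
print(data)
```

Building a list of row dicts means the column names are set once, so the transpose and rename steps are no longer needed.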

As QuantumDreamer suggested in their comment, you can use json_normalize, as below:

import pandas as pd
import json

data = """{
  "name": "English Premier League 2014/15",
  "rounds": [
    {
      "name": "Matchday 1",
      "matches": [
        {
          "date": "2014-08-16",
          "team1": {
            "key": "manutd",
            "name": "Manchester United",
            "code": "MUN"
          },
          "team2": {
            "key": "swansea",
            "name": "Swansea",
            "code": "SWA"
          },
          "score1": 1,
          "score2": 2
        }
      ]
    }
  ]     
}"""

df = pd.json_normalize(
    json.loads(data),
    record_path=["rounds", "matches"],
    meta=[["rounds", "name"]]
)

This results in:

>>> print(df)
         date  score1  score2 team1.key         team1.name team1.code  \
0  2014-08-16       1       2    manutd  Manchester United        MUN   

  team2.key team2.name team2.code rounds.name  
0   swansea    Swansea        SWA  Matchday 1
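To get from this wide frame to the four columns asked for in the question, you can select and rename them. This is a rough sketch on a trimmed copy of the sample data; the names `raw`, `wanted` and `result` are only illustrative.

```python
import json
import pandas as pd

# Trimmed sample with the same shape as the JSON in the question.
raw = """{
  "name": "English Premier League 2014/15",
  "rounds": [
    {
      "name": "Matchday 1",
      "matches": [
        {
          "date": "2014-08-16",
          "team1": {"key": "manutd", "name": "Manchester United", "code": "MUN"},
          "team2": {"key": "swansea", "name": "Swansea", "code": "SWA"},
          "score1": 1,
          "score2": 2
        }
      ]
    }
  ]
}"""

# Flatten every match of every round; nested dicts become dotted column names.
df = pd.json_normalize(json.loads(raw), record_path=["rounds", "matches"])

# Map the flattened column names to the names used in the question.
wanted = {
    "team1.key": "team1",
    "score1": "score1",
    "team2.key": "team2",
    "score2": "score2",
}
result = df[list(wanted)].rename(columns=wanted)
print(result)
```

Selecting with `list(wanted)` also fixes the column order, so the result matches the layout of the table in the question.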
