
Dictionary data is not separated into columns in Pandas DataFrame

I have created a variable that stores my JSON data. It looks like this:

datasett = '''

{
  "data": {
    "trafficRegistrationPoints": [
      {
        "id": "99100B1687283",
        "name": "Menstad sykkeltellepunkt",
        "location": {
          "coordinates": {
            "latLon": {
              "lat": 59.173876,
              "lon": 9.641772
            }
          }
        }
      },
      {
        "id": "11101B1800681",
        "name": "Garpa - sykkel",
        "location": {
          "coordinates": {
            "latLon": {
              "lat": 63.795114,
              "lon": 11.494511
            }
          }
        }
      },
      {
        "id": "30961B1175469",
        "name": "STENMALEN-SYKKEL",
        "location": {
          "coordinates": {
            "latLon": {
              "lat": 59.27665,
              "lon": 10.411814
            }
          }
        }
      },
      {
        "id": "53749B1700621",
        "name": "TUNEVANNET SYKKEL",
        "location": {
          "coordinates": {
            "latLon": {
              "lat": 59.292846,
              "lon": 11.084058
            }
          }
        }
      },
      {
        "id": "80565B1689290",
        "name": "Nenset sykkeltellepunkt",
        "location": {
          "coordinates": {
            "latLon": {
              "lat": 59.168377,
              "lon": 9.634257
            }
          }
        }
      },
      {
        "id": "24783B2045151",
        "name": "Orstad sykkel- begge retn.",
        "location": {
          "coordinates": {
            "latLon": {
              "lat": 58.798377,
              "lon": 5.72743
            }
          }
        }
      },
      {
        "id": "46418B2616452",
        "name": "Elgeseter bru sykkel øst",
        "location": {
          "coordinates": {
            "latLon": {
              "lat": 63.425015,
              "lon": 10.393928
            }
          }
        }
      },
      {
        "id": "35978B1700571",
        "name": "Tune kirke nord",
        "location": {
          "coordinates": {
            "latLon": {
              "lat": 59.292626,
              "lon": 11.084066
            }
          }
        }
      },
      {
        "id": "21745B1996708",
        "name": "Munkedamsveien Sykkel",
        "location": {
          "coordinates": {
            "latLon": {
              "lat": 59.911198,
              "lon": 10.725568
            }
          }
        }
      },
      {
        "id": "33443B2542097",
        "name": "KANALBRUA-SYKKEL",
        "location": {
          "coordinates": {
            "latLon": {
              "lat": 59.261823,
              "lon": 10.416349
            }
          }
        }
      },
      {
        "id": "77570B384357",
        "name": "HAVRENESVEGEN (SYKKEL)",
        "location": {
          "coordinates": {
            "latLon": {
              "lat": 61.598202,
              "lon": 5.016999
            }
          }
        }
      },
      {
        "id": "95959B971385",
        "name": "JELØGATA SYKKEL",
        "location": {
          "coordinates": {
            "latLon": {
              "lat": 59.43385,
              "lon": 10.65388
            }
          }
        }
      },
      {
        "id": "61523B971803",
        "name": "ST.HANSFJELLET SYKKEL",
        "location": {
          "coordinates": {
            "latLon": {
              "lat": 59.218978,
              "lon": 10.93455
            }
          }
        }
      },

          }
        }
      }
    ]
  }
}
'''

Next, I used json.loads() to turn it into a Python dictionary, using the following code:

dict = json.loads(datasett)

Because the result is a nested dictionary, I want to move one level further into it:

movedDict = dict['data']

I then want to turn this into a Pandas DataFrame:

df = pd.DataFrame.from_dict(movedDict)

However, when I print this, the data is not separated into individual columns. What am I doing wrong?

You can use json_normalize here. I also removed some extra } from your JSON:

import json
import pandas as pd

data = json.loads(datasett)
df = pd.json_normalize(data, record_path=['data', 'trafficRegistrationPoints'])
print(df)

               id                        name  location.coordinates.latLon.lat  location.coordinates.latLon.lon
0   99100B1687283    Menstad sykkeltellepunkt                        59.173876                         9.641772
1   11101B1800681              Garpa - sykkel                        63.795114                        11.494511
2   30961B1175469            STENMALEN-SYKKEL                        59.276650                        10.411814
3   53749B1700621           TUNEVANNET SYKKEL                        59.292846                        11.084058
4   80565B1689290     Nenset sykkeltellepunkt                        59.168377                         9.634257
5   24783B2045151  Orstad sykkel- begge retn.                        58.798377                         5.727430
6   46418B2616452    Elgeseter bru sykkel øst                        63.425015                        10.393928
7   35978B1700571             Tune kirke nord                        59.292626                        11.084066
8   21745B1996708       Munkedamsveien Sykkel                        59.911198                        10.725568
9   33443B2542097            KANALBRUA-SYKKEL                        59.261823                        10.416349
10   77570B384357      HAVRENESVEGEN (SYKKEL)                        61.598202                         5.016999
11   95959B971385             JELØGATA SYKKEL                        59.433850                        10.653880
12   61523B971803       ST.HANSFJELLET SYKKEL                        59.218978                        10.934550
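As a self-contained sketch of the same approach, the snippet below uses only the first two records from the question (the variable name `sample` and the short column names `lat` and `lon` are my own choices, not part of the original post). It flattens the records with json_normalize and then renames the long dotted column names, which may be more convenient to work with:

```python
import json
import pandas as pd

# Two records copied from the question; `sample` is a hypothetical name.
sample = '''
{"data": {"trafficRegistrationPoints": [
  {"id": "99100B1687283", "name": "Menstad sykkeltellepunkt",
   "location": {"coordinates": {"latLon": {"lat": 59.173876, "lon": 9.641772}}}},
  {"id": "11101B1800681", "name": "Garpa - sykkel",
   "location": {"coordinates": {"latLon": {"lat": 63.795114, "lon": 11.494511}}}}
]}}
'''

data = json.loads(sample)

# record_path walks data -> trafficRegistrationPoints and flattens each record.
df = pd.json_normalize(data, record_path=["data", "trafficRegistrationPoints"])

# Optional: shorten the dotted column names produced by the flattening.
df = df.rename(columns={
    "location.coordinates.latLon.lat": "lat",
    "location.coordinates.latLon.lon": "lon",
})

print(df.columns.tolist())  # ['id', 'name', 'lat', 'lon']
```

The rename step is purely cosmetic; skip it if you prefer to keep the full path in the column names.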

When you use from_dict, the dict should look like this:

data = {'col_1': [3, 2, 1, 0], 'col_2': ['a', 'b', 'c', 'd']}
pd.DataFrame.from_dict(data)
   col_1 col_2
0      3     a
1      2     b
2      1     c
3      0     d

In your case:

data = {'trafficRegistrationPoints':[.....]}

Pull out the list stored under 'trafficRegistrationPoints' and then create the DataFrame from that list.

The value of the data key in your dict is not a collection of individual records but a dict holding a list of dicts under the trafficRegistrationPoints key, so you need to move one level further, into that key:

df = pd.DataFrame.from_dict(movedDict['trafficRegistrationPoints'])
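To illustrate the difference, here is a minimal sketch using two records from the question (the names `moved`, `one_col`, and `points` are hypothetical). It shows why the original call produced a single column, and that after moving into the list the `location` column still holds nested dicts, so lat and lon have to be pulled out in an extra step if you want them as separate columns:

```python
import pandas as pd

# Equivalent of movedDict from the question, reduced to two records.
moved = {
    "trafficRegistrationPoints": [
        {"id": "99100B1687283", "name": "Menstad sykkeltellepunkt",
         "location": {"coordinates": {"latLon": {"lat": 59.173876, "lon": 9.641772}}}},
        {"id": "11101B1800681", "name": "Garpa - sykkel",
         "location": {"coordinates": {"latLon": {"lat": 63.795114, "lon": 11.494511}}}},
    ]
}

# The only key becomes the only column; each cell holds a whole record dict.
one_col = pd.DataFrame.from_dict(moved)
print(one_col.shape)  # (2, 1)

# Passing the list of dicts gives one column per top-level key.
points = pd.DataFrame(moved["trafficRegistrationPoints"])
print(points.columns.tolist())  # ['id', 'name', 'location']

# 'location' is still a nested dict per row, so extract lat/lon by hand.
points["lat"] = [loc["coordinates"]["latLon"]["lat"] for loc in points["location"]]
points["lon"] = [loc["coordinates"]["latLon"]["lon"] for loc in points["location"]]
points = points.drop(columns="location")
print(points)
```

If the nesting is deeper or varies between records, json_normalize is likely the less error-prone option, since it does this flattening for you.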

