I am importing a json file into a python3 jupyter notebook. The json file has the format
I am importing the json file in this way:
import pandas as pd
import numpy as np
import json
from pandas.io.json import json_normalize
with open("rooms.json") as file:
    data = json.load(file)
df = json_normalize(data['rooms'])
I am now trying to plot each of the 6 dimensions against each other in a matrix-like format, with 36 total graphs.
I am trying to do this the following way:
col_features = ['fromBathroom', 'fromParking', 'dfromBathroom', 'dfromParking', 'depth', 'area']
pd.plotting.scatter_matrix(df[col_features], alpha = .2, figsize = (14,8))
This does not work, as I am getting an error that reads: KeyError: "['fromBathroom' 'fromParking' 'dfromBathroom' 'dfromParking'] not in index"
This is because those features are nested in 'turns' and 'distances' in the json file. Is there a way to un-nest these features so that I can index into the dataframe the same way I can for depth and area to get the values?
Thank you for any insights.
Maybe you could extract df1 = df['turns'], df2 = df['distances'] and df3 = df[['area', 'depth']],
and then do df4 = pd.concat([df1, df2, df3], join='inner', axis=1)
(see the pandas documentation for concat),
or directly: pd.concat([df['turns'], df['distances'], df[['area', 'depth']]], join='inner', axis=1)
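A minimal sketch of the un-nesting step, in case it helps. It assumes rooms.json has a top-level "rooms" list whose entries hold "turns" and "distances" as nested objects (the sample data below is made up, since the real file is not shown), and it uses pd.json_normalize, the pandas 1.0+ spelling of json_normalize. With that layout the nested keys typically come out as dotted column names such as turns.fromBathroom, so stripping the prefix is one way to get the plain feature names back:

```python
import pandas as pd

# Hypothetical sample standing in for rooms.json; the real layout is an assumption.
data = {
    "rooms": [
        {"turns": {"fromBathroom": 1, "fromParking": 2},
         "distances": {"dfromBathroom": 3.5, "dfromParking": 7.0},
         "depth": 2, "area": 12.5},
        {"turns": {"fromBathroom": 0, "fromParking": 4},
         "distances": {"dfromBathroom": 1.0, "dfromParking": 9.5},
         "depth": 3, "area": 20.0},
    ]
}

# json_normalize flattens nested dicts into "parent.child" column names.
df = pd.json_normalize(data["rooms"])

# Keep only the part after the last dot so the plain feature names can be used.
df.columns = [col.split(".")[-1] for col in df.columns]

col_features = ['fromBathroom', 'fromParking', 'dfromBathroom', 'dfromParking', 'depth', 'area']
features = df[col_features]
```

After this, pd.plotting.scatter_matrix(features, alpha=.2, figsize=(14, 8)) should find all six columns. Note that stripping the prefix is only safe while the leaf names stay unique across the nested groups.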
EDIT :
I tried something, I hope it is what you are looking for :
df1 = df['turns']
df2 = df['distances']
df3 = pd.DataFrame(df['depth'])
df4 = pd.DataFrame(df['area'])
df_recomposed = pd.concat([df1, df2, df3, df4], join='inner', axis=1)
Alternatively, see "Pandas - How to flatten a hierarchical index in columns", where
df.columns = [' '.join(col).strip() for col in df.columns.values]
should be what you are looking for.
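To illustrate the column-flattening idea on its own, here is a small sketch. It assumes the frame has two-level (MultiIndex) columns; the frame and its values are hypothetical and only there to show the mechanics:

```python
import pandas as pd

# Hypothetical frame with two-level columns (group, feature).
columns = pd.MultiIndex.from_tuples([
    ("turns", "fromBathroom"),
    ("turns", "fromParking"),
    ("distances", "dfromBathroom"),
])
df_multi = pd.DataFrame([[1, 2, 3.5], [0, 4, 1.0]], columns=columns)

# Join the levels of each column tuple into a single string label.
df_flat = df_multi.copy()
df_flat.columns = [' '.join(col).strip() for col in df_multi.columns.values]

# Alternatively, keep only the innermost level of each column.
df_inner = df_multi.copy()
df_inner.columns = df_multi.columns.get_level_values(-1)
```

The joined form keeps the group name in each label ("turns fromBathroom"), while the innermost-level form gives back the short names used in col_features.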