Is there a way to un-nest a pandas DataFrame in a Python 3 Jupyter notebook?
I am importing a JSON file into a Python 3 Jupyter notebook. The JSON file has the format
I am importing the JSON file in this way:
import pandas as pd
import numpy as np
import json
from pandas.io.json import json_normalize
with open("rooms.json") as file:
    data = json.load(file)
df = json_normalize(data['rooms'])
I am now trying to plot each of the 6 dimensions against each other in a matrix-like format, with 36 total graphs.
I am trying to do this the following way:
col_features = ['fromBathroom', 'fromParking', 'dfromBathroom', 'dfromParking', 'depth', 'area']
pd.plotting.scatter_matrix(df[col_features], alpha = .2, figsize = (14,8))
This does not work; I get an error that reads: KeyError: "['fromBathroom' 'fromParking' 'dfromBathroom' 'dfromParking'] not in index"
This is because those features are nested in 'turns' and 'distances' in the JSON file. Is there a way to un-nest these features so that I can index into the DataFrame the same way I can for depth and area to get the values?
Thank you for any insights.
Maybe you could extract the pieces separately:
df1 = df['turns']
df2 = df['distances']
df3 = df[['area', 'depth']]
and then combine them column-wise (see the pandas documentation for concat):
df4 = pd.concat([df1, df2, df3], join='inner', axis=1)
or do it directly in one call:
pd.concat([df['turns'], df['distances'], df[['area', 'depth']]], join='inner', axis=1)
EDIT:
I tried something, and I hope it is what you are looking for (link to the image with the code and the results I get with Jupyter):
df1 = df['turns']
df2 = df['distances']
df3 = pd.DataFrame(df['depth'])
df4 = pd.DataFrame(df['area'])
df_recomposed = pd.concat([df1, df2, df3, df4], join='inner', axis=1)
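If the frame comes straight from json_normalize, the nested keys may already be present as dotted column names such as 'turns.fromBathroom', in which case renaming the columns could be enough. The following is only a sketch under that assumption: the records are made up because the real rooms.json is not shown, and it uses pd.json_normalize, the top-level name available in pandas 1.0 and later.

```python
import pandas as pd

# Hypothetical records shaped like the question describes; the real
# rooms.json is not shown, so these keys and values are assumptions.
data = {
    "rooms": [
        {"turns": {"fromBathroom": 1, "fromParking": 2},
         "distances": {"dfromBathroom": 3.5, "dfromParking": 7.0},
         "depth": 2, "area": 12.0},
        {"turns": {"fromBathroom": 0, "fromParking": 4},
         "distances": {"dfromBathroom": 1.0, "dfromParking": 9.5},
         "depth": 3, "area": 20.0},
    ]
}

# json_normalize flattens nested dicts into dotted column names.
df = pd.json_normalize(data["rooms"])
print(list(df.columns))

# Keep only the part after the last dot so the inner keys become column names.
flat = df.rename(columns=lambda name: name.split(".")[-1])

col_features = ["fromBathroom", "fromParking", "dfromBathroom",
                "dfromParking", "depth", "area"]
features = flat[col_features]
print(features.shape)
```

The resulting features frame can then be passed to pd.plotting.scatter_matrix as in the question. Note that dropping the prefix would produce duplicate column names if two nested groups shared an inner key, so it is worth checking df.columns first.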
Alternatively, see "Pandas - How to flatten a hierarchical index in columns", where
df.columns = [' '.join(col).strip() for col in df.columns.values]
should be what you are looking for.
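To illustrate the flattening idiom, here is a minimal sketch on a made-up frame with two-level columns. The column tuples and values are assumptions for illustration only; whether the asker's frame actually has hierarchical columns depends on how it was built.

```python
import pandas as pd

# Hypothetical frame with two-level columns; columns that have no inner
# key use an empty string as their second level.
columns = pd.MultiIndex.from_tuples([
    ("turns", "fromBathroom"),
    ("turns", "fromParking"),
    ("distances", "dfromBathroom"),
    ("depth", ""),
    ("area", ""),
])
df = pd.DataFrame([[1, 2, 3.5, 2, 12.0],
                   [0, 4, 1.0, 3, 20.0]], columns=columns)

# Join the levels of each column tuple; strip() drops the trailing space
# left behind by an empty second level.
joined = [" ".join(col).strip() for col in df.columns.values]

# Variant: keep only the innermost non-empty level of each tuple.
innermost = [[level for level in col if level][-1] for col in df.columns.values]

# Assign whichever naming scheme suits the later indexing.
df.columns = innermost
print(list(df.columns))
```

The joined variant keeps the parent name in each label ('turns fromBathroom'), while the innermost variant yields the bare names used in col_features in the question.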