Is there a way to un-nest a pandas dataframe in a python3 jupyter notebook?

Question

I am importing a JSON file into a Python 3 Jupyter notebook. The JSON file has the following format:

object
- rooms [26 elements]
  - 0
    - turns
      - fromBathroom
      - fromParking
    - distances
      - dfromBathroom
      - dfromParking
    - depth
    - area
  - 1
    - .... etc.
- name

I am importing the JSON file in this way:

import pandas as pd
import numpy as np
import json
from pandas.io.json import json_normalize

with open("rooms.json") as file:
  data = json.load(file)
df = json_normalize(data['rooms'])

I am now trying to plot each of the 6 dimensions against every other in a matrix-like layout, 36 graphs in total.

I am trying to do this the following way:

col_features = ['fromBathroom', 'fromParking', 'dfromBathroom', 'dfromParking', 'depth', 'area']
pd.plotting.scatter_matrix(df[col_features], alpha = .2, figsize = (14,8))

This does not work. I get an error that reads: KeyError: "['fromBathroom' 'fromParking' 'dfromBathroom' 'dfromParking'] not in index"

This is because those features are nested under 'turns' and 'distances' in the JSON file. Is there a way to un-nest these features so that I can index into the dataframe for them the same way I can for depth and area?

Thank you for any insights.

Answer 1

Maybe you could extract df1 = df['turns'], df2 = df['distances'] and df3 = df[['area', 'depth']], and then do df4 = pd.concat([df1, df2, df3], join='inner', axis=1) (see the pandas documentation for concat).

Or directly: pd.concat([df['turns'], df['distances'], df[['area', 'depth']]], join='inner', axis=1)

EDIT:

I tried something; I hope it is what you are looking for:

(Link to an image with the code and the results I get in Jupyter.)

df1 = df['turns']
df2 = df['distances']
df3 = pd.DataFrame(df['depth'])
df4 = pd.DataFrame(df['area'])
df_recomposed = pd.concat([df1, df2, df3, df4], join='inner', axis=1)
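
One caveat if df comes straight from json_normalize as in the question: that function normally flattens nested objects into dot-separated column names (turns.fromBathroom, distances.dfromParking, and so on), in which case df['turns'] will not exist as a column. A minimal sketch of stripping those prefixes instead, assuming pandas 1.0 or later (where the function is exposed as pd.json_normalize) and two made-up rooms standing in for rooms.json:

```python
import pandas as pd

# Hypothetical stand-in for data['rooms']; the values are invented.
rooms = [
    {"turns": {"fromBathroom": 1, "fromParking": 3},
     "distances": {"dfromBathroom": 4.5, "dfromParking": 12.0},
     "depth": 2, "area": 20.0},
    {"turns": {"fromBathroom": 2, "fromParking": 1},
     "distances": {"dfromBathroom": 7.0, "dfromParking": 3.5},
     "depth": 1, "area": 35.5},
]

df = pd.json_normalize(rooms)
# Nested keys arrive as dotted names, e.g. 'turns.fromBathroom'.
print(list(df.columns))

# Keep only the part after the last dot: 'turns.fromBathroom' -> 'fromBathroom'.
df_flat = df.rename(columns=lambda name: name.split(".")[-1])

col_features = ['fromBathroom', 'fromParking', 'dfromBathroom',
                'dfromParking', 'depth', 'area']
print(df_flat[col_features])
```

With the renamed frame, the scatter_matrix call from the question should find all six columns. This only works cleanly if the leaf names are unique across the nested groups, which appears to be the case here.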

Or see "Pandas - How to flatten a hierarchical index in columns",

where df.columns = [' '.join(col).strip() for col in df.columns.values] should be what you are looking for.
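
That one-liner applies when the columns are a hierarchical (MultiIndex) index, so that each column label is a tuple of strings. A small sketch under that assumption, using an invented two-row frame, shows both the joined labels and an alternative that keeps only the innermost level:

```python
import pandas as pd

# Hypothetical frame with two-level columns; the values are invented.
columns = pd.MultiIndex.from_tuples([
    ("turns", "fromBathroom"), ("turns", "fromParking"),
    ("distances", "dfromBathroom"), ("distances", "dfromParking"),
])
df = pd.DataFrame([[1, 3, 4.5, 12.0], [2, 1, 7.0, 3.5]], columns=columns)

# Join the levels of each column tuple into a single string label.
joined = df.copy()
joined.columns = [' '.join(col).strip() for col in joined.columns.values]
print(list(joined.columns))

# Alternative: keep only the innermost level, so the original
# feature names can be used directly for indexing.
inner = df.copy()
inner.columns = inner.columns.get_level_values(-1)
print(list(inner.columns))
```

The joined form yields labels such as 'turns fromBathroom', so the col_features list from the question would need the same prefixes; the innermost-level form keeps the bare names.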

Solution 1
0 Accepted 2019-07-16 19:33:30
