Converting nested JSON to a pandas DataFrame in Python

Question

I have nested data in JSON. I have no problem taking data that isn't nested and converting it into a pandas DataFrame.

Where I am having trouble is when the data has multiple levels and I need to write an independent record for each of the JSON entries.

[
  {
    'type': 'text1',
    'key': ['key1'],
  },
  {
    'type': 'text2',
    'key': ['key1', 'key2'],
  },
  {
    'type': 'text3',
    'key': 'key',
  }
]

I used the following code to write this into a DataFrame.

 df = pd.DataFrame.from_dict(json)

Unfortunately, I need one record for each entry. If key has 2 elements in the array, 2 records need to be created, along with an additional column (key index). In other words, I want one row per element of key, with that element's index in a separate column.

Any help would be greatly appreciated, as I have been stuck on this for a while!

Answer 1

Use explode:

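import pandas as pd

json = [{'type': 'text1', 'key': ['key1']},
        {'type': 'text2', 'key': ['key1', 'key2']},
        {'type': 'text3', 'key': 'key'}]

# explode() gives each list element its own row; scalar values pass through unchanged.
# cumcount() then numbers the rows that share the same original index label.
df = pd.DataFrame(json).explode('key') \
       .assign(key_index=lambda x: x.groupby(level=0).cumcount())
print(df)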

# Output
    type   key  key_index
0  text1  key1          0
1  text2  key1          0
1  text2  key2          1
2  text3   key          0
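
If you would rather not depend on explode, or you want a fresh 0..n-1 index instead of the repeated index labels above, the same rows can be built by hand before they reach pandas. This is only a minimal sketch, assuming every record has a 'type' and a 'key' field and that 'key' is either a list or a single string. The names records, flatten_keys and flat_df are made up for illustration.

```python
import pandas as pd

# Same sample data as in the question (hypothetical variable name).
records = [{'type': 'text1', 'key': ['key1']},
           {'type': 'text2', 'key': ['key1', 'key2']},
           {'type': 'text3', 'key': 'key'}]


def flatten_keys(records):
    """Yield one row per key element, tagged with its position in the list."""
    for record in records:
        keys = record['key']
        # Treat a bare (non-list) value as a one-element list.
        if not isinstance(keys, list):
            keys = [keys]
        for i, k in enumerate(keys):
            yield {'type': record['type'], 'key': k, 'key_index': i}


flat_df = pd.DataFrame(list(flatten_keys(records)))
print(flat_df)
```

The explode version is shorter. This one trades brevity for an explicit place to handle odd inputs, such as a missing or empty key, if your real data has them.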

Answer 1: accepted, 2022-01-18 19:44:00
