Convert a list of dictionaries of dictionaries to a DataFrame

Question

I have a list of dictionaries, each containing a nested dictionary, that looks like this:

[{'a': 1, 'b': {'c': 1, 'd': 2, 'e': 3}, 'f': 4}, 
 {'a': 2, 'b': {'c': 2, 'd': 3, 'e': 4}, 'f': 3}, 
 {'a': 3, 'b': {'c': 3, 'd': 4, 'e': 5}, 'f': 2}, 
 {'a': 4, 'b': {'c': 4, 'd': 5, 'e': 6}, 'f': 1 }]

and the result should look like this:

     a    c    d    e    f
0    1    1    2    3    4
1    2    2    3    4    3
2    3    3    4    5    2
3    4    4    5    6    1

whereas the default pd.DataFrame(data) gives:

     a    b                           f
0    1    {'c': 1, 'd': 2, 'e': 3}    4
1    2    {'c': 2, 'd': 3, 'e': 4}    3
2    3    {'c': 3, 'd': 4, 'e': 5}    2
3    4    {'c': 4, 'd': 5, 'e': 6}    1

How can I do this with pandas? Thanks.

Answer 1

You need to flatten the nested JSON-like records, like this:

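import pandas as pd

data = [{'a': 1, 'b': {'c': 1, 'd': 2, 'e': 3}, 'f': 4},
        {'a': 2, 'b': {'c': 2, 'd': 3, 'e': 4}, 'f': 3},
        {'a': 3, 'b': {'c': 3, 'd': 4, 'e': 5}, 'f': 2},
        {'a': 4, 'b': {'c': 4, 'd': 5, 'e': 6}, 'f': 1}]

# json_normalize flattens nested dicts into dotted column names and already
# returns a DataFrame, so no extra DataFrame.from_dict call is needed.
# pd.json_normalize requires pandas 1.0+; on older versions use
# `from pandas.io.json import json_normalize` instead.
df = pd.json_normalize(data)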
df

# output (column order may vary with the pandas version):
   a  b.c  b.d  b.e  f
0  1    1    2    3  4
1  2    2    3    4  3
2  3    3    4    5  2
3  4    4    5    6  1

You can rename the columns once that is done.
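
As a minimal sketch of that renaming step (assuming pandas 1.0+ and that the only nested key is 'b', as in the question), the "b." prefix can be stripped with rename, and the columns sorted to match the layout asked for:

```python
import pandas as pd

data = [
    {'a': 1, 'b': {'c': 1, 'd': 2, 'e': 3}, 'f': 4},
    {'a': 2, 'b': {'c': 2, 'd': 3, 'e': 4}, 'f': 3},
]

df = pd.json_normalize(data)  # nested keys become 'b.c', 'b.d', 'b.e'

# Strip the leading 'b.' from flattened names; other names are left unchanged.
df = df.rename(columns=lambda name: name[2:] if name.startswith('b.') else name)

# Sort the columns alphabetically so the order does not depend on the pandas version.
df = df[sorted(df.columns)]
print(df)
```

If the nested key has a different name, the prefix in the lambda has to be adjusted to match.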

Answer 2

json_normalize is what you're looking for!

import pandas as pd

x = [{'a': 1, 'b': {'c': 1, 'd': 2, 'e': 3}, 'f': 4},
     {'a': 2, 'b': {'c': 2, 'd': 3, 'e': 4}, 'f': 3},
     {'a': 3, 'b': {'c': 3, 'd': 4, 'e': 5}, 'f': 2},
     {'a': 4, 'b': {'c': 4, 'd': 5, 'e': 6}, 'f': 1}]

sep = '::::'  # string that doesn't appear in column names

# pd.json_normalize requires pandas 1.0+; on older versions use
# `from pandas.io.json import json_normalize` instead.
frame = pd.json_normalize(x, sep=sep)
# Keep only the part after the last separator, e.g. 'b::::c' -> 'c'.
frame.columns = frame.columns.str.split(sep).str[-1]
print(frame)

Output:

   a  c  d  e  f
0  1  1  2  3  4
1  2  2  3  4  3
2  3  3  4  5  2
3  4  4  5  6  1
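
One caveat with keeping only the last segment: if a nested key has the same name as a top-level key, the result contains duplicate column names. The sketch below wraps the same approach in a helper that checks for this; the name flatten_keep_leaf_names is hypothetical and not part of pandas.

```python
import pandas as pd

def flatten_keep_leaf_names(records, sep='::::'):
    """Hypothetical helper: flatten nested dicts and keep only the leaf key names."""
    frame = pd.json_normalize(records, sep=sep)
    leaf_names = frame.columns.str.split(sep).str[-1]
    # Refuse to produce a frame with ambiguous (duplicate) column names.
    if leaf_names.duplicated().any():
        clashes = sorted(set(leaf_names[leaf_names.duplicated()]))
        raise ValueError(f"leaf names are not unique: {clashes}")
    frame.columns = leaf_names
    return frame

ok = flatten_keep_leaf_names([{'a': 1, 'b': {'c': 1}, 'f': 4}])
print(ok)
```

With input such as [{'a': 1, 'b': {'a': 2}}] the helper raises ValueError instead of silently returning two columns named 'a'.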

Answer 3

import pandas as pd

z = [{'a': 1, 'b': {'c': 1, 'd': 2, 'e': 3}, 'f': 4},
     {'a': 2, 'b': {'c': 2, 'd': 3, 'e': 4}, 'f': 3},
     {'a': 3, 'b': {'c': 3, 'd': 4, 'e': 5}, 'f': 2},
     {'a': 4, 'b': {'c': 4, 'd': 5, 'e': 6}, 'f': 1}]

# Build the frame as-is; column 'b' holds one dict per row.
step1 = pd.DataFrame(z)
column_with_dicts = 'b'
# Expand the dicts in that column into their own columns.
step2 = pd.DataFrame(list(step1[column_with_dicts]))
# Put the remaining columns and the expanded columns side by side.
step3 = pd.concat([step1.drop(columns=column_with_dicts), step2], axis=1)
# Sort the columns alphabetically: a, c, d, e, f.
step4 = step3.reindex(sorted(step3.columns), axis=1)
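
The same idea can be written more compactly with pop and join. This is only a sketch of an equivalent approach, assuming every row has a dict under 'b' and that no key inside 'b' clashes with a top-level column name:

```python
import pandas as pd

z = [{'a': 1, 'b': {'c': 1, 'd': 2, 'e': 3}, 'f': 4},
     {'a': 2, 'b': {'c': 2, 'd': 3, 'e': 4}, 'f': 3},
     {'a': 3, 'b': {'c': 3, 'd': 4, 'e': 5}, 'f': 2},
     {'a': 4, 'b': {'c': 4, 'd': 5, 'e': 6}, 'f': 1}]

df = pd.DataFrame(z)

# pop removes column 'b' from df and returns it as a Series of dicts.
nested = pd.DataFrame(df.pop('b').tolist(), index=df.index)

# Join the expanded columns back on the shared index, then sort the columns.
result = df.join(nested)
result = result[sorted(result.columns)]
print(result)
```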

3 answers

Answer 1
2 2018-05-03 18:06:55

Answer 2
1 accepted 2018-05-03 18:07:13

Answer 3
0 2018-05-03 18:08:09
