Dictionary of nested variable-length lists to pandas DataFrame

I have a dictionary that looks like this:

{ I1 : [['A',1],['B',2],['C',3]],
  I2 : [['B',2],['D',4]],
  I3 : [['A',2],['E',5]]
}

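That is, I have an index (the key) followed by a variable number of pairs. I want to create a pandas DataFrame with the same index as the dictionary, where the columns are the first values of the list pairs and the cell values are the second values, with NaN filling in the missing values (e.g. row I2 would contain NaN in column 'A'). Is there a neat way to do this?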

import pandas as pd

a={ 'I1' : [['A',1],['B',2],['C',3]],
    'I2' : [['B',2],['D',4]],
    'I3' : [['A',2],['E',5]]
  }


# Build one dict per row from each list of pairs.
# The keys are sorted by their numeric suffix (e.g. 'I3' -> 3), because
# sorting by the raw key string would put 'I15' before 'I4' in the more
# general case of having more than just 3 keys.
# Sorting is used because dict key order is not guaranteed before Python 3.7.

row_dict = [dict(a[idx]) for idx in sorted(a, key=lambda x: int(x[1:]))]

df=pd.DataFrame(row_dict)


    A   B   C   D   E
0   1   2   3 NaN NaN
1 NaN   2 NaN   4 NaN
2   2 NaN NaN NaN   5
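
The question asks for the dictionary keys as the row index, whereas the output above uses the default 0..2 index. A minimal sketch of one way to keep the labels, assuming the keys are strings and passing them through the index argument of the DataFrame constructor:

```python
import pandas as pd

a = {'I1': [['A', 1], ['B', 2], ['C', 3]],
     'I2': [['B', 2], ['D', 4]],
     'I3': [['A', 2], ['E', 5]]}

# One dict per row, built from the [column, value] pairs.
rows = [dict(pairs) for pairs in a.values()]

# Reuse the dictionary keys as row labels; keys and values are
# iterated in the same order, so labels and rows stay aligned.
df = pd.DataFrame(rows, index=list(a.keys()))

print(df)
print(df.loc['I2'])  # the row for key 'I2'; column 'A' is NaN here
```

Columns containing NaN are stored as floats, so recent pandas versions may print 1.0 rather than 1.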

Assuming I1, I2, I3 are strings, you can use the following:

import pandas as pd

a={ 'I1' : [['A',1],['B',2],['C',3]],
  'I2' : [['B',2],['D',4]],
  'I3' : [['A',2],['E',5]]
}

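# Row order follows the dict's iteration order, which is arbitrary
# before Python 3.7 (as in the output below) and insertion order afterwards.
df=pd.DataFrame([dict(val) for key,val in a.items()])
print(df)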

    A   B   C   D   E
0   1   2   3 NaN NaN
1   2 NaN NaN NaN   5
2 NaN   2 NaN   4 NaN

You can use @manu190455's solution, but sort the items first, using sorted with its key argument, before passing them to pandas.DataFrame:

import pandas as pd

d = { 'I1' : [['A',1],['B',2],['C',3]],
    'I2' : [['B',2],['D',4]],
    'I3' : [['A',2],['E',5]]}

sorted_d = sorted(d.items(), key = lambda x: x[0])

In [263]: sorted_d
Out[263]:
[('I1', [['A', 1], ['B', 2], ['C', 3]]),
 ('I2', [['B', 2], ['D', 4]]),
 ('I3', [['A', 2], ['E', 5]])]

df = pd.DataFrame([dict(val) for key, val in sorted_d])

In [265]: df
Out[265]:
    A   B   C   D   E
0   1   2   3 NaN NaN
1 NaN   2 NaN   4 NaN
2   2 NaN NaN NaN   5
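
Sorting on the raw key string works for I1 to I3, but plain string comparison would misplace keys with more digits. A small sketch of the difference, using hypothetical keys 'I4' and 'I15' that are not in the original data, and assuming every key is a single letter followed by an integer:

```python
import pandas as pd

# Hypothetical data with a two-digit key, to show the ordering problem.
d = {
    'I15': [['A', 2], ['E', 5]],
    'I1': [['A', 1], ['B', 2]],
    'I4': [['B', 2], ['D', 4]],
}

lexicographic = sorted(d)                       # compares strings
numeric = sorted(d, key=lambda k: int(k[1:]))   # compares the numeric suffix

print(lexicographic)  # ['I1', 'I15', 'I4']
print(numeric)        # ['I1', 'I4', 'I15']

# Build the rows in numeric key order and keep the keys as row labels.
df = pd.DataFrame([dict(d[k]) for k in numeric], index=numeric)
print(df)
```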
