Pandas: Convert DataFrame with MultiIndex to dict
Another novice pandas question. I want to convert a DataFrame to a dictionary, but in a way different from what the DataFrame.to_dict() function offers. Explanation by example:
import pandas as pd

df = pd.DataFrame({'co': ['DE', 'DE', 'FR', 'FR'],
                   'tp': ['Lake', 'Forest', 'Lake', 'Forest'],
                   'area': [10, 20, 30, 40],
                   'count': [7, 5, 2, 3]})
# Use the first two columns as a two-level row index
df = df.set_index(['co', 'tp'])
Before:
           area  count
co tp
DE Lake      10      7
   Forest    20      5
FR Lake      30      2
   Forest    40      3
After:
{('DE', 'Lake', 'area'): 10,
('DE', 'Lake', 'count'): 7,
('DE', 'Forest', 'area'): 20,
...
('FR', 'Forest', 'count'): 3 }
The dict keys should be tuples consisting of the row's index entries plus the column title, while the dict values should be the individual DataFrame values. For the example above, I managed to find this expression:
# .ix was removed in pandas 1.0; .loc does the same label-based lookup here
after = {(r[0], r[1], c): df.loc[r, c] for c in df.columns for r in df.index}
How can I generalize this code to work for MultiIndexes with N levels (instead of 2)?
Answer
Thanks to DSM's answer, I found that I actually just need to use tuple concatenation, r + (c,), and my 2-dimensional loop above becomes N-dimensional:
# .ix was removed in pandas 1.0; .loc does the same label-based lookup here
after = {r + (c,): df.loc[r, c] for c in df.columns for r in df.index}
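To check that the tuple-concatenation idea really is independent of the number of levels, here is a minimal sketch on a three-level index. The 'season' level and the as_tuple helper are invented for illustration and are not part of the original question. The sketch uses .loc, since .ix no longer exists in current pandas. It also shows one caveat: with a plain single-level index, each r is a scalar rather than a tuple, so it has to be wrapped before concatenating.

```python
import pandas as pd

# Hypothetical three-level example; the 'season' level is made up for this sketch.
df3 = pd.DataFrame({
    'co': ['DE', 'DE', 'FR', 'FR'],
    'tp': ['Lake', 'Forest', 'Lake', 'Forest'],
    'season': ['summer', 'winter', 'summer', 'winter'],
    'area': [10, 20, 30, 40],
    'count': [7, 5, 2, 3],
}).set_index(['co', 'tp', 'season'])

# Each r is a tuple with one entry per index level, so r + (c,) works for any N.
after3 = {r + (c,): df3.loc[r, c] for c in df3.columns for r in df3.index}


def as_tuple(key):
    """Hypothetical helper: wrap a scalar index label in a 1-tuple."""
    return key if isinstance(key, tuple) else (key,)


# With a single-level index the labels are scalars, so wrap them first.
df1 = df3.reset_index(drop=True)
after1 = {as_tuple(r) + (c,): df1.loc[r, c] for c in df1.columns for r in df1.index}
```

Note that df.loc[r, c] only returns a single value when the row index is unique; with duplicate index entries it would return a Series instead.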
How about:
>>> df
           area  count
co tp
DE Lake      10      7
   Forest    20      5
FR Lake      30      2
   Forest    40      3
>>> after = {r + (k,): v for r, kv in df.iterrows() for k,v in kv.to_dict().items()}
>>> import pprint
>>> pprint.pprint(after)
{('DE', 'Forest', 'area'): 20,
('DE', 'Forest', 'count'): 5,
('DE', 'Lake', 'area'): 10,
('DE', 'Lake', 'count'): 7,
('FR', 'Forest', 'area'): 40,
('FR', 'Forest', 'count'): 3,
('FR', 'Lake', 'area'): 30,
('FR', 'Lake', 'count'): 2}
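A possible alternative that avoids the explicit loops, offered as a sketch rather than as part of the original answers: DataFrame.stack() moves the column labels into a new innermost index level, and Series.to_dict() on the result then yields tuple keys. This assumes single-level columns and no missing values, because how stack() treats NaN depends on the pandas version.

```python
import pandas as pd

df = pd.DataFrame({'co': ['DE', 'DE', 'FR', 'FR'],
                   'tp': ['Lake', 'Forest', 'Lake', 'Forest'],
                   'area': [10, 20, 30, 40],
                   'count': [7, 5, 2, 3]}).set_index(['co', 'tp'])

# stack() turns the columns into an extra index level, giving a Series
# indexed by (co, tp, column name).
stacked = df.stack()

# A Series with a MultiIndex converts to a dict keyed by tuples.
after = stacked.to_dict()
```

Because the row index levels are carried over as they are, the same two calls should work for any number of index levels.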