Iteratively concatenate pandas dataframe with multiindex

I am iteratively processing a couple of "groups" and I would like to add them all to a single dataframe, with each group identified by a 2nd-level index.

This:

print(pd.concat([df1, df2, df3], keys=["A", "B", "C"]))

was suggested to me, but it doesn't play well with iteration.

I am currently doing:

data_all = pd.DataFrame([])
for a in a_list:
    group = some.function(a, etc)
    group = group.set_index(['CoI'], append=True, drop=True)
    group = group.reorder_levels(['CoI','oldindex'])
    data_all = pd.concat([data_all, group], ignore_index=False)

But the last line totally destroys my multi-index and I cannot reconstruct it.

Can you give me a hand?

You should be able to just make data_all a list and concatenate once at the end:

data_all = []
for a in a_list:
    group = some.function(a, etc)
    group = group.set_index(['CoI'], append=True, drop=True)
    group = group.reorder_levels(['CoI','oldindex'])
    data_all.append(group)

data_all = pd.concat(data_all, ignore_index=False)
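
For completeness, here is a minimal, self-contained sketch of the keys= variant from the original suggestion, which builds the outer index level directly instead of calling set_index/reorder_levels on each group. It assumes each group can be labeled by a single value known in the loop (here the value of a itself); if the label lives in a column like CoI inside the group, the set_index approach above is the way to go. make_group and the dummy data are made-up stand-ins for the question's some.function(a, etc):

import pandas as pd

def make_group(a):
    # Dummy group; in the question this would be some.function(a, etc).
    return pd.DataFrame({"value": [a * 10, a * 10 + 1]})

a_list = [1, 2, 3]
groups = [make_group(a) for a in a_list]

# keys= builds the outer index level from the group labels; names= labels
# both levels of the resulting MultiIndex, reusing the question's
# 'CoI'/'oldindex' names for illustration.
data_all = pd.concat(groups, keys=a_list, names=["CoI", "oldindex"])
print(data_all)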

Also keep in mind that pandas' concat works with iterators. Something like yield group may be more efficient than appending to a list each time. I haven't profiled it though!
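
A rough sketch of that generator idea, since pd.concat accepts any iterable of DataFrames, including a generator. make_group and gen_groups are again made-up stand-ins (this time each dummy group carries a 'CoI' column, as in the question):

import pandas as pd

def make_group(a):
    # Dummy group with a 'CoI' column, standing in for some.function(a, etc).
    df = pd.DataFrame({"CoI": [a, a], "value": [a * 10, a * 10 + 1]})
    df.index.name = "oldindex"
    return df

a_list = [1, 2, 3]

def gen_groups():
    # Yield each prepared group instead of appending it to a list.
    for a in a_list:
        group = make_group(a)
        group = group.set_index(['CoI'], append=True, drop=True)
        yield group.reorder_levels(['CoI', 'oldindex'])

data_all = pd.concat(gen_groups(), ignore_index=False)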
