Iterating through MultiIndex data in python pandas

Question

I want to be able to iterate through a pandas DataFrame with grouping on a multi-index. Here, I'd like to process the rows of each industry together as a group. I load the data with a multi-index:

from io import StringIO  # Python 3 location of StringIO

import pandas as pd

data = """industry,location,number
retail,brazil,294
technology,china,100
retail,nyc,2913
retail,paris,382
technology,us,2182
"""

# Use both columns as index levels so the frame gets a MultiIndex
df = pd.read_csv(StringIO(data), sep=",", index_col=['industry', 'location'])

So I wish there was something to this effect:

for industry, rows in df.iter_multiindex():
    for row in rows:
        process_row(row)

Is there a way to do this?

Answer 1

You can group by the first level of the multi-index (the industries), and then iterate through the groups:

In [102]: for name, group in df.groupby(level='industry'):
   .....:     print(name)
   .....:     print(group, '\n')
   .....:
retail
                   number
industry location
retail   brazil       294
         nyc         2913
         paris        382

technology
                     number
industry   location
technology china        100
           us          2182

On each iteration, group is a DataFrame, and you can then iterate through it (e.g. with for index, row in group.iterrows(), which yields (index, row) pairs).
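The nested loop described above could look roughly like the following. This is a minimal sketch that assumes the sample data from the question; process_row and the seen list are hypothetical stand-ins for whatever per-row work is actually needed.

```python
from io import StringIO

import pandas as pd

data = """industry,location,number
retail,brazil,294
technology,china,100
retail,nyc,2913
retail,paris,382
technology,us,2182
"""
df = pd.read_csv(StringIO(data), index_col=["industry", "location"])

seen = []


def process_row(industry, location, row):
    # Hypothetical placeholder: just record what was visited.
    seen.append((industry, location, int(row["number"])))


for industry, rows in df.groupby(level="industry"):
    # Each group keeps both index levels, so iterrows() yields
    # ((industry, location), Series) pairs.
    for (_, location), row in rows.iterrows():
        process_row(industry, location, row)
```

Groups come back sorted by industry name, while rows inside each group keep their original order.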

But in most cases such iteration is not needed! What would process_row entail? You can probably do this in a vectorized manner, directly on the groupby object.
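As an illustration of the vectorized route, here is a sketch that assumes the per-row work is something like summing number per industry or computing each row's share of its industry total. Those two computations are assumptions chosen for the example, since the question does not say what process_row does.

```python
from io import StringIO

import pandas as pd

data = """industry,location,number
retail,brazil,294
technology,china,100
retail,nyc,2913
retail,paris,382
technology,us,2182
"""
df = pd.read_csv(StringIO(data), index_col=["industry", "location"])

grouped = df.groupby(level="industry")["number"]

# One aggregated value per industry, no explicit loop.
totals = grouped.sum()

# transform() broadcasts the group total back onto every row,
# so the division is aligned row by row.
share = df["number"] / grouped.transform("sum")
```

Here totals is indexed by industry, while share keeps the original (industry, location) index.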

Answer 2

I'm not sure why you want to do this, but you can do it like this:

for x in df.index:
    print(x[0])  # industry (first level of the index tuple)
    process(df.loc[x])  # row

But this is not how you usually work with a DataFrame; you probably want to read about apply() (the Essential Basic Functionality section of the pandas documentation is also really helpful).
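For comparison, a row-wise apply() might look like the sketch below. It assumes the sample data from the question, and describe is a hypothetical function invented for the example.

```python
from io import StringIO

import pandas as pd

data = """industry,location,number
retail,brazil,294
technology,china,100
retail,nyc,2913
retail,paris,382
technology,us,2182
"""
df = pd.read_csv(StringIO(data), index_col=["industry", "location"])


def describe(row):
    # With axis=1 each row arrives as a Series whose .name is the
    # (industry, location) index tuple.
    industry, location = row.name
    return f"{industry}/{location}: {row['number']}"


# Hypothetical per-row function applied without an explicit loop.
labels = df.apply(describe, axis=1)
```

The result is a Series with the same multi-index as df, one label per row.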

Answer 1
1, accepted, 2014-12-03 20:29:57

Answer 2
0, 2014-12-03 19:57:05
