
Pandas reset inner level of MultiIndex

I have a DataFrame in the following format:

                   col1    col2
ID          Date
 1    1993-12-31      4       6
      1994-12-31      8       5
      1995-12-31      4       7
      1996-12-31      3       3
 2    2000-12-31      7       8
      2001-12-31      5       9
      2002-12-31      8       4

I want to reset the 'Date' index level so that it restarts from 0 within each ID, giving the following:

             col1    col2
ID    Date
 1       0      4       6
         1      8       5
         2      4       7
         3      3       3
 2       0      7       8
         1      5       9
         2      8       4

I thought that simply calling df.reset_index(level='Date', inplace=True, drop=True) would do it, but it does not.
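For anyone who wants to reproduce this, here is a minimal sketch that rebuilds the frame above (values copied from the question) and shows what that call does. It uses the non-inplace form for clarity; with drop=True the 'Date' level is removed altogether rather than renumbered.

```python
import pandas as pd

# Rebuild the example frame from the question.
dates = pd.to_datetime([
    "1993-12-31", "1994-12-31", "1995-12-31", "1996-12-31",
    "2000-12-31", "2001-12-31", "2002-12-31",
])
ids = [1, 1, 1, 1, 2, 2, 2]
df = pd.DataFrame(
    {"col1": [4, 8, 4, 3, 7, 5, 8], "col2": [6, 5, 7, 3, 8, 9, 4]},
    index=pd.MultiIndex.from_arrays([ids, dates], names=["ID", "Date"]),
)

# drop=True discards the 'Date' level instead of turning it into a counter.
dropped = df.reset_index(level="Date", drop=True)
print(dropped.index.names)  # only 'ID' is left
```

So the result is indexed by ID alone, which is why a per-group counter has to be built separately, as the answers do.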

Using set_index and cumcount:

tmp = df.reset_index('Date', drop=True)
tmp.set_index(df.groupby(level=0).cumcount().rename('Date'), append=True)

         col1  col2
ID Date
1  0        4     6
   1        8     5
   2        4     7
   3        3     3
2  0        7     8
   1        5     9
   2        8     4

You can group by ID, then reset the index on each group using apply:

new_df = (df.groupby(df.index.get_level_values('ID'))
          .apply(lambda x: x.reset_index()).drop(columns=['ID', 'Date']))

# name the two index levels produced by groupby + apply
new_df.index = new_df.index.rename(['ID', 'Date'])

>>> new_df
         col1  col2
ID Date            
1  0        4     6
   1        8     5
   2        4     7
   3        3     3
2  0        7     8
   1        5     9
   2        8     4

Using pd.MultiIndex.from_arrays and groupby + cumcount:

df.index = pd.MultiIndex.from_arrays(
    [df.index.get_level_values(0), df.groupby(level=0).cumcount()],
    names=['ID', 'Date'])

df
         col1  col2
ID Date            
1  0        4     6
   1        8     5
   2        4     7
   3        3     3
2  0        7     8
   1        5     9
   2        8     4

This won't generalise to N levels, but there should be a df.index.set_levels equivalent that I'm forgetting...
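One possible way to extend the same idea to N levels is to group by every outer level and rebuild the index from all of them plus the counter. This is only a sketch: the helper name `renumber_inner_level` and the demo data are made up for illustration, and it assumes the index has at least two levels.

```python
import pandas as pd

def renumber_inner_level(frame):
    """Hypothetical helper: replace the innermost index level with a
    counter that restarts for every combination of the outer levels."""
    idx = frame.index
    outer = list(range(idx.nlevels - 1))  # positions of all outer levels
    counter = frame.groupby(level=outer).cumcount()
    arrays = [idx.get_level_values(i) for i in outer] + [counter.to_numpy()]
    out = frame.copy()
    out.index = pd.MultiIndex.from_arrays(arrays, names=idx.names)
    return out

# Made-up three-level example.
idx = pd.MultiIndex.from_tuples(
    [("a", 1, 10), ("a", 1, 20), ("a", 2, 30), ("b", 1, 40), ("b", 1, 50)],
    names=["k1", "k2", "k3"],
)
demo = pd.DataFrame({"val": range(5)}, index=idx)
result = renumber_inner_level(demo)
print(result.index.tolist())
```

The helper returns a copy rather than assigning to the caller's index, so the original frame is left untouched.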

New Answer

Not as cool as the old answer, but I'd rather be accurate than cool.

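from collections import defaultdict
from itertools import count
import pandas as pd

# one independent counter per value of the outer level
d = defaultdict(count)

lbl = []
for a, *_ in df.index.values:
    lbl.append(next(d[a]))

lvl = pd.RangeIndex(max(lbl) + 1)

# set_labels was renamed to set_codes in pandas 0.24
df.set_index(df.index.set_codes(lbl, level=1).set_levels(lvl, level=1))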

         col1  col2
ID Date            
1  0        4     6
   1        8     5
   2        4     7
   3        3     3
2  0        7     8
   1        5     9
   2        8     4

OLD ANSWER

Do Not Use

I misread the question. I didn't see that the new index needed to reset for every group.

Hopefully useful to someone.

You can use pandas.MultiIndex.set_levels

n = 1
lvl = df.index.levels[n]
new_lvl = pd.RangeIndex(len(lvl))
# level is passed by keyword, as recent pandas versions require
new_idx = df.index.set_levels(new_lvl, level=n)
df.set_index(new_idx)

         col1  col2
ID Date            
1  0        4     6
   1        8     5
   2        4     7
   3        3     3
2  4        7     8
   5        5     9
   6        8     4

One-liner

Yay! \o/

df.set_index(df.index.set_levels(pd.RangeIndex(len(df.index.levels[1])), level=1))

         col1  col2
ID Date            
1  0        4     6
   1        8     5
   2        4     7
   3        3     3
2  4        7     8
   5        5     9
   6        8     4

In place

# recent pandas versions no longer accept inplace=True here, so assign the result back
df.index = df.index.set_levels(pd.RangeIndex(len(df.index.levels[1])), level=1)
df

         col1  col2
ID Date            
1  0        4     6
   1        8     5
   2        4     7
   3        3     3
2  4        7     8
   5        5     9
   6        8     4

Try this:

df.groupby(level=0).apply(lambda _group: _group.reset_index(drop=True))

*** Version warning ***

  • The following behavior was tested on pandas version "1.1.2".

  • According to the pandas release notes:

    -> it seems that a bug fix in version 1.3.0 may affect this method; see the bug fixes listed there.

Example:

Let's create a MultiIndex DataFrame by concatenating a dictionary of two DataFrames, so that each dictionary key becomes a value of the outer index level.

import pandas as pd
import numpy as np

raw_df = pd.concat({'First':pd.DataFrame(np.random.rand(4,4),index=range(4)),
                    'Second':pd.DataFrame(np.random.rand(4,4),index=range(41,45))})


Result:

result_df = raw_df.groupby(level=0).apply(lambda _group:_group.reset_index(drop=True))
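Given the version caveat above, it may be worth checking the resulting index on whichever pandas version is installed. The sketch below repeats the example with seeded random data so it is reproducible; the seed and the explicit group_keys=True are additions made here, not part of the original answer.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)  # seeded so the frame is reproducible
raw_df = pd.concat({
    "First": pd.DataFrame(rng.random((4, 4)), index=range(4)),
    "Second": pd.DataFrame(rng.random((4, 4)), index=range(41, 45)),
})

# group_keys=True (stated explicitly here) keeps the group key as the outer level.
result_df = raw_df.groupby(level=0, group_keys=True).apply(
    lambda _group: _group.reset_index(drop=True)
)
print(result_df.index.tolist())
```

If the inner level does not restart at 0 for "Second", the installed version behaves differently from the one the answer was tested on.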

