Pandas dataframe reset index count of multiindex
I constructed a dataframe by concatenating several dataframes, using the keys [a, b, c] as Index:
+-------+----------+----------+
| Index | IndexPos | SomeData |
+-------+----------+----------+
| a | 1 | some1 |
| | 2 | some2 |
| | 3 | some3 |
| b | 1 | some1 |
| | 2 | some2 |
| | 3 | some3 |
| c | 1 | some1 |
| | 2 | some2 |
| | 3 | some3 |
+-------+----------+----------+
I now want to slice it down to the last 2 elements of each group, like this:
df.groupby(df.index.levels[0].name).tail(2)
After that, I want to renumber IndexPos for the remaining elements to get this:
+-------+----------+----------+
| Index | IndexPos | SomeData |
+-------+----------+----------+
| a | 1 | some2 |
| | 2 | some3 |
| b | 1 | some2 |
| | 2 | some3 |
| c | 1 | some2 |
| | 2 | some3 |
+-------+----------+----------+
Is there a way to do this, or do I have to slice it before concatenating?
First groupby on level=0 and take the last two rows of each group with tail. Then apply groupby + cumcount to the sliced dataframe to create a sequential counter within each group, and set that counter as the new index at level=1:
d = df.groupby(level=0).tail(2)
d = d.droplevel(1).set_index(d.groupby(level=0).cumcount().add(1), append=True)
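For a runnable end-to-end check, here is a minimal sketch of the approach above. The sample frame is rebuilt from the table in the question; the helper make_part and the variable counter are illustrative names, not part of the original answer. As a small addition, the counter is renamed to IndexPos so the new level keeps its name.

```python
import pandas as pd


def make_part():
    # Hypothetical stand-in for one of the frames being concatenated.
    return pd.DataFrame(
        {"SomeData": ["some1", "some2", "some3"]},
        index=pd.Index([1, 2, 3], name="IndexPos"),
    )


# Rebuild the MultiIndex frame from the question.
df = pd.concat(
    [make_part() for _ in range(3)],
    keys=["a", "b", "c"],
    names=["Index", "IndexPos"],
)

# Keep the last two rows of each top-level group.
d = df.groupby(level=0).tail(2)

# Number the remaining rows 1..n within each group.
counter = d.groupby(level=0).cumcount().add(1).rename("IndexPos")

# Swap the old second level for the fresh counter.
d = d.droplevel(1).set_index(counter, append=True)
print(d)
```

Passing the counter as a Series works here because it has exactly one value per row of the sliced frame, in the same row order.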
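Alternatively, use factorize in place of groupby + cumcount (inspired by @anky's solution). Because factorize numbers the distinct level=1 values across the whole index, this gives the same result only when every group keeps the same IndexPos values after slicing, as in this example: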
d = df.groupby(level=0).tail(2)
d = d.droplevel(1).set_index(d.index.get_level_values(1).factorize()[0] + 1, append=True)
Result:
print(d)
        SomeData
Index
a     1    some2
      2    some3
b     1    some2
      2    some3
c     1    some2
      2    some3
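On the second half of the question, slicing before concatenating is also possible. The following is a rough sketch, assuming the individual frames are still available separately; the frames dict and the helper last_n_renumbered are hypothetical names introduced for illustration.

```python
import pandas as pd

# Hypothetical input frames, keyed by the label each one gets in the result.
frames = {
    key: pd.DataFrame(
        {"SomeData": ["some1", "some2", "some3"]},
        index=pd.Index([1, 2, 3], name="IndexPos"),
    )
    for key in ["a", "b", "c"]
}


def last_n_renumbered(frame, n=2):
    """Keep the last n rows and renumber the index starting at 1."""
    sliced = frame.tail(n).copy()
    sliced.index = pd.RangeIndex(1, len(sliced) + 1, name="IndexPos")
    return sliced


# Concatenate the already-sliced frames; the dict keys become level 0.
result = pd.concat(
    {key: last_n_renumbered(frame) for key, frame in frames.items()},
    names=["Index", "IndexPos"],
)
print(result)
```

This avoids the second groupby pass, but it only applies if the slicing can happen before the frames are combined.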