Access last elements of inner multiindex level in pandas dataframe
In a MultiIndex pandas DataFrame, I want to access the last element of the second index level for every value of the first index level. The number of entries in the second level varies depending on the value of the first level. I went through the pandas MultiIndex documentation but could not find anything that does this.
For example, for the data frame below:
import numpy as np
import pandas as pd

# Two-level index: 'first' is the outer level, 'second' is the inner level
arrays = [['bar', 'bar', 'baz', 'foo', 'foo', 'foo', 'qux'],
          ['one', 'two', 'one', 'one', 'two', 'three', 'one']]
tuples = list(zip(*arrays))
index = pd.MultiIndex.from_tuples(tuples, names=['first', 'second'])
df = pd.DataFrame(np.random.randn(7, 3), index=index, columns=['A', 'B', 'C'])
df
                     A         B         C
first second
bar   one     0.289163 -0.464633 -0.060487
      two     0.224442  0.177609  2.156436
baz   one    -0.262329 -0.248384  0.925580
foo   one     0.051350  0.452014  0.206809
      two     2.757255 -0.739196  0.183735
      three  -0.064909 -0.963130  1.364771
qux   one    -1.330857  1.881588 -0.262170
I want to get:
                     A         B         C
first second
bar   two     0.224442  0.177609  2.156436
baz   one    -0.262329 -0.248384  0.925580
foo   three  -0.064909 -0.963130  1.364771
qux   one    -1.330857  1.881588 -0.262170
The DataFrames I am working with have over 10M rows, so I want to avoid explicit looping.
Use groupby with tail:
print(df.groupby(level='first').tail(1))
                     A         B         C
first second
bar   two     0.053054 -0.555819  0.589998
baz   one    -0.868676  1.293633  1.339474
foo   three   0.407454  0.738872  1.811894
qux   one    -0.346014 -1.491270  0.446772
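The groupby/tail approach above can be checked with a small self-contained sketch. The integer values from np.arange are an assumption made here in place of the random data, so that the selected rows are easy to verify by eye; the index is the one from the question.

```python
import numpy as np
import pandas as pd

# Same two-level index as in the question
arrays = [
    ['bar', 'bar', 'baz', 'foo', 'foo', 'foo', 'qux'],
    ['one', 'two', 'one', 'one', 'two', 'three', 'one'],
]
index = pd.MultiIndex.from_arrays(arrays, names=['first', 'second'])

# Deterministic values (an assumption for illustration, replacing np.random.randn)
df = pd.DataFrame(np.arange(21).reshape(7, 3), index=index, columns=['A', 'B', 'C'])

# tail(1) keeps the last row of each 'first' group, with both index levels intact
result = df.groupby(level='first').tail(1)
print(result)
print(list(result.index))
```

Note that tail(1) picks the last row by position within each group, not the last label in sorted order, so the frame should already be in the order you care about.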
Use tail rather than last, because last drops the second level from the index:
print(df.groupby(level='first').last())
              A         B         C
first
bar    0.053054 -0.555819  0.589998
baz   -0.868676  1.293633  1.339474
foo    0.407454  0.738872  1.811894
qux   -0.346014 -1.491270  0.446772
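As a possible alternative that also keeps both index levels, the last row per first-level value can be selected with a boolean mask built from Index.duplicated. This is only a sketch and not part of the original answer; the deterministic np.arange data is an assumption for illustration, and whether the mask is faster than groupby on a 10M-row frame would need to be timed on the real data.

```python
import numpy as np
import pandas as pd

arrays = [
    ['bar', 'bar', 'baz', 'foo', 'foo', 'foo', 'qux'],
    ['one', 'two', 'one', 'one', 'two', 'three', 'one'],
]
index = pd.MultiIndex.from_arrays(arrays, names=['first', 'second'])
# Deterministic values (an assumption for illustration)
df = pd.DataFrame(np.arange(21).reshape(7, 3), index=index, columns=['A', 'B', 'C'])

# Mark every row except the last occurrence of each 'first' value as a duplicate,
# then invert the mask to keep only those last occurrences
first = df.index.get_level_values('first')
mask = ~first.duplicated(keep='last')
by_mask = df[mask]

# Compare with the groupby/tail result
by_tail = df.groupby(level='first').tail(1)
print(by_mask.equals(by_tail))
```

Both selections keep the full MultiIndex, unlike last, which returns a frame indexed by the first level only.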