Select data at a particular level from a MultiIndex
I have the following pandas DataFrame with a MultiIndex (Z, A):
               H1       H2
   Z    A
0  100  200   0.3112  -0.4197
1  100  201   0.2967   0.4893
2  100  202   0.3084  -0.4873
3  100  203   0.3069      NaN
4  101  203  -0.4956      NaN
Question: How can I select all items with A=203? I tried
df[:,'A']
but it doesn't work. Then I found this in the online documentation, so I tried:
df.xs(203,level='A')
but I get:
TypeError: xs() got an unexpected keyword argument 'level'
Also, I don't see this parameter in the installed documentation (df.xs?):
Parameters
----------
key : object
    Some label contained in the index, or partially in a MultiIndex
axis : int, default 0
    Axis to retrieve cross-section on
copy : boolean, default True
    Whether to make a copy of the data
Note: I have the development version.
Edit: I found this thread. They recommend something like:
df.select(lambda x: x[1]==200, axis=0)
I would still like to know what happened to the level parameter of df.xs, or what the recommended way is in the current version.
The problem was my (incorrect) assumption that I was running the development version, when in reality I had 1.6.1. You can check the currently installed version with:
import pandas
print(pandas.__version__)
In the current version, df.xs() with the level parameter works fine.
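For reference, here is a minimal sketch of the df.xs() call on a frame shaped like the one in the question. The frame is a reconstruction from the printed table, so the exact construction (index built with MultiIndex.from_tuples, names Z and A) is an assumption. The boolean mask at the end is an alternative to the df.select() suggestion; as far as I know, select is no longer available in recent pandas releases.

```python
import numpy as np
import pandas as pd

# Hypothetical reconstruction of the frame from the question:
# a two-level index (Z, A) and two float columns H1, H2.
index = pd.MultiIndex.from_tuples(
    [(100, 200), (100, 201), (100, 202), (100, 203), (101, 203)],
    names=["Z", "A"],
)
df = pd.DataFrame(
    {
        "H1": [0.3112, 0.2967, 0.3084, 0.3069, -0.4956],
        "H2": [-0.4197, 0.4893, -0.4873, np.nan, np.nan],
    },
    index=index,
)

# Cross-section on level "A"; by default the selected level is dropped,
# so the result is indexed by Z only.
by_xs = df.xs(203, level="A")

# Same selection, but keeping the "A" level in the result.
by_xs_keep = df.xs(203, level="A", drop_level=False)

# Alternative: boolean mask built from the values of level "A".
by_mask = df[df.index.get_level_values("A") == 203]

print(by_xs)
print(by_mask)
```

With drop_level=False the result keeps the full (Z, A) index, which makes it directly comparable to the boolean-mask selection.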
Not a direct answer to the question, but if you want to select more than one value you can use the slice() notation:
import numpy
from pandas import MultiIndex, Series
arrays = [['bar', 'bar', 'baz', 'baz', 'foo', 'foo', 'qux', 'qux'],
['one', 'two', 'one', 'two', 'one', 'two', 'one', 'two']]
tuples = list(zip(*arrays))
index = MultiIndex.from_tuples(tuples, names=['first', 'second'])
s = Series(numpy.random.randn(8), index=index)
In [10]: s
Out[10]:
first  second
bar    one       0.181621
       two       1.016225
baz    one       0.716589
       two      -0.353731
foo    one      -0.326301
       two       1.009143
qux    one       0.098225
       two      -1.087523
dtype: float64
In [11]: s.loc[slice(None)]
Out[11]:
first  second
bar    one       0.181621
       two       1.016225
baz    one       0.716589
       two      -0.353731
foo    one      -0.326301
       two       1.009143
qux    one       0.098225
       two      -1.087523
dtype: float64
In [12]: s.loc[slice(None), "one"]
Out[12]:
first
bar 0.181621
baz 0.716589
foo -0.326301
qux 0.098225
dtype: float64
In [13]: s.loc["bar", slice(None)]
Out[13]:
first  second
bar    one       0.181621
       two       1.016225
dtype: float64
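The same slice() idea carries over to a DataFrame like the one in the question. The sketch below is only an illustration: the frame is rebuilt from the printed table (an assumption), and the index is sorted first because slicing a MultiIndex generally expects a sorted index. pd.IndexSlice is shown as a shorthand for writing slice(None) by hand.

```python
import numpy as np
import pandas as pd

# Hypothetical reconstruction of the question's frame (index names Z and A).
index = pd.MultiIndex.from_tuples(
    [(100, 200), (100, 201), (100, 202), (100, 203), (101, 203)],
    names=["Z", "A"],
)
df = pd.DataFrame(
    {
        "H1": [0.3112, 0.2967, 0.3084, 0.3069, -0.4956],
        "H2": [-0.4197, 0.4893, -0.4873, np.nan, np.nan],
    },
    index=index,
).sort_index()  # slicing a MultiIndex works best on a sorted index

# slice(None) means "every value of Z"; 203 picks the value on level A.
sel = df.loc[(slice(None), 203), :]

# A list on the second level selects several values of A at once.
several = df.loc[(slice(None), [201, 203]), :]

# pd.IndexSlice lets you write the same selection with ":" syntax.
idx = pd.IndexSlice
same = df.loc[idx[:, 203], :]

print(sel)
print(several)
```

Unlike df.xs(203, level='A') with its default settings, these .loc selections keep both index levels in the result.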