What is an efficient way to slice a Pandas Series with a MultiIndex on multiple dimensions?

Question

I'm lost in a sea of ix, xs, MultiIndex, get_level_values and other Pandas machinery.

I have a Series with a three-level MultiIndex. What is an efficient way to slice my Series based on values at the different levels?

My Series looks like this:

days  id                      start_date
0     S0036-4665(00)04200108  2013-05-18      1
3     S0036-4665(00)04200108  2013-05-18      1
5     S0036-4665(00)04200108  2013-05-18      3
13    S0036-4665(00)04200108  2013-05-18      1
19    S0036-4665(00)04200108  2013-05-18      1
39    S0036-4665(00)04200108  2013-05-18      1
...

Obviously, the values of id and start_date vary further down the Series.

I'd like to be able to slice based on the following:
- days within a numeric range
- id within a specific set
- start_date within a specific date range

So far, I have found this solution, which suggests using df[df.index.get_level_values('a').isin([5, 7, 10, 13])], and I figured out that I can do:

s.select(lambda x: x[0] < 20 and (x[1] in {'some id', 'other id'}))

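Is either of these the best solution? I feel I should be able to do something with xs or ix, but the former seems to only let you filter by a specific value, and the latter only indexes by position in the Series?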

Answer 1

Here is an example. This requires current master and will be available in 0.14. The docs are here: http://pandas-docs.github.io/pandas-docs-travis/indexing.html#multiindexing-using-slicers

Create a MultiIndex (this happens to be a Cartesian product of the inputs, but that is not required)

# assumes: import numpy as np; import pandas as pd
In [28]: s = pd.Series(np.arange(27),
               index=pd.MultiIndex.from_product(
                     [[1,2,3],
                      ['foo','bar','bah'],
                      pd.date_range('20130101',periods=3)])
                    ).sort_index()  # sortlevel() in older pandas

Always make sure that the index is fully sorted

In [29]: s.index.lexsort_depth
Out[29]: 3
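As a minimal sketch of why the sort matters, the snippet below builds a deliberately unsorted two-level index and checks it before and after sorting. The data and the names `unsorted` and `sorted_s` are made up for illustration, and `is_monotonic_increasing` is used here as an alternative sortedness check to `lexsort_depth`.

```python
import numpy as np
import pandas as pd

# Hypothetical data: level 0 starts at 3, so the index is not sorted.
unsorted = pd.Series(
    np.arange(6),
    index=pd.MultiIndex.from_product([[3, 1, 2], ["foo", "bar"]]),
)
print(unsorted.index.is_monotonic_increasing)  # False

# sort_index() sorts lexicographically across all levels.
sorted_s = unsorted.sort_index()
print(sorted_s.index.is_monotonic_increasing)  # True
```

Slicing a MultiIndex by label ranges generally relies on this sorted order, so it is worth checking once right after building the Series.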

In [30]: s
Out[30]: 
1  bah  2013-01-01     6
        2013-01-02     7
        2013-01-03     8
   bar  2013-01-01     3
        2013-01-02     4
        2013-01-03     5
   foo  2013-01-01     0
        2013-01-02     1
        2013-01-03     2
2  bah  2013-01-01    15
        2013-01-02    16
        2013-01-03    17
   bar  2013-01-01    12
        2013-01-02    13
        2013-01-03    14
   foo  2013-01-01     9
        2013-01-02    10
        2013-01-03    11
3  bah  2013-01-01    24
        2013-01-02    25
        2013-01-03    26
   bar  2013-01-01    21
        2013-01-02    22
        2013-01-03    23
   foo  2013-01-01    18
        2013-01-02    19
        2013-01-03    20
dtype: int64

This is helpful to define in order to reduce verbiage (it groups the level selectors together for a single axis)

In [33]: idx = pd.IndexSlice

Select the rows where level 0 is 2 and level 1 is bar or foo

In [31]: s.loc[idx[[2],['bar','foo']]]
Out[31]: 
2  bar  2013-01-01    12
        2013-01-02    13
        2013-01-03    14
   foo  2013-01-01     9
        2013-01-02    10
        2013-01-03    11
dtype: int64

Similar to the above, but with level 0 in [2, 3] and level 2 equal to 20130102

In [32]: s.loc[idx[[2,3],['bar','foo'],'20130102']]
Out[32]: 
2  bar  2013-01-02    13
   foo  2013-01-02    10
3  bar  2013-01-02    22
   foo  2013-01-02    19
dtype: int64

Here is an example using a boolean indexer instead of a level indexer.

In [43]: s.loc[idx[[2,3],['bar','foo'],s<20]]
Out[43]: 
2  bar  2013-01-01    12
        2013-01-02    13
        2013-01-03    14
   foo  2013-01-01     9
        2013-01-02    10
        2013-01-03    11
3  foo  2013-01-01    18
        2013-01-02    19
dtype: int64
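The boolean-indexer idea can also cover the three conditions in the question (days in a numeric range, id in a set, start_date in a date range) by building one mask from the level values. This is only a sketch: the ids, dates and thresholds below are invented, and it combines `get_level_values` masks rather than the slicer syntax.

```python
import pandas as pd

# Hypothetical data shaped like the Series in the question.
index = pd.MultiIndex.from_tuples(
    [
        (0, "id_a", pd.Timestamp("2013-05-18")),
        (3, "id_a", pd.Timestamp("2013-05-18")),
        (5, "id_b", pd.Timestamp("2013-06-01")),
        (13, "id_c", pd.Timestamp("2013-06-10")),
        (25, "id_a", pd.Timestamp("2013-07-01")),
    ],
    names=["days", "id", "start_date"],
)
s = pd.Series([1, 1, 3, 1, 1], index=index).sort_index()

# Pull each level out once, then combine element-wise conditions.
days = s.index.get_level_values("days")
ids = s.index.get_level_values("id")
start = s.index.get_level_values("start_date")

mask = (
    (days < 20)                                # numeric range on days
    & ids.isin(["id_a", "id_b"])               # membership in a set of ids
    & (start >= pd.Timestamp("2013-05-01"))    # date range on start_date
    & (start <= pd.Timestamp("2013-06-05"))
)
result = s[mask]
print(result)
```

The mask is computed on whole levels at once, so it avoids calling a Python function per index entry as the lambda-based approach does.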

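Here is an example that omits some levels (note that idx is not used here, since the two forms are essentially equivalent for a Series; idx is more useful when indexing a DataFrame)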

In [47]: s.loc[:,['bar','foo'],'20130102']
Out[47]: 
1  bar  2013-01-02     4
   foo  2013-01-02     1
2  bar  2013-01-02    13
   foo  2013-01-02    10
3  bar  2013-01-02    22
   foo  2013-01-02    19
dtype: int64
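To illustrate the DataFrame remark, here is a sketch in which idx groups the row-level selectors so that the column selector can follow as a separate argument to .loc. The frame `df` and its columns "a" and "b" are hypothetical; they reuse the same index as the Series above.

```python
import numpy as np
import pandas as pd

idx = pd.IndexSlice

index = pd.MultiIndex.from_product(
    [[1, 2, 3], ["foo", "bar", "bah"], pd.date_range("20130101", periods=3)]
)
# Hypothetical two-column frame on the same index as the Series example.
df = pd.DataFrame(
    {"a": np.arange(27), "b": np.arange(27) * 10}, index=index
).sort_index()

# Rows: any level 0, level 1 in {bar, foo}, level 2 equal to 2013-01-02.
# Columns: only "a".
subset = df.loc[idx[:, ["bar", "foo"], pd.Timestamp("2013-01-02")], "a"]
print(subset)
```

Without idx, the row selector would have to be written out as a tuple of slice(None) objects and lists, which is harder to read.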
