
Pandas MultiIndex: How to remove an entire level that has zero positive values in a specific column?

I have this Pandas MultiIndex:

[image of the DataFrame]

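Is there a simple way to remove any first-level group (SYMBOL) that has no positive values in the INFORMATION_SURPLUS_PCT column? In the example image, this would remove the AAPL level entirely.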

Thanks

I changed the DataFrame for better testing:

print(df)
           INFORMATION_SURPLUS_DIFF  INFORMATION_SURPLUS_PCT
SYMBOL                                                      
AAL    0                   0.000000                 0.000000
       1                  -0.010875                 0.000000
       2                  -0.003659                 0.000000
       3                   0.007364                 0.000000
       4                  -0.018224                 0.000000
       5                   0.015290                 0.000000
       6                   0.067060                27.360990
       7                   0.028754                11.732043
       8                   0.021312                 0.000000
       9                   0.083284                33.980826
       10                  0.073214                29.872141
AAPL   0                   0.000000                 0.000000
       1                  -0.032254                 0.000000
       2                  -0.050695                 0.000000
       3                  -0.009713                 0.000000
       4                  -0.000673                 0.000000
       5                  -0.021018                 0.000000
AAPL1  6                  -0.061908                 0.000000
       7                  -0.029942                -1.000000
       8                  -0.074356                -1.000000
       9                  -0.154641                 0.000000
       10                 -0.137246                 0.000000
ADBE   0                   0.000000                 2.000000
       1                   0.000000                 0.000000
       2                   0.000000                 0.000000
# unique SYMBOL labels that have at least one positive INFORMATION_SURPLUS_PCT value
idx = df[~(df['INFORMATION_SURPLUS_PCT'] <= 0).values].index.get_level_values('SYMBOL').unique()
print(idx)
['AAL' 'ADBE']

# keep every row that belongs to one of those symbols
print(df.loc[(idx, slice(None)), :])
           INFORMATION_SURPLUS_DIFF  INFORMATION_SURPLUS_PCT
SYMBOL                                                      
AAL    0                   0.000000                 0.000000
       1                  -0.010875                 0.000000
       2                  -0.003659                 0.000000
       3                   0.007364                 0.000000
       4                  -0.018224                 0.000000
       5                   0.015290                 0.000000
       6                   0.067060                27.360990
       7                   0.028754                11.732043
       8                   0.021312                 0.000000
       9                   0.083284                33.980826
       10                  0.073214                29.872141
ADBE   0                   0.000000                 2.000000
       1                   0.000000                 0.000000
       2                   0.000000                 0.000000

Explanation:

# invert (~) the condition (<= 0) on column INFORMATION_SURPLUS_PCT
print(~(df['INFORMATION_SURPLUS_PCT'] <= 0))
SYMBOL    
AAL     0     False
        1     False
        2     False
        3     False
        4     False
        5     False
        6      True
        7      True
        8     False
        9      True
        10     True
AAPL    0     False
        1     False
        2     False
        3     False
        4     False
        5     False
AAPL1   6     False
        7     False
        8     False
        9     False
        10    False
ADBE    0      True
        1     False
        2     False
Name: INFORMATION_SURPLUS_PCT, dtype: bool
# select the rows that have a positive value in column INFORMATION_SURPLUS_PCT
print(df[~(df['INFORMATION_SURPLUS_PCT'] <= 0).values])
           INFORMATION_SURPLUS_DIFF  INFORMATION_SURPLUS_PCT
SYMBOL                                                      
AAL    6                   0.067060                27.360990
       7                   0.028754                11.732043
       9                   0.083284                33.980826
       10                  0.073214                29.872141
ADBE   0                   0.000000                 2.000000
# get the SYMBOL level value of each of those rows
print(df[~(df['INFORMATION_SURPLUS_PCT'] <= 0).values].index.get_level_values('SYMBOL'))
Index([u'AAL', u'AAL', u'AAL', u'AAL', u'ADBE'], dtype='object', name=u'SYMBOL')

# get the unique values of that index level
idx = df[~(df['INFORMATION_SURPLUS_PCT'] <= 0).values].index.get_level_values('SYMBOL').unique()
print(idx)
['AAL' 'ADBE']

# select all rows whose SYMBOL is in the unique values
print(df.loc[(idx, slice(None)), :])
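
The steps above can also be collapsed into a single group-wise test. The following is a sketch of an alternative technique, not the method used in the answer: it groups a boolean Series by the SYMBOL level and broadcasts "any positive value" back to every row. The sample data and the names `has_positive` and `result` are hypothetical. Note that `> 0` treats NaN as not positive, whereas the inverted `<= 0` test above lets NaN through.

```python
import pandas as pd

# Hypothetical, shortened sample modeled on the frame in the answer
index = pd.MultiIndex.from_tuples(
    [("AAL", 0), ("AAL", 1), ("AAL", 2),
     ("AAPL", 0), ("AAPL", 1),
     ("ADBE", 0), ("ADBE", 1)],
    names=["SYMBOL", None],
)
df = pd.DataFrame(
    {
        "INFORMATION_SURPLUS_DIFF": [0.0, -0.010875, 0.067060,
                                     0.0, -0.032254, 0.0, 0.0],
        "INFORMATION_SURPLUS_PCT": [0.0, 0.0, 27.360990,
                                    0.0, -1.0, 2.0, 0.0],
    },
    index=index,
)

# For each row: does its SYMBOL group contain at least one positive value?
has_positive = (
    (df["INFORMATION_SURPLUS_PCT"] > 0)
    .groupby(level="SYMBOL")
    .transform("any")
)

# Boolean indexing keeps whole groups, since the flag is constant per SYMBOL
result = df[has_positive]
print(result)
```

This avoids building the intermediate `idx` and the tuple-based `.loc` lookup, at the cost of a groupby pass over the column.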
