
Append to a level in a MultiIndex pandas DataFrame

The structure of my MultiIndex DataFrame looks like this:

                                  close       high        low       open  
   index = (timestamp,key)                                  
(2018-09-10 16:00:00, ask)       1.16023    1.16064    1.16007    1.16046
(2018-09-10 16:00:00, bid)       1.16009    1.16053    1.15992    1.16033
(2018-09-10 16:00:00, volume)  817.00000  817.00000  817.00000  817.00000

For each timestamp there are observations for bid, ask and volume.

I am trying to add a "mid" observation to the second level of the index (i.e. [bid, ask, volume]) by calculating the corresponding (bid + ask)/2.

My desired DataFrame should then look like this:

                                  close       high        low       open  
   index = (timestamp,key)                                  
(2018-09-10 16:00:00, ask)       1.16023    1.16064    1.16007    1.16046
(2018-09-10 16:00:00, bid)       1.16009    1.16053    1.15992    1.16033
(2018-09-10 16:00:00, volume)  817.00000  817.00000  817.00000  817.00000
(2018-09-10 16:00:00, mid)       1.16016    1.16059    1.15999    1.16040

What's the most efficient way to do this? Can this be done in place?

EDIT:

Printing the head of the DataFrame to show its structure more clearly:

bid_ask.head(5).to_dict()
Out[3]: 
{'close': {(Timestamp('2018-09-10 16:00:00'), 'ask'): 1.1602300000000001,
  (Timestamp('2018-09-10 16:00:00'), 'bid'): 1.1600900000000001,
  (Timestamp('2018-09-10 16:00:00'), 'volume'): 817.0,
  (Timestamp('2018-09-10 17:00:00'), 'ask'): 1.15977,
  (Timestamp('2018-09-10 17:00:00'), 'bid'): 1.15968},
 'high': {(Timestamp('2018-09-10 16:00:00'), 'ask'): 1.1606399999999999,
  (Timestamp('2018-09-10 16:00:00'), 'bid'): 1.1605300000000001,
  (Timestamp('2018-09-10 16:00:00'), 'volume'): 817.0,
  (Timestamp('2018-09-10 17:00:00'), 'ask'): 1.16039,
  (Timestamp('2018-09-10 17:00:00'), 'bid'): 1.16029},
 'low': {(Timestamp('2018-09-10 16:00:00'), 'ask'): 1.1600699999999999,
  (Timestamp('2018-09-10 16:00:00'), 'bid'): 1.1599200000000001,
  (Timestamp('2018-09-10 16:00:00'), 'volume'): 817.0,
  (Timestamp('2018-09-10 17:00:00'), 'ask'): 1.1596200000000001,
  (Timestamp('2018-09-10 17:00:00'), 'bid'): 1.1595299999999999},
 'open': {(Timestamp('2018-09-10 16:00:00'), 'ask'): 1.16046,
  (Timestamp('2018-09-10 16:00:00'), 'bid'): 1.1603300000000001,
  (Timestamp('2018-09-10 16:00:00'), 'volume'): 817.0,
  (Timestamp('2018-09-10 17:00:00'), 'ask'): 1.1601900000000001,
  (Timestamp('2018-09-10 17:00:00'), 'bid'): 1.1600999999999999}}

I am not entirely sure how your DataFrame is structured, but this is the essence:

df.loc[('2018-09-10 16:00:00', 'mid'), :] = [1.16016, 1.16059, 1.15999 , 1.1604]

All you need to do is use df.loc and supply a new tuple for the MultiIndex.

I assumed that your new MultiIndex entry is ('2018-09-10 16:00:00', 'mid').
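
As a minimal sketch of the same idea, the new row can be computed from the existing bid and ask rows instead of being typed in by hand. The frame below is rebuilt from the values shown in the question. The names `ts`, `df` and `mid` are only illustrative, and a `pd.Timestamp` key is used on the assumption that the first index level holds timestamps rather than strings.

```python
import pandas as pd

# Rebuild the single-timestamp frame from the question (illustrative data).
ts = pd.Timestamp("2018-09-10 16:00:00")
index = pd.MultiIndex.from_tuples(
    [(ts, "ask"), (ts, "bid"), (ts, "volume")], names=["timestamp", "key"]
)
df = pd.DataFrame(
    {
        "close": [1.16023, 1.16009, 817.0],
        "high": [1.16064, 1.16053, 817.0],
        "low": [1.16007, 1.15992, 817.0],
        "open": [1.16046, 1.16033, 817.0],
    },
    index=index,
)

# Average the bid and ask rows column by column.
mid = (df.loc[(ts, "bid")] + df.loc[(ts, "ask")]) / 2

# Assigning to a key that does not exist yet appends a new row.
df.loc[(ts, "mid"), :] = mid.to_numpy()
```

The new row is appended at the end, so the index is no longer sorted; calling `df.sort_index()` afterwards restores the order if that matters.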

Example

In [353]: ref

Out[353]:
       Names  Values
  idx2
1 one      A       5
2 two      B      10

In [354]: ref.loc[(3, 'three'), :] = ['C', 15]

In [355]: ref
Out[355]:
        Names  Values
  idx2
1 one       A     5.0
2 two       B    10.0
3 three     C    15.0
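
Adding rows one at a time with df.loc can be slow when there are many timestamps. One possible alternative, sketched below under the assumption that the index levels are named `timestamp` and `key` (the question does not show level names), is to compute all mid rows at once with `xs` and append them with `pd.concat`. Only the bid and ask rows and two of the columns from the question are reproduced here to keep the example short. Note that this builds a new DataFrame rather than modifying `bid_ask` in place.

```python
import pandas as pd

# Small frame using bid/ask values from the question (volume rows omitted).
timestamps = pd.to_datetime(["2018-09-10 16:00:00", "2018-09-10 17:00:00"])
index = pd.MultiIndex.from_product(
    [timestamps, ["ask", "bid"]], names=["timestamp", "key"]
)
bid_ask = pd.DataFrame(
    {
        "close": [1.16023, 1.16009, 1.15977, 1.15968],
        "high": [1.16064, 1.16053, 1.16039, 1.16029],
    },
    index=index,
)

# Select the bid and ask rows; xs drops the "key" level, so both
# results are indexed by timestamp only and align on addition.
bid = bid_ask.xs("bid", level="key")
ask = bid_ask.xs("ask", level="key")
mid = (bid + ask) / 2

# Put the "key" level back with the label "mid", then append and sort.
mid.index = pd.MultiIndex.from_product(
    [mid.index, ["mid"]], names=["timestamp", "key"]
)
result = pd.concat([bid_ask, mid]).sort_index()
```

After sorting, each timestamp in `result` carries its ask, bid and mid rows together.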

