
Conflicting DatetimeIndex in pandas DataFrame

I have a dataframe with a naive datetime index:

olhcv.index

DatetimeIndex(['1989-01-31', '1989-02-01', '1989-02-02', '1989-02-03',
           '1989-02-06', '1989-02-07', '1989-02-08', '1989-02-09',
           '1989-02-10', '1989-02-13',
           ...
           '2019-03-01', '2019-03-04', '2019-03-05', '2019-03-06',
           '2019-03-07', '2019-03-08', '2019-03-11', '2019-03-12',
           '2019-03-13', '2019-03-14'],
          dtype='datetime64[ns]', length=7606, freq=None) 

I have to remove non-working days based on another Index from another package:

import pandas_market_calendars as mcal

nyse = mcal.get_calendar('NYSE')
date = nyse.valid_days(start_date=min(olhcv.index), end_date=max(olhcv.index))
date
DatetimeIndex(['1989-01-31 00:00:00+00:00', '1989-02-01 00:00:00+00:00',
           '1989-02-02 00:00:00+00:00', '1989-02-03 00:00:00+00:00',
           '1989-02-06 00:00:00+00:00', '1989-02-07 00:00:00+00:00',
           '1989-02-08 00:00:00+00:00', '1989-02-09 00:00:00+00:00',
           '1989-02-10 00:00:00+00:00', '1989-02-13 00:00:00+00:00',
           ...
           '2019-03-01 00:00:00+00:00', '2019-03-04 00:00:00+00:00',
           '2019-03-05 00:00:00+00:00', '2019-03-06 00:00:00+00:00',
           '2019-03-07 00:00:00+00:00', '2019-03-08 00:00:00+00:00',
           '2019-03-11 00:00:00+00:00', '2019-03-12 00:00:00+00:00',
           '2019-03-13 00:00:00+00:00', '2019-03-14 00:00:00+00:00'],
          dtype='datetime64[ns, UTC]', length=7589, freq='C')

However, when I try to slice the first dataframe with the new index, I get a KeyError:

olhcv2 = olhcv.loc[date]

Traceback (most recent call last):

File "<ipython-input-139-8a6e732943bb>", line 1, in <module>
olhcv2 = olhcv.loc[date]

File "/Users/luca/miniconda3/lib/python3.7/site-packages/pandas/core/indexing.py", line 1500, in __getitem__
return self._getitem_axis(maybe_callable, axis=axis)

File "/Users/luca/miniconda3/lib/python3.7/site-packages/pandas/core/indexing.py", line 1902, in _getitem_axis
return self._getitem_iterable(key, axis=axis)

File "/Users/luca/miniconda3/lib/python3.7/site-packages/pandas/core/indexing.py", line 1205, in _getitem_iterable
raise_missing=False)

File "/Users/luca/miniconda3/lib/python3.7/site-packages/pandas/core/indexing.py", line 1161, in _get_listlike_indexer
raise_missing=raise_missing)

File "/Users/luca/miniconda3/lib/python3.7/site-packages/pandas/core/indexing.py", line 1246, in _validate_read_indexer
key=key, axis=self.obj._get_axis_name(axis)))

KeyError: "None of [DatetimeIndex(['1989-01-31 00:00:00+00:00', '1989-02-01 00:00:00+00:00',\n               '1989-02-02 00:00:00+00:00', '1989-02-03 00:00:00+00:00',\n               '1989-02-06 00:00:00+00:00', '1989-02-07 00:00:00+00:00',\n               '1989-02-08 00:00:00+00:00', '1989-02-09 00:00:00+00:00',\n               '1989-02-10 00:00:00+00:00', '1989-02-13 00:00:00+00:00',\n               ...\n               '2019-03-01 00:00:00+00:00', '2019-03-04 00:00:00+00:00',\n               '2019-03-05 00:00:00+00:00', '2019-03-06 00:00:00+00:00',\n               '2019-03-07 00:00:00+00:00', '2019-03-08 00:00:00+00:00',\n               '2019-03-11 00:00:00+00:00', '2019-03-12 00:00:00+00:00',\n               '2019-03-13 00:00:00+00:00', '2019-03-14 00:00:00+00:00'],\n              dtype='datetime64[ns, UTC]', length=7589, freq='C')] are in the [index]"

I believe the two indexes differ in some way (timezone, etc.). How can I deal with that?

Thank you

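The index returned by valid_days is timezone-aware (UTC), while olhcv.index is timezone-naive, so none of the labels match. Use DatetimeIndex.tz_convert with None to drop the timezone: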

olhcv2 = olhcv.loc[date.tz_convert(None)]
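
For illustration, here is a minimal, self-contained sketch of the same fix. It does not use pandas_market_calendars: the small `olhcv` frame and the `valid` index below are hypothetical stand-ins for the asker's data and for the result of `nyse.valid_days(...)`, built with `pd.date_range` so the example runs on its own.

```python
import pandas as pd

# Hypothetical stand-in for olhcv: a tz-naive daily index that includes a weekend.
naive_idx = pd.date_range("2019-03-08", "2019-03-14", freq="D")
olhcv = pd.DataFrame({"close": range(len(naive_idx))}, index=naive_idx)

# Hypothetical stand-in for nyse.valid_days(...): tz-aware (UTC) business days.
valid = pd.date_range("2019-03-08", "2019-03-14", freq="B", tz="UTC")

# The two indexes differ only in timezone awareness.
print(olhcv.index.tz)  # None (tz-naive)
print(valid.tz)        # UTC  (tz-aware)

# tz_convert(None) converts to UTC and drops the timezone,
# leaving a tz-naive index that can be matched against olhcv.index.
valid_naive = valid.tz_convert(None)

# Label-based selection now finds the rows.
olhcv2 = olhcv.loc[valid_naive]
print(olhcv2)
```

Note that `.loc` with a list-like of labels expects the labels to be present in the index. If the calendar might contain days that are missing from the data, a boolean mask such as `olhcv[olhcv.index.isin(valid_naive)]` is one way to keep only the rows that exist in both.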
