xarray - select/index a DataArray using the time labels of another DataArray

I have two DataArray objects, called "A" and "B".

Besides latitude and longitude, both of them have a time dimension holding daily data. A's time coordinate has fewer entries than B's.

A's time dimension:

<xarray.DataArray 'time' (time: 1422)>
array(['2015-03-30T00:00:00.000000000', '2015-06-14T00:00:00.000000000',
       '2015-06-16T00:00:00.000000000', ..., '2019-08-31T00:00:00.000000000',
       '2019-09-01T00:00:00.000000000', '2019-09-02T00:00:00.000000000'],
      dtype='datetime64[ns]')
Coordinates:
  * time     (time) datetime64[ns] 2015-03-30 2015-06-14 ... 2019-09-02

B's time dimension:

<xarray.DataArray 'time' (time: 16802)>
array(['1972-01-01T00:00:00.000000000', '1972-01-02T00:00:00.000000000',
       '1972-01-03T00:00:00.000000000', ..., '2017-12-29T00:00:00.000000000',
       '2017-12-30T00:00:00.000000000', '2017-12-31T00:00:00.000000000'],
      dtype='datetime64[ns]')
Coordinates:
  * time     (time) datetime64[ns] 1972-01-01 1972-01-02 ... 2017-12-31

Obviously, A's time dimension is a subset of B's time dimension. I would like to select data from B using all the time labels from A. As the time in A is not continuous, I don't think slice is suitable, so I tried using sel.

B_sel = B.sel(time=A.time)

I received an error: KeyError: "not all values found in index 'time'"

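# keep only the time steps of A that also exist in B, then select those from B
A_new = A.where(A.time.isin(B.time), drop=True)
B_sel = B.sel(time=A_new.time)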

http://xarray.pydata.org/en/stable/user-guide/indexing.html
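
A minimal sketch of this approach on synthetic data, assuming small hypothetical stand-ins for A and B (the dates and values below are made up for illustration, not taken from the question's data). It reproduces the KeyError and then restricts A to the time steps that B also has:

```python
import numpy as np
import pandas as pd
import xarray as xr

# Hypothetical stand-ins: B is daily and ends before A does.
b_time = pd.date_range("2017-12-25", "2017-12-31", freq="D")
a_time = pd.to_datetime(["2017-12-26", "2017-12-30", "2018-01-02"])

B = xr.DataArray(np.arange(len(b_time)), coords={"time": b_time}, dims="time")
A = xr.DataArray(np.arange(len(a_time)), coords={"time": a_time}, dims="time")

# Selecting with every label of A fails, because 2018-01-02 is not in B.
try:
    B.sel(time=A.time)
    raised = False
except KeyError:
    raised = True
print("KeyError raised:", raised)

# Keep only the time steps of A that also exist in B, then select from B.
A_new = A.where(A.time.isin(B.time), drop=True)
B_sel = B.sel(time=A_new.time)
print(B_sel.time.values)
```

Note that where() fills with NaN internally, so an integer A comes back as float; that does not affect the time labels used for the selection.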

Obviously, A's time dimension is a subset of B's time dimension.

I received an error: KeyError: "not all values found in index 'time'"

The error message itself suggests that the assumption in the first statement is wrong. If you look at the time values carefully, A has values up to 2019-09-02, whereas B ends on 2017-12-31.
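
One way to confirm this is to list the labels of A that have no match in B. The sketch below uses plain NumPy arrays as hypothetical stand-ins for A.time.values and B.time.values; the dates are made up for illustration:

```python
import numpy as np

# Hypothetical stand-ins for A.time.values and B.time.values.
a_times = np.array(
    ["2017-12-30", "2017-12-31", "2018-01-01", "2019-09-02"],
    dtype="datetime64[ns]",
)
b_times = np.arange(
    "2017-12-01", "2018-01-01", dtype="datetime64[D]"
).astype("datetime64[ns]")

# Labels of A without a match in B; sel() fails because of exactly these.
missing = a_times[~np.isin(a_times, b_times)]
print(missing)

# Comparing the ends of the two coordinates is a quick first check.
a_extends_past_b = bool(a_times.max() > b_times.max())
print(a_extends_past_b)
```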

So, there are 2 ways to solve this:

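  1. If you're sure that every value in A up to the end of 2017 is also in B, then filter A's labels by year (note <= rather than <, since B runs through 2017-12-31):

    sel_dates = A.time.values[(A.time.dt.year <= 2017).values]
    B_sel = B.sel(time=sel_dates)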
  2. If you're not sure that every value in A is present in B, for example because of unexpected values somewhere, you can perform an element-wise check using np.isin(), which is one of the speed-optimised NumPy functions:

    import numpy as np

    sel_dates = A.time.values[np.isin(A.time.values, B.time.values)]
    B_sel = B.sel(time=sel_dates)

    ## example ##
    # dates1 is an array of daily dates covering one month
    dates1 = np.arange('2005-02', '2005-03', dtype='datetime64[D]')
    dates2 = np.array(['2005-02-03', '2002-02-05', '2000-01-05'], dtype='datetime64')
    # check which entries of dates2 are part of dates1
    print(np.isin(dates2, dates1))
    # prints: [ True False False]
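
The two options can be compared side by side on synthetic data. This is only a sketch: the arrays, dates, and the lat/lon sizes are hypothetical, and last_year is a helper name introduced here for illustration.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Hypothetical data: B is daily through 2017, A is irregular and runs into 2018.
b_time = pd.date_range("2017-12-01", "2017-12-31", freq="D")
a_time = pd.to_datetime(["2017-12-05", "2017-12-20", "2018-01-03"])
B = xr.DataArray(
    np.random.rand(len(b_time), 2, 3),
    coords={"time": b_time},
    dims=("time", "lat", "lon"),
)
A = xr.DataArray(
    np.random.rand(len(a_time), 2, 3),
    coords={"time": a_time},
    dims=("time", "lat", "lon"),
)

# Option 1: cut A's labels off at the last year covered by B.
last_year = int(B.time.dt.year.max())
dates_by_year = A.time.values[(A.time.dt.year <= last_year).values]

# Option 2: keep only the labels that actually occur in B.
dates_by_isin = A.time.values[np.isin(A.time.values, B.time.values)]

# With a gap-free B both options give the same labels; option 2 is the
# safer choice when B might have gaps of its own.
same_labels = np.array_equal(dates_by_year, dates_by_isin)
B_sel = B.sel(time=dates_by_isin)
print(same_labels, B_sel.shape)
```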
