
How to index xarray data by multi-dimensional physical coordinates

Here are the contents of the xarray.DataArray T2, a variable read from a netCDF file.

t2
Out[107]: 
<xarray.DataArray 'T2' (Time: 37, south_north: 87, west_east: 87)>
array([[[ 301.933167,  301.936584, ...,  301.620209,  301.607941],
[ 301.920776,  301.924011, ...,  301.599274,  301.586975],
..., 
[ 301.045288,  301.036804, ...,  300.311218,  300.303253],
[ 301.041595,  301.033081, ...,  300.309479,  300.301727]],
[[ 296.742706,  296.72821 , ...,  296.217377,  296.214142],
[ 296.763031,  296.739899, ...,  296.148071,  296.144348],
..., 
[ 296.089752,  296.044952, ...,  295.49353 ,  295.468292],
[ 296.100159,  296.062836, ...,  295.492157,  295.468506]],
..., 
[[ 300.907532,  300.923737, ...,  298.770752,  298.690582],
[ 300.775482,  300.850494, ...,  298.792206,  298.726898],
..., 
[ 300.170013,  300.139709, ...,  298.117035,  298.107849],
[ 299.788116,  299.756744, ...,  298.397705,  298.410217]],
[[ 299.074066,  299.143402, ...,  296.732635,  296.73407 ],
[ 299.060425,  299.158508, ...,  296.767517,  296.765015],
..., 
[ 298.227905,  298.278107, ...,  297.223846,  297.228607],
[ 298.114319,  298.189362, ...,  297.272247,  297.263367]]], dtype=float32)
Coordinates:
XLAT     (Time, south_north, west_east) float32 39.3806 39.3806 39.3807 ...
XLONG    (Time, south_north, west_east) float32 -86.6092 -86.5972 ...
XTIME    (Time) datetime64[ns] 2012-09-01 2012-09-01T02:00:00 ...
Dimensions without coordinates: Time, south_north, west_east
Attributes:
FieldType:    104
MemoryOrder:  XY 
description:  TEMP at 2 M
units:        K
stagger:   

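The logical coordinates are south_north and west_east, so we can select the values at a given grid location by passing integer indices to t2.sel(). Because these dimensions have no coordinate labels, this is positional indexing, the same as t2.isel().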

t2.sel(south_north=1,west_east=2)
Out[109]: 
<xarray.DataArray 'T2' (Time: 37)>
array([ 301.927094,  296.76532 ,  295.752228,  295.106781,  294.282013,
294.570129,  294.170654,  297.319458,  300.523773,  301.585907,
301.843323,  301.846832,  299.142914,  297.261993,  296.037292,
296.437103,  295.210114,  294.92511 ,  295.933716,  296.18924 ,
297.529388,  298.79248 ,  299.271606,  298.389435,  296.373444,
294.850067,  294.345612,  294.64975 ,  294.914612,  295.015869,
294.738556,  296.015442,  298.850769,  300.69281 ,  301.37915 ,
300.956238,  299.171387], dtype=float32)
Coordinates:
XLAT     (Time) float32 39.3899 39.3899 39.3899 39.3899 39.3899 39.3899 ...
XLONG    (Time) float32 -86.5853 -86.5853 -86.5853 -86.5853 -86.5853 ...
XTIME    (Time) datetime64[ns] 2012-09-01 2012-09-01T02:00:00 ...
Dimensions without coordinates: Time
Attributes:
FieldType:    104
MemoryOrder:  XY 
description:  TEMP at 2 M
units:        K
stagger:      

However, I am confused about how to select data by the physical coordinates (i.e. XLAT and XLONG), which correspond to actual latitude and longitude, since XLAT and XLONG are themselves multi-dimensional.

For example, what is the best way to get the data at the location (39.3807, -86.5972)?

Note: the location does not need to match a grid point exactly; a "nearest" match is fine.

A workaround is to use the function below to find the indices of the nearest grid cell, which you can then use to select your data from T2.

import numpy as np

# naive_fast finds the model grid cell whose center is nearest to (lat0, lon0)
def naive_fast(latvar, lonvar, lat0, lon0):
    # Copy the 2-D latitude and longitude variables into numpy arrays
    latvals = np.asarray(latvar[:])
    lonvals = np.asarray(lonvar[:])
    # Squared distance in degrees; enough to rank cells on a small domain
    dist_sq = (latvals - lat0)**2 + (lonvals - lon0)**2
    minindex_flattened = dist_sq.argmin()  # 1D index of min element
    # Convert the flat index back to (row, column) on the 2-D grid
    iy_min, ix_min = np.unravel_index(minindex_flattened, latvals.shape)
    return iy_min, ix_min

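It would be called like this:

clat = 39.3807
clon = -86.5972
# XLAT and XLONG also carry a Time dimension; take the first time step,
# assuming the grid does not move over time, so the arrays are 2-D
lat2d = t2.XLAT.isel(Time=0).values
lon2d = t2.XLONG.isel(Time=0).values
(c_y, c_x) = naive_fast(lat2d, lon2d, clat, clon)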
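To show how the returned indices can then be used for the selection, here is a minimal end-to-end sketch. It runs on a small synthetic DataArray rather than the real file, so the grid values, the target point and the helper name nearest_cell are all made up for illustration. It also assumes the latitude/longitude grid is identical at every time step.

```python
import numpy as np
import xarray as xr

# Synthetic stand-in for the WRF variable: a small curvilinear grid whose
# 2-D latitude/longitude repeat at every time step (an assumption here).
ntime, ny, nx = 3, 4, 5
lat2d = 39.0 + 0.1 * np.arange(ny)[:, None] + 0.01 * np.arange(nx)[None, :]
lon2d = -87.0 + 0.1 * np.arange(nx)[None, :] + 0.01 * np.arange(ny)[:, None]
dims = ("Time", "south_north", "west_east")
t2 = xr.DataArray(
    np.arange(ntime * ny * nx, dtype="float32").reshape(ntime, ny, nx),
    dims=dims,
    coords={
        "XLAT": (dims, np.broadcast_to(lat2d, (ntime, ny, nx)).copy()),
        "XLONG": (dims, np.broadcast_to(lon2d, (ntime, ny, nx)).copy()),
    },
    name="T2",
)


def nearest_cell(lat2d, lon2d, lat0, lon0):
    """Return (iy, ix) of the grid cell whose center is closest to (lat0, lon0).

    Hypothetical helper: same idea as naive_fast, restricted to 2-D inputs.
    """
    dist_sq = (lat2d - lat0) ** 2 + (lon2d - lon0) ** 2
    iy, ix = np.unravel_index(dist_sq.argmin(), dist_sq.shape)
    return int(iy), int(ix)


# Use the first time step of the coordinates, since the grid is assumed static.
c_y, c_x = nearest_cell(
    t2.XLAT.isel(Time=0).values, t2.XLONG.isel(Time=0).values, 39.215, -86.79
)

# Positional selection on the two horizontal dimensions keeps the Time axis.
point = t2.isel(south_north=c_y, west_east=c_x)
```

The result, point, is a one-dimensional DataArray along Time, with XLAT and XLONG reduced to the coordinates of the chosen cell. Since the distance is measured in degrees rather than on the sphere, treat the "nearest" cell as an approximation; it should be adequate for a small domain like the one in the question.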
