xarray - find data that is not 0 in a multi-dimensional xarray object with massive data efficiently

My DataArray object is as below:

print(da_criteria_1or0_hourly)

<xarray.DataArray (time: 8760, latitude: 106, longitude: 193)>
dask.array<shape=(8760, 106, 193), dtype=int32, chunksize=(744, 106, 193)>
Coordinates:
  * latitude   (latitude) float32 -39.2 -39.149525 ... -33.950478 -33.9
  * longitude  (longitude) float32 140.8 140.84792 140.89584 ... 149.95209 150.0
  * time       (time) datetime64[ns] 2017-01-01 ... 2017-12-31T23:00:00

Each value is either 0 or 1. The array is large: 8760 × 106 × 193 = 179,212,080 values.

I want to get the time, latitude and longitude of every point where the data equals 1.

I tried using .sel inside nested loops, but it was extremely slow because it performs one lookup per element:

for time_elem in da_criteria_1or0_hourly.coords['time'].values:
    for lat_elem in da_criteria_1or0_hourly.coords['latitude'].values:
        for lon_elem in da_criteria_1or0_hourly.coords['longitude'].values:
            val = da_criteria_1or0_hourly.sel(time=time_elem,latitude=lat_elem,longitude=lon_elem).values
            if (val == 1):
                print(time_elem, lat_elem, lon_elem, val)

Is there a more efficient way?

You may want to have a look at the stack function. It collapses the time, latitude and longitude dimensions into a single stacked dimension, so you can then keep only the entries that meet your criterion with one boolean mask. I have not tested it with a very large dataset, but it avoids the triple for-loop, so it might give you a speed boost.

The code structure would look like:

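    # Collapse the three dimensions into one stacked dimension "z"
    newArr = da_criteria_1or0_hourly.stack(z=('time', 'latitude', 'longitude'))
    # Keep only the entries equal to 1 (.values loads the dask array into memory)
    newArr2 = newArr[newArr.values == 1]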

Here newArr is your original array in stacked form, and newArr2 contains only the observations where the data equals 1. It should still carry the coordinates (although perhaps in a messy format).
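
If you only need the matching coordinates rather than a filtered DataArray, a plain NumPy lookup may also work. This is a rough sketch, not part of the answer above and not tested on the full-size data: the helper name coords_where_equal and the small synthetic arrays are made up for illustration. With the real object you would presumably pass da_criteria_1or0_hourly.values and the .values of each coordinate, which loads the whole int32 array (roughly 717 MB) into memory.

```python
import numpy as np

def coords_where_equal(values, time, latitude, longitude, target=1):
    """Return the time, latitude and longitude of every entry equal to target.

    Hypothetical helper for illustration; `values` must have shape
    (len(time), len(latitude), len(longitude)).
    """
    # np.nonzero returns one index array per dimension for every True entry
    t_idx, lat_idx, lon_idx = np.nonzero(values == target)
    # Index each coordinate array with the matching positions
    return time[t_idx], latitude[lat_idx], longitude[lon_idx]

# Small synthetic stand-in for the real data:
# 3 time steps, 2 latitudes, 4 longitudes
time = np.array(
    ["2017-01-01T00", "2017-01-01T01", "2017-01-01T02"],
    dtype="datetime64[ns]",
)
latitude = np.array([-39.2, -33.9], dtype=np.float32)
longitude = np.array([140.8, 143.0, 146.0, 150.0], dtype=np.float32)

values = np.zeros((3, 2, 4), dtype=np.int32)
values[0, 1, 2] = 1
values[2, 0, 3] = 1

hit_time, hit_lat, hit_lon = coords_where_equal(values, time, latitude, longitude)
for t, lat, lon in zip(hit_time, hit_lat, hit_lon):
    print(t, lat, lon)
```

The comparison and the index lookup each run once over the whole array instead of once per element, which is where the speed-up over the nested .sel loop would come from. If the array does not fit in memory, the same function could be applied one chunk of time steps at a time.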
