Python - xarray mean between two netcdf files

I have yearly nc files, each of which contains daily min and max temperature data.

What I want to do is obtain the average temperature from those two variables.

I thought this would be easier with xarray. I've managed to merge all the files into one like this:

import xarray

# combine='by_coords' orders the files by their coordinate values, so
# concat_dim is not needed (recent xarray versions reject the combination).
tmin = xarray.open_mfdataset('TMIN*.nc', combine='by_coords')
tmax = xarray.open_mfdataset('TMAX*.nc', combine='by_coords')

Then I tried something like: tavg = (tmax + tmin) / 2

But I got a dataset with no data variables (shown below):

<xarray.Dataset>
Dimensions:  (lat: 294, lon: 402, time: 25567)
Coordinates:
  * lat      (lat) float32 11.9125 11.995833 12.079166 ... 36.245834 36.329166
  * lon      (lon) float32 -119.4375 -119.354164 ... -86.104164 -86.020836
  * time     (time) datetime64[ns] 1950-01-01 1950-01-02 ... 2019-12-31
Data variables:
    *empty*

How can I get the mean of the two variables for each day?

As suggested, here are the summaries for both tmax and tmin:

<xarray.Dataset>
Dimensions:  (lat: 294, lon: 402, time: 25567)
Coordinates:
  * lon      (lon) float32 -119.4375 -119.354164 ... -86.104164 -86.020836
  * lat      (lat) float32 11.9125 11.995833 12.079166 ... 36.245834 36.329166
  * time     (time) datetime64[ns] 1950-01-01 1950-01-02 ... 2019-12-31
Data variables:
    TMAX     (time, lat, lon) float32 dask.array<chunksize=(365, 294, 402), meta=np.ndarray>


<xarray.Dataset>
Dimensions:  (lat: 294, lon: 402, time: 25567)
Coordinates:
  * lon      (lon) float32 -119.4375 -119.354164 ... -86.104164 -86.020836
  * lat      (lat) float32 11.9125 11.995833 12.079166 ... 36.245834 36.329166
  * time     (time) datetime64[ns] 1950-01-01 1950-01-02 ... 2019-12-31
Data variables:
    TMIN     (time, lat, lon) float32 dask.array<chunksize=(365, 294, 402), meta=np.ndarray>

I think your problem is that Tmin and Tmax are Datasets and not DataArrays.

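If you add two Datasets together, xarray pairs up the data variables by name and only operates on names that appear in both. A Dataset can hold many variables, so xarray has no other way of knowing which ones belong together. Your two datasets share no variable name (TMAX in one, TMIN in the other), so the result keeps the coordinates but has no data variables.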
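To illustrate the name matching, here is a minimal sketch on toy data. The grid, the constant values, and the common variable name "T" are made up for the example and are not taken from the question's files.

```python
import numpy as np
import xarray as xr

# Hypothetical toy data: two single-variable Datasets on the same 2x3 grid.
coords = {"lat": [0.0, 1.0], "lon": [10.0, 20.0, 30.0]}
ds_min = xr.Dataset({"TMIN": (("lat", "lon"), np.full((2, 3), 10.0))}, coords=coords)
ds_max = xr.Dataset({"TMAX": (("lat", "lon"), np.full((2, 3), 20.0))}, coords=coords)

# No variable name is shared, so the result keeps the coordinates
# but has no data variables.
mismatched = (ds_max + ds_min) / 2
n_mismatched = len(mismatched.data_vars)

# Renaming both variables to a common (made-up) name lets xarray pair them.
matched = (ds_max.rename({"TMAX": "T"}) + ds_min.rename({"TMIN": "T"})) / 2
print(n_mismatched, list(matched.data_vars))
```

Renaming works, but it is usually simpler to pull out the DataArrays and operate on those.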

To solve this, simply select the variables inside the datasets that you would like to add.

import numpy as np
import xarray as xr

# Build a small lat/lon grid with random values as stand-ins for the real data
lon = np.arange(129.4, 153.75 + 0.05, 0.25)
lat = np.arange(-43.75, -10.1 + 0.05, 0.25)

tmin_values = 10 * np.random.rand(len(lat), len(lon))
tmax_values = 10 * np.random.rand(len(lat), len(lon))

Tmin = xr.Dataset({"Tmin": (["lat", "lon"], tmin_values)}, coords={"lon": lon, "lat": lat})
Tmax = xr.Dataset({"Tmax": (["lat", "lon"], tmax_values)}, coords={"lon": lon, "lat": lat})

# Check that the datasets are not empty
print(Tmin)
print(Tmax)

# Dataset + Dataset: no shared variable names, so the result has no data variables
tavg = (Tmax + Tmin) / 2
print(tavg)

# Selecting the variables gives DataArrays, which can be averaged directly
tavg = (Tmax["Tmax"] + Tmin["Tmin"]) / 2
print(tavg)
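Applied to the layout in the question, with dimensions (time, lat, lon) and variables named TMAX and TMIN, the same idea might look like the sketch below. The small constant arrays stand in for the real files, and the output name "TAVG" is an assumption chosen for the example.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Hypothetical stand-ins for the real files: 3 days on a 2x2 grid.
time = pd.date_range("1950-01-01", periods=3)
coords = {"time": time, "lat": [12.0, 12.5], "lon": [-119.0, -118.5]}
dims = ("time", "lat", "lon")
tmin = xr.Dataset({"TMIN": (dims, np.full((3, 2, 2), 10.0, dtype="float32"))}, coords=coords)
tmax = xr.Dataset({"TMAX": (dims, np.full((3, 2, 2), 20.0, dtype="float32"))}, coords=coords)

# Select the DataArrays, average them, and give the result a name.
tavg = ((tmax["TMAX"] + tmin["TMIN"]) / 2).rename("TAVG")

# Optionally wrap the result back into a Dataset, e.g. before saving.
tavg_ds = tavg.to_dataset()
print(tavg_ds)
```

With datasets opened through open_mfdataset the variables are dask-backed, so the same expression should stay lazy until the values are actually needed, for example when writing the result to disk.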
