
Read NetCDF file in Python

I am trying to read a NetCDF file from the IRI/LDEO Climate Data Library (dust_pm25_sconc10_mon), but I am having a problem reading it. When I select the variables that make up the dataset (longitude (X), latitude (Y) and time (T)), the output for X and Y is just a sequence running up to the number of observations (1, 2, ..., 139, for example). That is, the longitude and latitude values are not exported correctly.

Could someone help me with this problem? I have already tried reading this file with R, Python and QGIS, and in all three the output for X and Y is the same.

My code is below (Python).

Thank you all very much.

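from netCDF4 import Dataset

filestr = 'dust_pm25_sconc10_mon.nc'

# Open the file read-only and list its variables with their attributes
ncfile = Dataset(filestr, 'r')
print(ncfile.variables)

# Read each coordinate variable in full and show its values
lat = ncfile.variables['Y'][:]
print(lat)

lon = ncfile.variables['X'][:]
print(lon)

time = ncfile.variables['T'][:]
print(time)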

Edit:

This file has three independent variables: X, Y, and T. The values of X and Y intentionally run from 1 to len(X) and len(Y), respectively.

Look at the description of the file: http://iridl.ldeo.columbia.edu/home/.nasa_roses_a19/.Dust_model/.dust_mon_avg/.dust_pm25_sconc10_mon/

Independent Variables (Grids)
Time
grid: /T (months since 1960-01-01) ordered (Mar 1979) to (Mar 2010) by 1.0 N= 373 pts :grid
Longitude
grid: /X (unitless) ordered (1.0) to (191.0) by 1.0 N= 191 pts :grid
Latitude
grid: /Y (unitless) ordered (1.0) to (139.0) by 1.0 N= 139 pts :grid

Of course, this might be meaningful for longitude (values up to 191 could pass as degrees east), but for latitude it is nonsense, since latitudes cannot exceed 90 degrees. Unfortunately, I did not find any hint as to which area of the planet this dataset is supposed to describe.

However, I also did not find any data in its only dependent variable, dust_pm25_sconc10_mon: it is empty.
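
Unlike X and Y, the T grid in this description does carry usable units (months since 1960-01-01). A minimal sketch of turning such offsets into calendar months, in plain Python. The helper name months_since_to_year_month and the sample offsets are assumptions for illustration, since the raw T values are not shown here; the file may well store mid-month values such as 230.5, which the floor below handles.

```python
import math

def months_since_to_year_month(offset, base_year=1960, base_month=1):
    """Convert an offset in 'months since <base_year>-<base_month>-01' to (year, month)."""
    # Floor first, so that a mid-month value such as 230.5 counts as month 230
    whole_months = math.floor(offset) + (base_month - 1)
    year = base_year + whole_months // 12
    month = whole_months % 12 + 1
    return year, month

# Hypothetical offsets (assumed, not read from the file):
# 373 monthly steps means the last one is 372 months after the first
first = months_since_to_year_month(230.5)
last = months_since_to_year_month(230.5 + 372)
print(first, last)  # (1979, 3) (2010, 3)
```

With these assumed offsets the end points agree with the "Mar 1979 to Mar 2010, N= 373" line of the description.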

PS: Just as an example, this dataset here
http://iridl.ldeo.columbia.edu/home/.nasa_roses_a19/.Dust_model/.RegDustModelProjected/.dust_pm25_sconc10/datafiles.html
looks much more reasonable.

The description alone is much more promising:

Independent Variables (Grids)
Time (time)
grid: /T (days since 2009-01-02 00:00) ordered (0130-0430 2 Jan 2009) to (2230 1 Apr 2010 - 0130 2 Apr 2010) by 0.125 N= 3640 pts :grid
Longitude
grid: /X (degree_east) ordered (19.6875W) to (54.6875E) by 0.625 N= 120 pts :grid
Latitude
grid: /Y (degree_north) ordered (0.3125N) to (39.6875N) by 0.625 N= 64 pts :grid

And its dependent variable dust_pm25_sconc10 is not empty either.
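
As a quick sanity check on this second description, the coordinate axes can be rebuilt from the listed start, step and count, and the last value compared with the listed end. A minimal sketch in plain Python; build_axis is a hypothetical helper, and west longitudes are written as negative degrees east.

```python
def build_axis(start, step, count):
    """Return `count` evenly spaced coordinate values beginning at `start`."""
    return [start + i * step for i in range(count)]

# Start, step and count taken from the grid description above
lon = build_axis(-19.6875, 0.625, 120)  # 19.6875W to 54.6875E
lat = build_axis(0.3125, 0.625, 64)     # 0.3125N to 39.6875N

print(lon[0], lon[-1])  # -19.6875 54.6875
print(lat[0], lat[-1])  # 0.3125 39.6875
```

Both end points match the description, which is what one would expect from a real longitude/latitude grid, in contrast to the 1..N index values of the first file.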


I really tried to find this file on the website you mentioned, but in my opinion it is futile. So, without having seen it, I have to guess:

NetCDF files can save space by scaling and shifting the values of any variable so that they can be stored, e.g., as int instead of float.
You could simply check whether the variables have an add_offset attribute other than 0 or a scale_factor attribute other than 1.

For further information about this concept, see https://www.unidata.ucar.edu/software/netcdf/workshops/2010/bestpractices/Packing.html.

The information in the link above states that the Java interface to NetCDF applies these attributes automatically, so it is worth checking whether your Python reader does the same. As far as I know, netcdf4-python also applies scale_factor and add_offset by default, and this can be switched off per variable with set_auto_maskandscale(False). If automatic scaling is switched off, you have to rescale and offset the data back to the original values yourself, as described.
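
To make the packing convention concrete: a stored value is unpacked as packed * scale_factor + add_offset. Below is a minimal sketch in plain Python, not tied to any NetCDF library. The attribute dictionary and the numbers in it are made up for illustration, and is_packed and unpack are hypothetical helper names.

```python
def is_packed(attrs):
    """True if applying the attributes would change the stored values."""
    return attrs.get("scale_factor", 1.0) != 1.0 or attrs.get("add_offset", 0.0) != 0.0

def unpack(packed_values, attrs):
    """Apply unpacked = packed * scale_factor + add_offset to each value."""
    scale = attrs.get("scale_factor", 1.0)
    offset = attrs.get("add_offset", 0.0)
    return [value * scale + offset for value in packed_values]

# Made-up example: values stored as small integers
attrs = {"scale_factor": 0.5, "add_offset": 10.0}
stored = [0, 1, 2, 3]

print(is_packed(attrs))                      # True
print(unpack(stored, attrs))                 # [10.0, 10.5, 11.0, 11.5]
print(is_packed({"units": "degree_east"}))   # False
```

If is_packed is False for a coordinate variable, as in the last line, packing cannot explain coordinate values that look like plain indices.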

However, you could also consider trying xarray, a library that implements the n-dimensional data structure of NetCDF files. As far as I have experienced, it applies scaling and offsets automatically according to the rules described above.
http://xarray.pydata.org/en/stable/

The example file at http://iridl.ldeo.columbia.edu/home/.nasa_roses_a19/.Dust_model/.dust_mon_avg/.dust_pm25_sconc10_mon/datafiles.html that you linked in your comment on SpghttCd's response is not well-formed. For one thing, the X and Y arrays do not have units attributes appropriate to such dimensions; instead, both have the value "units". And, as already noted, the values in the arrays don't "look" valid anyway. Further, the values in the dust_pm25_sconc10_mon array in that file all appear to be NaN.

On the other hand, the example dataset at http://iridl.ldeo.columbia.edu/home/.nasa_roses_a19/.Dust_model/.RegDustModelProjected/.dust_pm25_sconc10/datafiles.html that SpghttCd references has good units attribute information ("degrees_east" and "degrees_north", respectively). Furthermore, the actual values in the X and Y arrays look good. I had no problem making a plot of the dust_pm25_sconc10 variable in that dataset (using Panoply) and seeing the data mapped over the appropriate region.

SpghttCd's comments regarding scaling and offsets do not apply here, as the longitudes and latitudes in that second, good file are actual lon and lat values.
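
The units check described above can be sketched as follows. The accepted spellings are the ones the CF conventions list for longitude and latitude coordinates; the function name classify_axis_units is hypothetical, and the strings tested are the ones quoted in this thread.

```python
# Unit spellings the CF conventions accept for horizontal coordinates
LON_UNITS = {"degrees_east", "degree_east", "degree_E", "degrees_E", "degreeE", "degreesE"}
LAT_UNITS = {"degrees_north", "degree_north", "degree_N", "degrees_N", "degreeN", "degreesN"}

def classify_axis_units(units):
    """Return 'longitude', 'latitude', or None for an unrecognised units string."""
    if units in LON_UNITS:
        return "longitude"
    if units in LAT_UNITS:
        return "latitude"
    return None

for units in ("degree_east", "degrees_north", "units", "unitless"):
    print(units, "->", classify_axis_units(units))
# degree_east -> longitude
# degrees_north -> latitude
# units -> None
# unitless -> None
```

A coordinate variable whose units fall through to None, as in the first file, gives plotting tools no way to tell that it is meant to be a longitude or latitude.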
