netCDF files in R

I have a netCDF file obtained from here, named precip.mon.total.v6.nc. I am using the ncdf package in R to open and analyse the file.

    > new <- open.ncdf("precip.mon.total.v6.nc")
    > new
    [1] "file precip.mon.total.v6.nc has 4 dimensions:"
    [1] "lat   Size: 360"
    [1] "lon   Size: 720"
    [1] "nbnds   Size: 2"
    [1] "time   Size: 1320"
    [1] "------------------------"
    [1] "file precip.mon.total.v6.nc has 1 variables:"
    [1] "float precip[lon,lat,time]  Longname:GPCC Monthly total of     precipitation Missval:-9.96920996838687e+36"

But when I extract the variable, I get this error:

    > get.var.ncdf(new, "precip")
    Error: cannot allocate vector of size 2.5 Gb
    In addition: Warning messages:
    1: In double(totvarsize) :
      Reached total allocation of 2047Mb: see help(memory.size)
    2: In double(totvarsize) :
      Reached total allocation of 2047Mb: see help(memory.size)
    3: In double(totvarsize) :
      Reached total allocation of 2047Mb: see help(memory.size)
    4: In double(totvarsize) :
      Reached total allocation of 2047Mb: see help(memory.size)

My questions are: (a) How can I handle the memory issue? (b) How can I change the resolution of this netCDF file from 0.5*0.5 to 0.25*0.25 degrees? I have tried a similar problem in MATLAB, which handles memory for netCDF files better than R, but changing the resolution is still a problem as I am not good at MATLAB. I would be very thankful for any help in this direction.

When you extract your variable, you need to specify which part of each dimension you want. Currently you're asking R to read everything, so I suspect it's creating a 3D array (lon x lat x time), which will be enormous.
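A quick back-of-the-envelope check supports this. The sketch below uses the dimension sizes printed in the question and assumes the values end up as 8-byte doubles in R (the warning mentions double(totvarsize), which suggests as much).

```r
# Dimension sizes as reported for precip[lon, lat, time]
n_lon  <- 720
n_lat  <- 360
n_time <- 1320

# Assumption: every value is held as an 8-byte double in R
n_values <- n_lon * n_lat * n_time
size_gib <- n_values * 8 / 1024^3

# For comparison: the size of a single time step
slice_mib <- n_lon * n_lat * 8 / 1024^2

print(round(size_gib, 2))
print(round(slice_mib, 2))
```

The full array comes to roughly 2.5 GiB, in line with the error message, whereas one time step is only about 2 MiB.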

The ncdf4 package generally supersedes ncdf, so you should try using that instead. You need to decide whether you want to read all time steps for one location or all locations for one time step. This is easier to envisage on a plain 2D grid:

  • Single cell at all time steps
  • All locations at a single time step

Yours is a 3D grid through time (albeit with a 3rd dimension of only two bands); however, it looks like your variable isn't using the bands dimension. Here's a 2D workflow based on ncdf4, ignoring your bands:

Package:

install.packages("ncdf4")
library(ncdf4)

Open connection:

nc = nc_open("~/dir/dir/file.nc")

For a grid at one time step

Read dimensions:

precip = list()
precip$x = ncvar_get(nc, "lon")
precip$y = ncvar_get(nc, "lat")

Read data (note that start is the index along each dimension at which to begin and count is how many values to read from that point, with -1 meaning all of them, so here we read the whole grid at the first time step):

precip$z = ncvar_get(nc, "precip", start=c(1, 1, 1), count=c(-1, -1, 1))
# Convert to a raster if required (needs the raster package)
library(raster)
precip.r = raster(precip)
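The same start and count arguments let you work through the file one time step at a time, so the full array never has to sit in memory. Here is a minimal sketch that accumulates a time mean this way. To keep it self-contained it first writes a tiny synthetic file (4 x 3 cells, 5 time steps) as a stand-in for the real data; the file layout, units and values are assumptions for illustration only.

```r
library(ncdf4)

# Build a tiny stand-in file (4 lon x 3 lat x 5 time); not the real GPCC data
lon  <- ncdim_def("lon",  "degrees_east",  c(0.25, 0.75, 1.25, 1.75))
lat  <- ncdim_def("lat",  "degrees_north", c(0.25, 0.75, 1.25))
time <- ncdim_def("time", "months since 1901-01-01", 0:4)
pr   <- ncvar_def("precip", "mm", list(lon, lat, time), missval = -9999)

path <- tempfile(fileext = ".nc")
nc <- nc_create(path, pr)
ncvar_put(nc, pr, array(1:60, dim = c(4, 3, 5)))
nc_close(nc)

# Re-open and read one time step per iteration
nc <- nc_open(path)
n_time <- nc$dim$time$len
total  <- matrix(0, nc$dim$lon$len, nc$dim$lat$len)
for (i in seq_len(n_time)) {
  # count = -1 means "everything along this dimension"
  slice <- ncvar_get(nc, "precip", start = c(1, 1, i), count = c(-1, -1, 1))
  total <- total + slice
}
time_mean <- total / n_time
nc_close(nc)
```

On the real file the loop would run over all 1320 time steps, but only one lon x lat slice plus the running total is held at any moment. Missing values would need handling (for example by counting non-NA values per cell) before dividing.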

To read a single cell at all time steps

You need to find your cell index; precip$x and precip$y will help. Once you have it (eg cell x=5 and y=10):

precip.cell = ncvar_get(nc, "precip", start=c(5, 10, 1), count=c(1, 1, -1))
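One way to find the cell index is to take the grid point nearest to a target coordinate. The sketch below assumes a regular 0.5 degree grid with cell centres running from 0.25 to 359.75 in longitude and from 89.75 down to -89.75 in latitude; that layout is an assumption about this file, so in practice use the precip$x and precip$y vectors read from it. The target coordinates are hypothetical.

```r
# Assumed coordinate vectors (stand-ins for precip$x and precip$y)
lon <- seq(0.25, 359.75, by = 0.5)
lat <- seq(89.75, -89.75, by = -0.5)

# Hypothetical point of interest
target_lon <- 77.2
target_lat <- 28.6

# Index of the nearest cell centre along each dimension
ix <- which.min(abs(lon - target_lon))
iy <- which.min(abs(lat - target_lat))

print(c(ix, iy))
print(c(lon[ix], lat[iy]))

# These indices then go into start, e.g.
# ncvar_get(nc, "precip", start = c(ix, iy, 1), count = c(1, 1, -1))
```

Check whether the file stores longitude as 0 to 360 or -180 to 180 before choosing the target longitude.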

(a) memory:

If you are on a Linux box (sudo apt-get install cdo), or on Windows with Cygwin installed, you can use cdo to help you out.

For example, if you are only interested in a specific date, you can select it first to keep the file size down (date below is a placeholder for the date you want):

cdo seldate,date in.nc out.nc 

or you might want to look at the time mean:

cdo timmean in.nc out.nc 

That will keep the file size down, and you can then open it in R to make your plot (or use ncview for a quick look).

(b) remapping:

cdo can also interpolate the file to 0.25 degrees (although I am not sure why you would want to do this, as you are not adding any information and you are making the files four times larger):

cdo remapcon,r1440x720 in.nc out.nc

or

cdo remapnn,r1440x720 in.nc out.nc

But as I said, if you want to interpolate in order to compare with another 0.25 degree product (eg TRMM), it is better to go the other way and interpolate the finer dataset to 0.5 degrees.
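To see why refining the grid adds no information, here is a minimal base R sketch on a hypothetical 4 x 3 grid. It splits each coarse cell into a 2 x 2 block of identical values, which is roughly what a nearest-neighbour remap would give when the fine grid nests exactly inside the coarse one (an assumption; it is not a reimplementation of cdo).

```r
# Hypothetical 4 x 3 coarse grid (stand-in for one 0.5 degree time step)
coarse <- matrix(1:12, nrow = 4, ncol = 3)

# Repeat every row and column index twice, so each coarse cell
# becomes a 2 x 2 block of identical values
fine <- coarse[rep(seq_len(nrow(coarse)), each = 2),
               rep(seq_len(ncol(coarse)), each = 2)]

print(dim(coarse))
print(dim(fine))

# Four times as many cells, but no new distinct values
print(length(fine) / length(coarse))
print(length(unique(as.vector(fine))))
```

The fine grid holds four times as many cells while the set of values is unchanged, which is the trade-off described above.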

By the way, v7 of GPCC was released in 2015, still at 0.5 degrees.
