Nested for loops over NetCDF files in R

Question

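Many thanks in advance to anyone willing to help with the problem I'm facing. A warning up front: this is a complex topic, so I'll do my best to explain what my code is supposed to do. It concerns climate data in NetCDF files holding monthly temperature (tas) and precipitation (pr) values for the periods 1971-2000 and 2071-2100. The grid is 440x400 points (a map of Europe). The nc files for the future period contain a single 1x1 grid point (the city of interest). Each grid point has 360 temperature or precipitation values (depending on the model), one for every month of the 30-year period. In other words, each grid point has a distribution of 360 values. I now want to iteratively compute the statistical difference between the distribution of the single city grid point (2071-2100) and the distribution of every European grid point (1971-2000). This gives me a mean distance for each European grid point. The idea is to find the grid point in the European raster whose temperature or precipitation distribution is most similar to that of the city of interest in the future. I have to do this for 30 different climate models.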

# List filenames of the directory

hist.files <- list.files("/historical", full.names = TRUE)
rcp.files <- list.files("/rcp", full.names = TRUE)

#Create array for desired ‘similarity indices’. One matrix per climate model run.

sim.array <- array(NA, dim = c(440,400,30))

#Looping through the models of the period 1971-2000. Some containing precipitation data others temperature (see if…else) 

for(k in 1:length(hist.files)) {
  hist.data <- nc_open(hist.files[k])

  if(grepl("pr", hist.data$filename)) {
    hist.tas <- ncvar_get(hist.data, "pr")
  } else {
    hist.tas <- ncvar_get(hist.data, "tas")
    hist.tas <- kelvin.to.celsius(hist.tas, round=2)
  }

  #Looping through the models of the 2071 to 2100 period (city). Some containing precipitation data others temperature (see if…else)

  for(r in 1:length(rcp.files)) {
    rcp.data <- nc_open(rcp.files[r])
    if(grepl("pr", rcp.data$filename)) {
      rcp.tas <- ncvar_get(rcp.data, "pr")
    } else {
      rcp.tas <- ncvar_get(rcp.data, "tas")
      rcp.tas <- kelvin.to.celsius(rcp.tas, round=2)
    }

    #This if statement because hist contains more models than rcp and I want to exclusively use the models contained in both of them.

    if(hist.data %in% rcp.data) {

      #Looping through the grid points of ‘hist’ model k. Lastly the function that calculates for each grid point of the model a difference value (always to the one grid point of ‘rcp’). My idea of the break statement was to loop nrow and ncol the same times, but I’m not sure if break does what I intended to.

      for(i in 1:nrow(hist.tas)) {
        for(j in 1:ncol(hist.tas)) {
          sim.array[i,j,k] <- abs(sum(rcp.tas - hist.tas[i,j,])/360)
          break
        }
        print(sim.array[i,j,k])
      }
    }
  }
}
sim.array[1,1,1]

Well, I get an array full of NAs. There is no error message, but something is wrong! Can anyone spot the mistake? Thanks in advance for your help!

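Update: Your suggestions look like a good solution! I haven't had time to apply them yet, but I will later. I had been thinking about vectorization, but I didn't manage to turn the 3-dimensional array into vectors without ending up with messy code full of different vectors... I also don't know how to drop the models that don't match between hist and rcp. With intersect and %in% I can get the indices of the non-matching files... but there must be a better way than writing down all those indices by hand to delete them, right? Here are some of the historical file names: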

> hist.files.tas <- list.files("/historical", full.names = TRUE, pattern = "tas")
> hist.files.tas
 [1] "/historical/tas_CNRM-CERFACS-CNRM-CM5_CLMcom-CCLM4-8-17_r1i1p1.nc"   
 [2] "/historical/tas_CNRM-CERFACS-CNRM-CM5_CNRM-ALADIN53_r1i1p1.nc"       
 [3] "/historical/tas_CNRM-CERFACS-CNRM-CM5_RMIB-UGent-ALARO-0_r1i1p1.nc"  
 [4] "/historical/tas_CNRM-CERFACS-CNRM-CM5_SMHI-RCA4_r1i1p1.nc"           
 [5] "/historical/tas_ICHEC-EC-EARTH_CLMcom-CCLM4-8-17_r12i1p1.nc"         
 [6] "/historical/tas_ICHEC-EC-EARTH_DMI-HIRHAM5_r3i1p1.nc"                
 [7] "/historical/tas_ICHEC-EC-EARTH_KNMI-RACMO22E_r12i1p1.nc"             
 [8] "/historical/tas_ICHEC-EC-EARTH_KNMI-RACMO22E_r1i1p1.nc"              
 [9] "/historical/tas_ICHEC-EC-EARTH_SMHI-RCA4_r12i1p1.nc"                 
[10] "/historical/tas_IPSL-IPSL-CM5A-MR_INERIS-WRF331F_r1i1p1.nc"          
[11] "/historical/tas_IPSL-IPSL-CM5A-MR_SMHI-RCA4_r1i1p1.nc"               
[12] "/historical/tas_MOHC-HadGEM2-ES_CLMcom-CCLM4-8-17_r1i1p1.nc"         
[13] "/historical/tas_MOHC-HadGEM2-ES_KNMI-RACMO22E_r1i1p1.nc"             
[14] "/historical/tas_MOHC-HadGEM2-ES_SMHI-RCA4_r1i1p1.nc"

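There are more models with the variables tasmax and tasmin. hist has 71 files in total, while rcp has only 30. Could you give me an example of how to write code that automatically removes the non-matching hist files? Thanks a lot!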

Answer 1

As far as I can tell, the following makes no sense and is always false:

if (hist.data %in% rcp.data)

So nothing ever happens to sim.array.

I would start with something like this:

hist.files.pr <- list.files("/historical", full.names = TRUE, pattern="pr")
hist.files.tas <- list.files("/historical", full.names = TRUE, pattern="tas")
rcp.files.pr <- list.files("/rcp", full.names = TRUE, pattern="pr")
rcp.files.tas <- list.files("/rcp", full.names = TRUE, pattern="tas")
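One caveat with the patterns above: `pattern` is a regular expression, so "tas" also matches files for other variables whose names contain it, such as the tasmax and tasmin files mentioned in the question. A minimal sketch of the difference, using `grepl` on a made-up vector of names in place of a real directory (the tasmax, tasmin and pr file names here are assumed for illustration):

```r
# Hypothetical file names; only the first follows a name shown in the question.
files <- c("tas_MOHC-HadGEM2-ES_SMHI-RCA4_r1i1p1.nc",
           "tasmax_MOHC-HadGEM2-ES_SMHI-RCA4_r1i1p1.nc",
           "tasmin_MOHC-HadGEM2-ES_SMHI-RCA4_r1i1p1.nc",
           "pr_MOHC-HadGEM2-ES_SMHI-RCA4_r1i1p1.nc")

# The loose pattern picks up tas, tasmax and tasmin.
loose <- files[grepl("tas", files)]

# Anchoring at the start and including the underscore keeps only tas.
strict <- files[grepl("^tas_", files)]
```

If this matters for your data, the same anchored pattern can probably be passed to `list.files`, e.g. `pattern = "^tas_"`, since the pattern is applied to the file names rather than the full paths.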

At this point you can remove the files in "historical" for models that are not in "rcp".

hist.files.tas <- c(
    "/historical/tas_CNRM-CERFACS-CNRM-CM5_CLMcom-CCLM4-8-17_r1i1p1.nc",
    "/historical/tas_CNRM-CERFACS-CNRM-CM5_CNRM-ALADIN53_r1i1p1.nc",
    "/historical/tas_CNRM-CERFACS-CNRM-CM5_RMIB-UGent-ALARO-0_r1i1p1.nc",
    "/historical/tas_CNRM-CERFACS-CNRM-CM5_SMHI-RCA4_r1i1p1.nc",
    "/historical/tas_ICHEC-EC-EARTH_CLMcom-CCLM4-8-17_r12i1p1.nc",
    "/historical/tas_ICHEC-EC-EARTH_DMI-HIRHAM5_r3i1p1.nc",
    "/historical/tas_ICHEC-EC-EARTH_KNMI-RACMO22E_r12i1p1.nc",
    "/historical/tas_ICHEC-EC-EARTH_KNMI-RACMO22E_r1i1p1.nc",
    "/historical/tas_ICHEC-EC-EARTH_SMHI-RCA4_r12i1p1.nc",
    "/historical/tas_IPSL-IPSL-CM5A-MR_INERIS-WRF331F_r1i1p1.nc",
    "/historical/tas_IPSL-IPSL-CM5A-MR_SMHI-RCA4_r1i1p1.nc",
    "/historical/tas_MOHC-HadGEM2-ES_CLMcom-CCLM4-8-17_r1i1p1.nc",
    "/historical/tas_MOHC-HadGEM2-ES_KNMI-RACMO22E_r1i1p1.nc",
    "/historical/tas_MOHC-HadGEM2-ES_SMHI-RCA4_r1i1p1.nc")

# in this example, fut files is a subset of hist files; that should be OK if their filename structure is the same

rcp.files.tas <- hist.files.tas[1:7]

getModels <- function(ff) {
    base <- basename(ff)
    s <- strsplit(base, "_")
    sapply(s, function(i) i[[2]])
}

getHistModels <- function(hist, fut) {
    h <- getModels(hist)
    uh <- unique(h)
    uf <- unique(getModels(fut))
    uhf <- uh[uh %in% uf]
    hist[h %in% uhf]
}


hist.files.tas.selected <- getHistModels(hist.files.tas, rcp.files.tas)
# hist.files.pr.selected <- getHistModels(hist.files.pr, rcp.files.pr)

You can avoid the double loop (k, r) by doing something like this:

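library(raster)
# varname selects the NetCDF variable to read.
# hist.files.pr.selected comes from the commented-out getHistModels() call above.
his.pr <- values(stack(hist.files.pr.selected, varname="pr"))
his.tas <- values(stack(hist.files.tas.selected, varname="tas"))
rcp.pr <- values(stack(rcp.files.pr, varname="pr"))
rcp.tas <- values(stack(rcp.files.tas, varname="tas"))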

The (i, j) loops over rows and columns can be avoided as well. R is vectorized; that is, you can do things like (1:10) - 2.
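To make the vectorization point concrete, here is a minimal sketch on synthetic data (a 4 x 3 grid instead of 440 x 400; the array shapes and random values are assumptions, not the real files). It relies on the fact that `sum(rcp.tas - hist.tas[i,j,])/360` is simply the difference of the two means when both series have 360 values. It also runs the inner loop without the `break` from the question, which stops after the first column and leaves the rest of `sim.array` as NA.

```r
# Synthetic stand-in data (assumed shapes, not the real files):
# a small 4 x 3 grid with 360 monthly values per grid point.
set.seed(42)
hist.tas <- array(rnorm(4 * 3 * 360, mean = 10, sd = 5), dim = c(4, 3, 360))
rcp.tas  <- rnorm(360, mean = 12, sd = 5)  # the single city grid point

# Loop version of the metric used in the question, without the break.
sim.loop <- matrix(NA_real_, nrow = 4, ncol = 3)
for (i in seq_len(nrow(hist.tas))) {
  for (j in seq_len(ncol(hist.tas))) {
    sim.loop[i, j] <- abs(sum(rcp.tas - hist.tas[i, j, ]) / 360)
  }
}

# Vectorized version: rowMeans(..., dims = 2) averages over the third
# (time) dimension and returns a 4 x 3 matrix of per-grid-point means.
sim.vec <- abs(mean(rcp.tas) - rowMeans(hist.tas, dims = 2))

# Row and column index of the most similar grid point.
best <- which(sim.vec == min(sim.vec), arr.ind = TRUE)
```

With real data, grid points over the sea may be NA, so `na.rm = TRUE` in `rowMeans` and `min` might be needed.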

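Either way, code with all these nested loops is hard to read. If you really need them, it is better to call functions. For more help, provide some example data rather than files we don't have, or make some of the files available.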

Answer 2

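Because my dataset actually contains two more variables besides "tas" and "pr", namely "tasmax" and "tasmin", Robert's approach would mean a lot of writing in my case. So I tried a different approach and finally solved it, although it does not list the files of each variable separately (a drawback, yes!).

Listing and matching the hist and rcp files:

To match the files I need the bare file names without the directory; otherwise (hist %in% rcp) is always FALSE (as Robert pointed out).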

hist <- list.files("/historical")
rcp <- list.files("/rcp26")

no.match.h <- which(!hist %in% rcp)
no.match.r <- which(!rcp %in% hist)

Because nc_open needs the file names including the directory, I have to create the corresponding file lists and drop the non-matching files:

hist.files <- list.files("/data/scratch/lorchdav/cordex_eur/monmean/historical", full.names = TRUE)
rcp.files <- list.files("/data/scratch/lorchdav/cordex_ber_mean/rcp26", full.names = TRUE)

hist.files.cl <- hist.files[-no.match.h]
hist.files.cl

rcp.files.cl <- rcp.files[-no.match.r]
rcp.files.cl
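A possibly more compact variant of the same idea is to index with the logical vector directly instead of going through `which` and negative indices. This also sidesteps a known pitfall: if every file matches, `which` returns `integer(0)`, and `x[-integer(0)]` gives an empty vector rather than all of `x`. The sketch below uses made-up paths (assumed names, not the real directories) and assumes, as above, that matching files have identical base names in both directories.

```r
# Hypothetical file paths standing in for the real directories.
hist.files <- c("/historical/tas_A_r1i1p1.nc",
                "/historical/tas_B_r1i1p1.nc",
                "/historical/pr_A_r1i1p1.nc")
rcp.files  <- c("/rcp26/tas_A_r1i1p1.nc",
                "/rcp26/pr_A_r1i1p1.nc")

# Keep only files whose base name occurs in both directories.
hist.files.cl <- hist.files[basename(hist.files) %in% basename(rcp.files)]
rcp.files.cl  <- rcp.files[basename(rcp.files) %in% basename(hist.files)]

# Sort both by base name so that index k refers to the same model in each.
hist.files.cl <- hist.files.cl[order(basename(hist.files.cl))]
rcp.files.cl  <- rcp.files.cl[order(basename(rcp.files.cl))]
```

Once both vectors are aligned like this, a single loop over k can open `hist.files.cl[k]` and `rcp.files.cl[k]` together, so the inner loop over r and the `%in%` test inside it are no longer needed.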


Answer 1
1 2018-02-20 04:34:57

Answer 2
0 2018-03-02 11:11:36
