R crashes/aborts using Rcpp with NA input

I want to process two raster images (Ra and Rb), where Ra supplies the pixel value itself and Rb supplies the values of its neighbors. Taking a sum as an example and assuming a 3*3 neighborhood, for each pixel in Ra I add its value to the values of the neighboring pixels in Rb, and in the end I get another image.

The R raster package provides a focal function, which works on only one input image. I tried to modify its C++ code to accept two input images using Rcpp. The modified code works well if there are no missing values in the input image Rb. However, R always aborts if there is an NA in Rb; specifically, it aborts at the second or third test. It may be similar to this post; however, it does not crash when there is no NA in the input Rb. It seems I did not handle NA correctly. I do not have deep knowledge of C++. Can somebody help me check this?

Here is my cpp file:

#include <Rcpp.h>
#include <R.h>
#include <Rinternals.h>
#include <stdio.h>
#include <stdlib.h>
#include <math.h>
#include <Rmath.h>
#include "Rdefines.h"
#include "R_ext/Rdynload.h"

using namespace Rcpp;
// [[Rcpp::export]]
NumericVector focal_quantile(NumericVector xd, int ngbb, NumericVector sf) {
  // the images are passed in as vectors; ngbb is the size of the window
  R_len_t i, j, k, q;
  int wrows = ngbb;
  int wcols = ngbb;
  int wn = wrows * wcols;

  int nrow = 6; // the input raster has 6 rows
  int ncol = 7; // the input raster has 7 cols

  int n = nrow * ncol;
  NumericVector xans(n);
  NumericVector xx(wn);

  int wr = floor(wrows / 2);
  int wc = floor(wcols / 2);

  int nwc = ncol - wc - 1;
  int col = 0;

  // first rows
  for (i = 0; i < ncol*wr; i++) {// the first row, the resutl is set as NA as the neighbor does not have nine values   
    xans[i] = R_NaReal; 
  }

  for (i = ncol*wr; i < (ncol * (nrow-wr)); i++) {//start from the second row
    col = i % ncol;
    if ((col < wc) | (col > nwc)) {//the first pixel of the second is also set as NA
      xans[i] = R_NaReal;
    } else {// to get the nine values in the 3*3 windows
      q = 0;
      for (j = -wr; j <= wr; j++) {
        for (k = -wc; k <= wc; k++) {
          xx[q] = xd[j * ncol + k + i]; 
          q++;
        }
      }
      xx = na_omit(xx);
      int n_qt = xx.size();
      if (n_qt > 0){//
        xans[i]=sum(xx)+100*sf[i];// here is the calculation, my goal is more complicated than this example
      } else {
        xans[i] = R_NaReal;//R_NaReal
      }

    }
  }
  // last rows
  for (i = ncol * (nrow-wr); i < n; i++) {  
    xans[i] = R_NaReal;
  }
  return(xans);
}

Then compile it using sourceCpp.

Generate example data to test it:

  rr=raster(nrow=6,ncol=7)## example for Ra
  projection(rr)="+proj=lcc +lat_1=48 +lat_2=33 +lon_0=-100 +ellps=WGS84"
  rr[]=(2:43)*10
  rrqt=rr/43 ## example for Rb
  ## it works fine if there is no NA in Ra
  #rr[1:10]=NA # with such NAs, the Global Environment window keeps refreshing and then R aborts
  focal_quantile(rr[],3,rrqt[])

Example results:

 [1]       NA       NA       NA       NA       NA       NA       NA       NA 118918.6 130810.5 142702.3 154594.2 166486.0       NA       NA
[16] 202161.6 214053.5 225945.3 237837.2 249729.1       NA       NA 285404.7 297296.5 309188.4 321080.2 332972.1       NA       NA 368647.7
[31] 380539.5 392431.4 404323.3 416215.1       NA       NA       NA       NA       NA       NA       NA       NA

The resulting NAs are acceptable, as those windows do not contain nine values. In this example the raster rr has no NA, and everything works smoothly. When I introduce NA into rr, for example with the commented-out sixth line of the code above, the Global Environment window keeps refreshing and RStudio aborts.

The session information is:

R version 3.3.0 (2016-05-03)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows >= 8 x64 (build 9200)

locale:
[1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United States.1252    LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C                           LC_TIME=English_United States.1252    

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] Rcpp_0.12.11 raster_2.5-8 sp_1.2-3    

loaded via a namespace (and not attached):
[1] rgdal_1.2-5     tools_3.3.0     grid_3.3.0      lattice_0.20-35

Thank you very much!

First of all, you should only be using the #include <Rcpp.h> statement. The other headers you are adding are either not needed or already included within Rcpp.h.


Secondly, the conventional way to reference the NA value for NumericVectors within Rcpp is to use NA_REAL, not R's R_NaReal.


Thirdly, you have an out-of-bounds error. If you switch the element accessor from [] to (), you get bounds checking. The error on Rcpp 0.12.11 is:

"Index out of bounds: [index=3; extent=3]."

As a result, the unchecked access is undefined behavior (UB), which triggers the crash of RStudio.

The problematic line is:
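To illustrate the difference between checked and unchecked access outside of Rcpp, here is a minimal sketch in plain C++. It assumes that `std::vector::at` is a fair stand-in for Rcpp's `()` accessor, in that both report an out-of-range index instead of silently touching memory past the end; `checked_write` is a hypothetical helper, not part of Rcpp.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical helper: writes `value` at index `idx` with bounds checking.
// Returns false when idx is past the end, instead of corrupting memory.
bool checked_write(std::vector<double>& v, std::size_t idx, double value) {
  try {
    v.at(idx) = value;  // at() throws std::out_of_range; v[idx] would not check
    return true;
  } catch (const std::out_of_range&) {
    return false;
  }
}
```

With a vector of length 3, writing to index 3 is rejected, which mirrors the "index=3; extent=3" message quoted above.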

xx(q) = xd(j * ncol + k + i); 
^^^^^

Now, you might say this doesn't make sense, as the length of xx should never be 3. However, this line is problematic because the length of xx changes when you drop the NA values and assign the result back to xx with:

xx = na_omit(xx);

After this assignment, xx can be shorter than nine elements, so filling the next window writes past its end. You should really declare a new vector for the NA-free values if this is the aim, or update the constants to ensure the out-of-bounds error is avoided.
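The "new vector" idea can be sketched in plain C++ as follows. This is only an illustration under the assumption that NaN stands in for R's NA; `drop_nan` is a hypothetical helper, not Rcpp's `na_omit`. The point is that the window buffer keeps its full length while the filtered values live in a separate vector.

```cpp
#include <cmath>
#include <vector>

// Hypothetical helper: copies the non-NaN entries of `window` into a new
// vector and leaves `window` itself untouched, so its length never shrinks.
std::vector<double> drop_nan(const std::vector<double>& window) {
  std::vector<double> kept;
  kept.reserve(window.size());
  for (double v : window) {
    if (!std::isnan(v)) {
      kept.push_back(v);
    }
  }
  return kept;
}
```

Because the buffer is passed by const reference, it can be refilled with nine values for the next pixel no matter how many entries were dropped for the previous one.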


Implementation

#include <Rcpp.h>

// [[Rcpp::export]]
Rcpp::NumericVector focal_quantile(Rcpp::NumericVector xd,
                                   int ngbb,
                                   Rcpp::NumericVector sf) {
  // the images are passed in as vectors; ngbb is the size of the window
  R_len_t i, j, k, q;
  int wrows = ngbb;
  int wcols = ngbb;
  int wn = wrows * wcols;

  int nrow = 6; // the input raster has 6 rows
  int ncol = 7; // the input raster has 7 cols

  int n = nrow * ncol;
  Rcpp::NumericVector xans(n);
  Rcpp::NumericVector xx(wn);

  int wr = floor(wrows / 2);
  int wc = floor(wcols / 2);

  int nwc = ncol - wc - 1;
  int col = 0;

  // first rows
  for (i = 0; i < ncol*wr; i++) { // first row: the result is set to NA because the window does not have nine values
    xans[i] = NA_REAL; 
  }

  for (i = ncol*wr; i < (ncol * (nrow-wr)); i++) {//start from the second row
    col = i % ncol;
    if ((col < wc) || (col > nwc)) { // the first and last pixels of each row are also set to NA
      xans[i] = NA_REAL;
    } else {// to get the nine values in the 3*3 windows
      q = 0;
      for (j = -wr; j <= wr; j++) {
        for (k = -wc; k <= wc; k++) {
          xx[q] = xd[j * ncol + k + i]; 
          q++;
        }
      }
      Rcpp::NumericVector xx_subset = na_omit(xx);
      int n_qt = xx_subset.size();
      if (n_qt > 0){//
        xans[i]=sum(xx_subset)+100*sf[i];// here is the calculation, my goal is more complicated than this example
      } else {
        xans[i] = NA_REAL;//NA_REAL
      }

    }
  }

  // last rows
  for (i = ncol * (nrow-wr); i < n; i++) {  
    xans[i] = NA_REAL;
  }
  return(xans);
}

Test case:

library("raster")
rr = raster(nrow=6,ncol=7)## example for Ra
projection(rr) = "+proj=lcc +lat_1=48 +lat_2=33 +lon_0=-100 +ellps=WGS84"
rr[] = (2:43)*10
rrqt = rr/43 ## example for Rb
rr[1:10] = NA 
focal_quantile(rr[],3,rrqt[])

Output:

 [1]        NA        NA        NA        NA        NA        NA        NA        NA  742.5581  915.8140 1099.0698 1292.3256
[13] 1375.5814        NA        NA 1625.3488 1828.6047 2041.8605 2265.1163 2378.3721        NA        NA 2718.1395 2831.3953
[25] 2944.6512 3057.9070 3171.1628        NA        NA 3510.9302 3624.1860 3737.4419 3850.6977 3963.9535        NA        NA
[37]        NA        NA        NA        NA        NA        NA
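The function above still hard-codes a 6 x 7 raster. As a rough sketch of how the same calculation could take the dimensions as arguments and avoid the window buffer altogether, here is a plain C++ version. It is an illustration under stated assumptions, not the raster package's algorithm: `focal_sum_sketch` and its parameters are hypothetical, the rasters are assumed to be stored row by row, and NaN stands in for R's NA.

```cpp
#include <cmath>
#include <cstddef>
#include <limits>
#include <stdexcept>
#include <vector>

// Hypothetical helper: for each interior cell, sums the non-NaN values of xd
// in the ngb x ngb window around it and adds 100 * sf at that cell.
// Border cells, and cells whose window holds no valid value, are NaN.
std::vector<double> focal_sum_sketch(const std::vector<double>& xd,
                                     const std::vector<double>& sf,
                                     int nrow, int ncol, int ngb) {
  const std::size_t n = static_cast<std::size_t>(nrow) * ncol;
  if (nrow <= 0 || ncol <= 0 || ngb <= 0 || ngb % 2 == 0 ||
      xd.size() != n || sf.size() != n) {
    throw std::invalid_argument("dimensions do not match the input vectors");
  }
  const double na = std::numeric_limits<double>::quiet_NaN();
  const int half = ngb / 2;
  std::vector<double> out(n, na);  // border cells keep this NaN

  for (int r = half; r < nrow - half; ++r) {
    for (int c = half; c < ncol - half; ++c) {
      double total = 0.0;
      int kept = 0;  // number of non-NaN values seen in this window
      for (int dr = -half; dr <= half; ++dr) {
        for (int dc = -half; dc <= half; ++dc) {
          const double v =
              xd[static_cast<std::size_t>(r + dr) * ncol + (c + dc)];
          if (!std::isnan(v)) {
            total += v;
            ++kept;
          }
        }
      }
      const std::size_t i = static_cast<std::size_t>(r) * ncol + c;
      if (kept > 0) {
        out[i] = total + 100.0 * sf[i];
      }
    }
  }
  return out;
}
```

Since the loops only visit interior cells, every index stays inside the input vectors, and no intermediate vector ever changes length. Fed the same values as the test case above, with NaN in the first ten cells of xd, the ninth cell comes out as about 742.558, matching the output shown.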

Side note

If you look at the code you are trying to translate, note that there is an naonly part followed by na components. So, the translation is not necessarily 1-1.
