Should I prefer Rcpp::NumericVector over std::vector?

Question

Is there any reason why I should prefer Rcpp::NumericVector over std::vector<double>?

For example, the two functions below

#include <Rcpp.h>

// [[Rcpp::export]]
Rcpp::NumericVector foo(const Rcpp::NumericVector& x) {
  Rcpp::NumericVector tmp(x.length());
  for (int i = 0; i < x.length(); i++)
    tmp[i] = x[i] + 1.0;
  return tmp;
}

// [[Rcpp::export]]
std::vector<double> bar(const std::vector<double>& x) {
  std::vector<double> tmp(x.size());
  for (int i = 0; i < x.size(); i++)
    tmp[i] = x[i] + 1.0;
  return tmp;
}

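are equivalent in how they work and in benchmarked performance. I understand that Rcpp offers sugar and vectorized operations, but if the goal is only to take an R vector as input and return a vector as output, does it make any difference which one I use? Can using std::vector<double> lead to any problems when interacting with R?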

Answer 1

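"... are equivalent in how they work and in benchmarked performance."

I doubt that the benchmarks are accurate, because going from a SEXP to a std::vector<double> requires a deep copy from one data structure to another. (And as I was typing this, @DirkEddelbuettel ran a microbenchmark.)
The markup on an Rcpp object (e.g. const Rcpp::NumericVector& x) is just visual sugar. By default, the object passed in is a pointer to the underlying R data, so a modification can easily ripple out to other objects (see below). Thus there is no true counterpart to const std::vector<double>& x, which effectively "locks" the data and passes it by reference.

"Can using std::vector<double> lead to any problems when interacting with R?"

In short, no. The only penalty paid is the transfer between objects.

The upside of this transfer is value semantics: whereas modifying a NumericVector that was assigned to another NumericVector updates both, each std::vector<T> is an independent copy of the other. Therefore, the following cannot happen with std::vector: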

#include <Rcpp.h>
using namespace Rcpp;

// [[Rcpp::export]]
void test_copy() {
    NumericVector A = NumericVector::create(1, 2, 3);
    NumericVector B = A;  // B refers to the same R vector as A; no data is copied

    Rcout << "Before: " << std::endl << "A: " << A << std::endl << "B: " << B << std::endl;

    A[1] = 5; // 2 -> 5, visible through B as well

    Rcout << "After: " << std::endl << "A: " << A << std::endl << "B: " << B << std::endl;
}

Gives:

test_copy()
# Before: 
# A: 1 2 3
# B: 1 2 3
# After: 
# A: 1 5 3
# B: 1 5 3
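
For contrast, here is a minimal sketch of the same experiment with std::vector<double>, written in plain standard C++ (no Rcpp) so that it can be compiled outside R. The function name copy_then_modify is hypothetical and used only for illustration. Because copy construction duplicates the elements, changing A should leave B untouched.

```cpp
#include <vector>

// Hypothetical counterpart of test_copy() above, using std::vector.
// Returns B so the caller can inspect it after A has been modified.
std::vector<double> copy_then_modify() {
    std::vector<double> A = {1.0, 2.0, 3.0};
    std::vector<double> B = A;   // copy construction: B gets its own storage

    A[1] = 5.0;                  // 2 -> 5, affects A only

    return B;                    // B still holds its original elements
}
```

On the Rcpp side, the same independence can be requested explicitly with Rcpp::clone(A), which makes a deep copy instead of sharing the underlying R vector.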

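"Is there any reason why I should prefer Rcpp::NumericVector over std::vector<double>?"

There are a few reasons:

As hinted earlier, using Rcpp::NumericVector avoids a deep copy to and from the C++ std::vector<T>.
You gain access to the sugar functions.
You can "mark up" an Rcpp object in C++ (e.g. add attributes via .attr()).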

Answer 2

"If unsure, just time it."

All it takes is adding these few lines to the file you already have:

/*** R
library(microbenchmark)
x <- 1.0* 1:1e7   # make sure it is numeric
microbenchmark(foo(x), bar(x), times=100L)
*/

Then simply calling sourceCpp("...yourfile...") generates the following result (plus a warning about a signed/unsigned comparison):

R> library(microbenchmark)

R> x <- 1.0* 1:1e7   # make sure it is numeric

R> microbenchmark(foo(x), bar(x), times=100L)
Unit: milliseconds
   expr     min      lq    mean  median      uq      max neval cld
 foo(x) 31.6496 31.7396 32.3967 31.7806 31.9186  54.3499   100  a 
 bar(x) 50.9229 51.0602 53.5471 51.1811 51.5200 147.4450   100   b
R>

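The bar() solution requires a copy to create an R object in the R memory pool. foo() does not. That matters for large vectors that you run over many, many times. Here we see a ratio of roughly 1.6 between the median timings (51.2 ms versus 31.8 ms).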

In practice, it may not matter if you simply prefer one coding style over the other.
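
The signed/unsigned comparison warning mentioned above comes from comparing the int loop counter with x.size(), which returns an unsigned type. The following is a minimal sketch of one way to avoid it, written in plain standard C++ without the Rcpp export attribute. The name add_one is hypothetical, and the function is only meant to behave like bar().

```cpp
#include <algorithm>
#include <vector>

// Hypothetical warning-free variant of bar(): adds 1.0 to every element.
std::vector<double> add_one(const std::vector<double>& x) {
    std::vector<double> tmp(x.size());
    // std::transform walks the range through iterators, so no signed
    // index is ever compared against the unsigned x.size().
    std::transform(x.begin(), x.end(), tmp.begin(),
                   [](double v) { return v + 1.0; });
    return tmp;
}
```

Keeping the index loop and declaring the counter as std::size_t would silence the warning as well. Placing // [[Rcpp::export]] above the function should make it callable from R in the same way as bar(), with the same copies on the way in and out.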

Answer 1
21 votes, accepted, 2017-01-11 22:59:00

Answer 2
14 votes, 2017-01-11 22:43:51