Rcpp function not returning desired NumericVector

Question

I am trying to rewrite an R function (Fourier smoothing) in Rcpp for faster computation. My Rcpp function is not returning the desired values.

I have a vector

    x = c(6262, 5862.5, 5463, 5408, 5353, 5687, 5901, 6245, 5864, 5483, 5692, 5708.5, 5054.75, 5072.375, 5090, 5462, 4939, 5248.5, 5558, 5226, 5125, 5006, 4887, 5334.5, 5782, 5501, 5524.5, 5548)

My Rcpp function

cppFunction("
     NumericVector smo(NumericVector x){
        int n = x.size();
        NumericVector realpart1(5);
        NumericVector imagpart1(5);
        NumericVector sm1(n);
        for (int i = 0; i<5; i++){
            double realpart = 0;
            double imagpart = 0;
            for (int j = 0; j<n; j++) {
                realpart = realpart + 0.07142857*x[j]*cos(2 * 3.142857 * (i+1-1) * (j+2)/28);
                imagpart = imagpart + 0.07142857 * x[j] * sin(2 * 3.142857 * (i+1 - 1) * (j+2) /28);
            }
            realpart1[i]=realpart;
            imagpart1[i] = imagpart;
        }

        for (int j = 0; j<n; j++){
         double sm = realpart1[0]/2;

        for (int i=0; i<5; i++){
            sm = sm + realpart1[i]*cos(2 * 3.142857 * (i+1 - 1) * (j+2) / 28) + imagpart1[i]*sin(2 * 3.142857 * (i+1-1) * (j+2) / 28);
        }
         sm1[j] = sm;

         }
        return sm1; 
}
")

Output of the function smo is coming like below

16804.81 16674.97 16518.58 16425.55 16453.36 16594.95 16780.77 16914.47
16922.49 16789.76 16563.30 16324.47 16147.96 16070.53 16083.19 16145.65
16210.29 16241.81 16226.19 16170.64 16099.70 16049.52 16058.45 16152.36
16328.20 16545.64 16736.58 16833.36

If I subtract value 10949.12 from the output of function(smo) I am getting the desired result like below

Desired output

5855.689 5725.846 5569.459 5476.428 5504.237 5645.833 5831.647 5965.351
5973.369 5840.640 5614.181 5375.346 5198.844 5121.412 5134.069 5196.534
5261.174 5292.694 5277.066 5221.517 5150.584 5100.398 5109.330 5203.243
5379.080 5596.524 5787.462 5884.235

The value 10949.12 is the first value of NumericVector realpart1

I am not able to resolve this issue as I am trying Rcpp for the first time. I have checked loops various times, up to the calculation of realpart1 and imagpart1 loop is working fine... There is some problem with the second loop but I am not able to figure out why the value 10949.12 is being added in the output.

I will really appreciate any help in this regard.

equivalent R Code

har = 4
pi = 22/7
realpart1 = c()
imagpart1 = c()
for (p in 1:(har+1)){
    realpart = 0
    imagpart = 0
    for (i in 1:length(x)){
        realpart = realpart + (2 /length(x)) * x[i] * cos(2 * pi * (p - 1) * (i+1) / length(x))
        imagpart = imagpart + (2 / length(x)) * x[i] * sin(2 * pi * (p - 1) * (i+1) / length(x))
    }
    realpart1 = c(realpart1,realpart)
    imagpart1 = c(imagpart1,imagpart)
    #print(realpart)
    #print(imagpart)
}   
sm1 = c()
for (i in 1:length(x)){

    sm = realpart1[1]/2

    for (p in 2:(har+1)){
        sm = sm + realpart1[p]*cos(2 * pi * (p - 1) * (i+1) / length(x))+ imagpart1[p]*sin(2 * pi * (p - 1) * (i+1) / length(x))
    }
    sm1 = c(sm1,sm)
}

Answer 1

There is a difference in the limits of the nested for loop in the second for loop. In R it goes from 2 to 5, while in C++ it goes from 0 to 4. It should go from 1 to 4 in C++ to be comparable with R.

However, you can probably make the R code faster by avoiding dynamically growing vectors inside the loop. In a for loop that is almost never necessary, since you know the size of the resulting vector beforehand and can use eg realpart <- numeric(length = har + 1) and realpart[p] <- ... .

However, in this case one can go even further and formulate the problem in terms of matrices and linear algebra, avoiding the (explicit) loops altogether:

x <- c(6262, 5862.5, 5463, 5408, 5353, 5687, 5901, 6245, 5864, 5483, 5692, 5708.5,
       5054.75, 5072.375, 5090, 5462, 4939, 5248.5, 5558, 5226, 5125, 5006, 4887,
       5334.5, 5782, 5501, 5524.5, 5548)
fourier_smooth <- function(x, har) {
    pi <- 22 / 7 # this should be removed!
    phase <- 2 * pi * outer(seq_len(har + 1) - 1, seq_along(x) + 1) / length(x) 
    real <- 2 / length(x) * cos(phase) %*% x
    imag <- 2 / length(x) * sin(phase) %*% x
    y <- t(cos(phase)) %*% real + t(sin(phase)) %*% imag
    as.numeric(y - real[1]/2)
}
fourier_smooth(x, 4)
#>  [1] 5855.695 5725.852 5569.463 5476.432 5504.240 5645.837 5831.651
#>  [8] 5965.355 5973.373 5840.644 5614.185 5375.350 5198.848 5121.417
#> [15] 5134.073 5196.538 5261.177 5292.697 5277.070 5221.522 5150.588
#> [22] 5100.402 5109.334 5203.247 5379.084 5596.528 5787.468 5884.242

^{Created on 2019-08-13 by the reprex package (v0.3.0)}

Note that I am including the redefinition of pi only to reproduce your desired result. For correct results, the real value of pi should be used.

However, it is even faster to use R's build in FFT:

x <- c(6262, 5862.5, 5463, 5408, 5353, 5687, 5901, 6245, 5864, 5483, 5692, 5708.5,
       5054.75, 5072.375, 5090, 5462, 4939, 5248.5, 5558, 5226, 5125, 5006, 4887,
       5334.5, 5782, 5501, 5524.5, 5548)
fourier_smooth <- function(x, har) {
    phase <- 2 * pi * outer(seq_len(har + 1) - 1, seq_along(x) - 1) / length(x) 
    real <- 2 / length(x) * cos(phase) %*% x
    imag <- 2 / length(x) * sin(phase) %*% x
    y <- t(cos(phase)) %*% real + t(sin(phase)) %*% imag
    as.numeric(y - real[1]/2)
}

fourier_smooth2 <- function(x, har) {
    y <- fft(x, inverse = TRUE) / length(x)
    y[(har+2):(length(x)-har)] <- 0 # filter higher harmonics while keeping the symmetry for real input
    Re(fft(y)) # result is already real
}

bench::mark(fourier_smooth(x, 4), fourier_smooth2(x, 4))[1:5]
#> # A tibble: 2 x 5
#>   expression                 min   median `itr/sec` mem_alloc
#>   <bch:expr>            <bch:tm> <bch:tm>     <dbl> <bch:byt>
#> 1 fourier_smooth(x, 4)   31.66µs  34.97µs    26342.    4.13MB
#> 2 fourier_smooth2(x, 4)   4.82µs   5.49µs   152845.    3.98KB

^{Created on 2019-08-13 by the reprex package (v0.3.0)}

The redefinition of pi was removed to make sure that results are equal.
The filtering is a bit tricky, but I don't know of any function that's specifically tailored for real time-series.

Rcpp function not returning desired NumericVector

Question

1 answers

solution1
3 ACCPTED 2019-08-13 12:03:44

Rcpp function not returning desired NumericVector

Question

1 answers

solution1 3 ACCPTED 2019-08-13 12:03:44

solution1
3 ACCPTED 2019-08-13 12:03:44