Collatz conjecture in R

Question

I am still teaching some R, mainly to myself (and to my students).

Here is an implementation of the Collatz sequence in R:

f <- function(n)
{
    # construct the entire Collatz path starting from n
    if (n==1) return(1)
    if (n %% 2 == 0) return(c(n, f(n/2)))
    return(c(n, f(3*n + 1)))
}

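Calling f(13) gives me 13, 40, 20, 10, 5, 16, 8, 4, 2, 1.

Note, however, that the vector grows dynamically here: every recursive call builds a new, longer vector with c(). Copying like this tends to be a recipe for inefficient code. Is there a more efficient version?

In Python I would use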

def collatz(n):
    assert isinstance(n, int)
    assert n >= 1

    def __colla(n):

        while n > 1:
            yield n

            if n % 2 == 0:
                n = n // 2  # integer division avoids float rounding for large n
            else:
                n = 3 * n + 1

        yield 1

    return list(__colla(n))

I have found a way to write to vectors without specifying their dimension a priori. So a solution could be

collatz <-function(n)
{
  stopifnot(n >= 1)  
  # define a vector without specifying the length
  x = c()

  i = 1
  while (n > 1)
  {
    x[i] = n
    i = i + 1
    n = ifelse(n %% 2, 3*n + 1, n/2)
  }
  x[i] = 1
  # now "cut" the vector
  dim(x) = c(i)
  return(x)
}

Answer 1

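I was curious to see how a C++ implementation through Rcpp would compare to your two base R approaches. Here are my results.

First, let's define a function collatz_Rcpp that returns the hailstone sequence for a given integer n. The (non-recursive) implementation is adapted from Rosetta Code.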

library(Rcpp)
cppFunction("
    std::vector<int> collatz_Rcpp(int i) {
        std::vector<int> v;
        while(true) {
            v.push_back(i);
            if (i == 1) break;
            i = (i % 2) ? (3 * i + 1) : (i / 2);
        }
        return v;
    }
")

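Now we run a microbenchmark comparing both of your base R implementations with the Rcpp implementation. We compute the hailstone sequences for the first 10000 integers.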

# base R implementation
collatz_R <- function(n) {
    # construct the entire Collatz path starting from n
    if (n==1) return(1)
    if (n %% 2 == 0) return(c(n, collatz_R(n/2)))
    return(c(n, collatz_R(3*n + 1)))
}

# "updated" base R implementation
collatz_R_updated <-function(n) {
  stopifnot(n >= 1)
  # define a vector without specifying the length
  x = c()
  i = 1
  while (n > 1) {
    x[i] = n
    i = i + 1
    n = ifelse(n %% 2, 3*n + 1, n/2)
  }
  x[i] = 1
  # now "cut" the vector
  dim(x) = c(i)
  return(x)
}

library(microbenchmark)
n <- 10000
res <- microbenchmark(
    baseR = sapply(1:n, collatz_R),
    baseR_updated = sapply(1:n, collatz_R_updated),
    Rcpp = sapply(1:n, collatz_Rcpp))

res
#         expr        min         lq       mean     median         uq       max
#        baseR   65.68623   73.56471   81.42989   77.46592   83.87024  193.2609
#baseR_updated 3861.99336 3997.45091 4240.30315 4122.88577 4348.97153 5463.7787
#         Rcpp   36.52132   46.06178   51.61129   49.27667   53.10080  168.9824

library(ggplot2)
autoplot(res)

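The (non-recursive) Rcpp implementation seems to be around 30% faster than the original (recursive) base R implementation. The "updated" (non-recursive) base R implementation is significantly slower than the original (recursive) base R approach (the microbenchmark takes around 10 minutes to finish on my MacBook Air, because of baseR_updated).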
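
A likely contributor to the slowdown of baseR_updated is the call to ifelse() on a single number inside the loop: ifelse() is meant for vectors and carries noticeable overhead per call, and growing x one element at a time may add to that. The following is only a minimal sketch of an iterative base R variant that uses a plain if/else and a buffer that doubles when it is full. The name collatz_R_prealloc and the default capacity of 128 are assumptions for illustration, and this version was not part of the benchmark above.

```r
# Hypothetical variant (not benchmarked above): iterative, scalar if/else,
# with a preallocated buffer that is doubled whenever it runs out of space.
collatz_R_prealloc <- function(n, capacity = 128L) {
  stopifnot(n >= 1, capacity >= 1)
  x <- numeric(capacity)              # preallocated buffer
  i <- 1L
  while (n > 1) {
    if (i > length(x)) {
      length(x) <- 2L * length(x)     # double the buffer when it is full
    }
    x[i] <- n
    i <- i + 1L
    n <- if (n %% 2 == 0) n / 2 else 3 * n + 1
  }
  if (i > length(x)) length(x) <- i   # make room for the final 1 if needed
  x[i] <- 1
  x[seq_len(i)]                       # drop the unused tail of the buffer
}

collatz_R_prealloc(13)
```

Adding it to the microbenchmark() call alongside the other three would show whether the overhead really comes from ifelse() and the growing vector.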
