Drawing Markov chains given a transition matrix in R

Question

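Let trans_m be the n × n transition matrix of a first-order Markov chain. In my problem n is large, say 10,000, and trans_m is a sparse matrix built with the Matrix package; stored densely, it would be huge. My goal is to simulate a sequence of the Markov chain, given a vector of initial states s1 and this transition matrix trans_m. Consider the following concrete example.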

    library(Matrix)
    n <- 5000 # there are 5,000 states in this case.
    trans_m <- Matrix(0, nrow = n, ncol = n, sparse = TRUE)
    K <- 5 # the maximal number of states that could be reached.
    for(i in 1:n){
        states_reachable <- sample(1:n, size = K) # randomly pick K states that can be reached with equal probability.
        trans_m[i, states_reachable] <- 1/K
    }
    s1 <- sample(1:n, size = 1000, replace = TRUE) # generate 1000 initial states
    draw_next <- function(s) {
        .s <- sample(1:n, size = 1, prob = trans_m[s, ]) # given the current state s, draw the next state .s
        .s
    }
    sapply(s1, draw_next)

Given the vector of initial states s1 as above, I use sapply(s1, draw_next) to draw the next states. As n grows, sapply becomes slow. Is there a better way?

Answer 1

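Repeated indexing by row can be slow, because Matrix stores sparse matrices in compressed column form by default. It is therefore faster to work with the transpose of the transition matrix and index by column, and to factor the indexing out of the inner function: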

    R>    trans_m_t <- t(trans_m)
    R>
    R>    require(microbenchmark)
    R>    microbenchmark(
    +       apply(trans_m_t[,s1], 2,sample, x=n, size=1, replace=F)
    +     ,
    +       sapply(s1, draw_next)
    +     )
    Unit: milliseconds
                                                                expr        min
     apply(trans_m_t[, s1], 2, sample, x = n, size = 1, replace = F) 111.828814
                                               sapply(s1, draw_next) 499.255402
              lq        mean      median          uq        max neval
     193.1139810 190.4379185 194.6563380 196.4273105 270.418189   100
     503.7398805 512.0849013 506.9467125 516.6082480 586.762573   100

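Since you are already using a sparse matrix, you may be able to get better performance by working directly with the triplets. Using the higher-level matrix operators can trigger recompression.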
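
One way to read the triplet suggestion is sketched below. It is only an illustration under stated assumptions, not a benchmarked solution: the (i, j, x) triplets are pulled out once with summary(), grouped by source state, and each draw then touches only the few reachable states instead of a full row. The helper name draw_next_triplet, the smaller n, and the use of sparseMatrix() to build the example matrix are choices made for this sketch; it also assumes every state has at least one successor.

    library(Matrix)

    set.seed(42)
    n <- 200 # small illustrative size (assumption for this sketch)
    K <- 5

    # Build the sparse transition matrix in one call: each row gets K successors.
    rows <- rep(seq_len(n), each = K)
    cols <- unlist(lapply(seq_len(n), function(i) sample.int(n, K)))
    trans_m <- sparseMatrix(i = rows, j = cols, x = rep(1/K, n * K), dims = c(n, n))

    # Extract the (i, j, x) triplets once, then group successors and
    # probabilities by source state. Fixing the factor levels to 1..n keeps
    # list position s aligned with state s.
    trip <- summary(trans_m)
    src <- factor(trip$i, levels = seq_len(n))
    succ <- split(as.integer(trip$j), src)
    prob <- split(trip$x, src)

    # Hypothetical helper: assumes state s has at least one successor.
    draw_next_triplet <- function(s) {
        cand <- succ[[s]]
        # sample.int() on positions avoids the sample(x, 1) pitfall when x has length 1
        cand[sample.int(length(cand), size = 1, prob = prob[[s]])]
    }

    s1 <- sample.int(n, size = 1000, replace = TRUE) # 1000 initial states
    s2 <- vapply(s1, draw_next_triplet, integer(1))  # one step for every chain

    # Usage: advance all chains for several steps, one column per step.
    n_steps <- 10
    path <- matrix(NA_integer_, nrow = length(s1), ncol = n_steps + 1)
    path[, 1] <- s1
    for (step in seq_len(n_steps)) {
        path[, step + 1] <- vapply(path[, step], draw_next_triplet, integer(1))
    }

Because succ and prob are plain R lists, no sparse-matrix indexing happens inside the loop. Whether this beats the transposed apply() version at n = 10,000 would need to be measured.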
