
Finding stationary distribution of a Markov process given a transition probability matrix

There have been two threads related to this issue on Stack Overflow:

The above is straightforward, but very expensive. If we have a transition matrix of order n, then each iteration computes a matrix-matrix multiplication at a cost of O(n^3).
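To make the cost concrete, here is a minimal pure-Python sketch of that matrix-power approach, using a made-up 2-state chain rather than any matrix from the question: each step is a full matrix-matrix product (O(n^3) work), repeated until all rows of the power agree, at which point every row equals the stationary distribution.

```python
# Sketch of the expensive matrix-power approach on a hypothetical 2-state chain.
# Every iteration does a full O(n^3) matrix-matrix multiplication.

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

P = [[0.9, 0.1],    # made-up row-stochastic matrix, not the question's P
     [0.5, 0.5]]

Pk = P
while max(abs(Pk[0][j] - Pk[1][j]) for j in range(2)) > 1e-12:
    Pk = matmul(Pk, Pk)   # repeated squaring; still a full matrix product each step

print(Pk[0])  # every row of the limit is the stationary distribution
```

For this chain the exact stationary distribution is (5/6, 1/6), which the rows of the matrix power converge to.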

Is there a more efficient way to do this? One thing that occurs to me is to use eigendecomposition. A Markov matrix is known to:

  • be diagonalizable in the complex domain: A = E * D * E^{-1} ;
  • have a real eigenvalue of 1, with all other (complex) eigenvalues of modulus smaller than 1.

The stationary distribution is the eigenvector associated with the eigenvalue 1, i.e., the first eigenvector.

Well, the theory is nice, but I can't get it to work. Taking the matrix P from the first linked question:

P <- structure(c(0, 0.1, 0, 0, 0, 0, 0, 0.1, 0.2, 0, 0, 0, 0, 0, 0.2, 
0.3, 0, 0, 0.5, 0.4, 0.3, 0.5, 0.4, 0, 0, 0, 0, 0, 0.6, 0.4, 
0.5, 0.4, 0.3, 0.2, 0, 0.6), .Dim = c(6L, 6L))

If I do:

Re(eigen(P)$vectors[, 1])
# [1] 0.4082483 0.4082483 0.4082483 0.4082483 0.4082483 0.4082483

What's going on? According to the previous questions, the stationary distribution is:

# [1] 0.002590673 0.025906737 0.116580322 0.310880848 0.272020713 0.272020708

Well, to use eigendecomposition, we need to work with t(P).

The definition of a transition probability matrix differs between probability/statistics and linear algebra. In statistics, all rows of P sum to 1, while in linear algebra, all columns of P sum to 1. So instead of eigen(P), we need eigen(t(P)):

e <- Re(eigen(t(P))$vectors[, 1])
e / sum(e)
# [1] 0.002590673 0.025906737 0.116580322 0.310880848 0.272020713 0.272020708
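This also explains the puzzling constant vector from eigen(P): since every row of P sums to 1, the all-ones vector is a *right* eigenvector of P for the eigenvalue 1, and normalised to unit length its entries are all 1/sqrt(6) ≈ 0.4082483, exactly the output seen earlier. A quick pure-Python check (with the question's 6x6 matrix transcribed row-wise from the column-major R vector):

```python
# Verify that P %*% rep(1, 6) == rep(1, 6) for a row-stochastic matrix,
# so eigen(P)$vectors[, 1] is the (normalised) all-ones vector.
import math

P = [[0.0, 0.0, 0.0, 0.5, 0.0, 0.5],
     [0.1, 0.1, 0.0, 0.4, 0.0, 0.4],
     [0.0, 0.2, 0.2, 0.3, 0.0, 0.3],
     [0.0, 0.0, 0.3, 0.5, 0.0, 0.2],
     [0.0, 0.0, 0.0, 0.4, 0.6, 0.0],
     [0.0, 0.0, 0.0, 0.0, 0.4, 0.6]]

ones = [1.0] * 6
Pv = [sum(row[j] * ones[j] for j in range(6)) for row in P]
print(Pv)                 # each entry is 1 (up to floating-point error)
print(1 / math.sqrt(6))   # the constant entry reported by eigen(P)
```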

As we can see, we've only used the first eigenvector, i.e., the eigenvector of the largest eigenvalue. Therefore, there is no need to compute all eigenvalues/vectors with eigen. The power method can be used to find an eigenvector of the largest eigenvalue. Let's implement this in R:

stydis1 <- function(A) {
  n <- dim(A)[1L]
  ## sanity check: rows of a Markov matrix sum to 1 (allow floating-point error)
  if (any(abs(.rowSums(A, n, n) - 1) > 1e-10))
    stop("'A' is not a Markov matrix")
  ## power method: iterate e <- t(A) %*% e until the norm stabilizes
  e <- runif(n)
  oldnorm <- sqrt(c(crossprod(e)))
  repeat {
    e <- crossprod(A, e)
    newnorm <- sqrt(c(crossprod(e)))
    if (abs(newnorm / oldnorm - 1) < 1e-8) break
    e <- e / newnorm
    oldnorm <- newnorm
  }
  ## rescale `e` so that it sums to 1
  c(e / sum(e))
}

stydis1 (P)
# [1] 0.002590673 0.025906737 0.116580322 0.310880848 0.272020713 0.272020708

And the result is correct.


In fact, we don't have to exploit eigendecomposition at all. We can adjust the method used in your second linked question. There, we took matrix powers, which is expensive as you commented; but why not re-cast it as a matrix-vector multiplication?

stydis2 <- function(A) {
  n <- dim(A)[1L]
  ## sanity check: rows of a Markov matrix sum to 1 (allow floating-point error)
  if (any(abs(.rowSums(A, n, n) - 1) > 1e-10))
    stop("'A' is not a Markov matrix")
  ## direct computation: iterate b <- t(A) %*% b from an initial distribution;
  ## `b` stays a probability vector, so no rescaling is needed
  b <- A[1, ]
  oldnorm <- sqrt(c(crossprod(b)))
  repeat {
    b <- crossprod(A, b)
    newnorm <- sqrt(c(crossprod(b)))
    if (abs(newnorm / oldnorm - 1) < 1e-8) break
    oldnorm <- newnorm
  }
  ## return stationary distribution
  c(b)
}

stydis2 (P)
# [1] 0.002590673 0.025906737 0.116580322 0.310880848 0.272020713 0.272020708

We start from an arbitrary initial distribution, say A[1, ], and iteratively apply the transition matrix until the distribution converges. Again, the result is correct.

Your vector y = Re(eigen(P)$vectors[, 1]) is not a distribution (since it doesn't sum to one) and it solves P'y = y, not x'P = x. The solution from your linked Q&A does approximately solve the latter:

x = c(0.00259067357512953, 0.0259067357512953, 0.116580310880829, 
0.310880829015544, 0.272020725388601, 0.272020725388601)
all(abs(x %*% P - x) < 1e-10) # TRUE

By transposing P, you can use your eigenvalue approach:

x2 = Re(eigen(t(P))$vectors[, 1])
x2 <- x2/sum(x2) 
all(abs(x2 %*% P - x2) < 1e-10) # TRUE

It's finding a different stationary vector in this instance, though.

By the definition of the stationary probability vector, it is a left-eigenvector of the transition probability matrix with unit eigenvalue. We can find objects of this kind by computing the eigendecomposition of the matrix, identifying the unit eigenvalues, and then computing the stationary probability vectors for each of these unit eigenvalues. Here is a function in R to do this.

stationary <- function(P) {
  
  #Get matrix information
  K     <- nrow(P)
  NAMES <- rownames(P)
  
  #Compute the eigendecomposition; the rows of solve(RVECS)
  #are the left eigenvectors of P
  EIGEN <- eigen(P)
  VALS  <- EIGEN$values
  RVECS <- EIGEN$vectors
  LVECS <- solve(RVECS)
  
  #Find the unit eigenvalue(s)
  RES <- zapsmall(Mod(VALS - as.complex(rep(1, K))))
  IND <- which(RES == 0)
  N   <- length(IND)
  
  #Find the stationary vector(s) by normalising each unit left eigenvector
  OUT <- matrix(0, nrow = N, ncol = K)
  rownames(OUT) <- sprintf('Stationary[%s]', 1:N)
  colnames(OUT) <- NAMES
  for (i in 1:N) {
    SSS      <- Re(LVECS[IND[i], ])
    OUT[i, ] <- SSS / sum(SSS)
  }
  
  #Give the output
  OUT
}

(Note: The eigendecomposition computed by eigen is subject to some numerical error, so there is no eigenvalue that is exactly equal to one. For this reason, we zapsmall the modulus of the deviation from one to identify the unit eigenvalue(s). This gives the correct answer so long as there is no true eigenvalue that is less than one but so close to one that it also gets "zapped" to one.)
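The zapsmall trick can be sketched in a few lines of plain Python (illustrative only; the digit count mirrors R's default of getOption("digits") = 7): round the modulus of (eigenvalue − 1) to a fixed number of digits and treat a result of zero as a unit eigenvalue.

```python
# Pure-Python analogue of using zapsmall() to flag unit eigenvalues:
# round |lambda - 1| to `digits` decimal places and compare with zero.
def is_unit_eigenvalue(lam, digits=7):
    return round(abs(lam - 1), digits) == 0

print(is_unit_eigenvalue(1 + 3e-16))  # True: numerical noise is zapped to 1
print(is_unit_eigenvalue(0.97))       # False: genuinely below 1, kept distinct
```

As the note above warns, a true eigenvalue within 5e-8 of one would also be flagged at this digit setting.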

Applying this function to your transition probability matrix correctly identifies the unique stationary probability vector in this case. There is a small amount of numerical error in the computation, but this should be manageable in most cases.

#Compute the stationary probability vector
S <- stationary(P)

#Show this vector and confirm stationarity
S
                     [,1]       [,2]      [,3]      [,4]      [,5]      [,6]
Stationary[1] 0.002590674 0.02590674 0.1165803 0.3108808 0.2720207 0.2720207

S %*% P
                     [,1]       [,2]      [,3]      [,4]      [,5]      [,6]
Stationary[1] 0.002590674 0.02590674 0.1165803 0.3108808 0.2720207 0.2720207

#Show error in computation
c(S %*% P - S)
[1]  4.336809e-17  2.775558e-17  1.110223e-16 -2.775558e-16  1.665335e-16 -5.551115e-17
