Python - hmmlearn - Negative transmat
I'm trying to fit a model with hmmlearn given an a-priori transition matrix and emission matrix. After fitting, the transition matrix contains some negative values.
The transition matrix is taken from another, previously fitted model.
A minimal example of what I mean:
>>> model
GaussianHMM(algorithm='viterbi', covariance_type='diag',covars_prior=0.01,
covars_weight=1, init_params='stmc', means_prior=0, means_weight=0,
n_components=3, n_iter=100, params='stmc', random_state=123,
startprob_prior=1.0, tol=0.5, transmat_prior=1.0, verbose=True)
>>> model.transmat_
array([[ 9.95946216e-01, 2.06359396e-21, 4.05378401e-03],
[ 2.05184679e-21, 9.98355526e-01, 1.64447392e-03],
[ 3.86689326e-03, 1.96383373e-03, 9.94169273e-01]])
>>> new_model = hmm.GaussianHMM(n_components=model.n_components,
...                             random_state=123,
...                             init_params="mcs", transmat_prior=model.transmat_)
>>> new_model.fit(train_features)
GaussianHMM(algorithm='viterbi', covariance_type='diag', covars_prior=0.01,
covars_weight=1, init_params='mcs', means_prior=0, means_weight=0,
n_components=3, n_iter=10, params='stmc', random_state=123,
startprob_prior=1.0, tol=0.01,
transmat_prior=array([[ 9.95946e-01, 2.06359e-21, 4.05378e-03],
[ 2.05185e-21, 9.98356e-01, 1.64447e-03],
[ 3.86689e-03, 1.96383e-03, 9.94169e-01]]),
verbose=False)
>>> new_model.transmat_
array([[ 9.98145253e-01, 1.86155258e-03, -7.08313729e-06],
[ 2.16330448e-03, 9.93941859e-01, 3.89483667e-03],
[ -5.44842863e-06, 3.52862069e-03, 9.96478546e-01]])
>>>
In the code shown, the training data is also the same. If I don't use the transition matrix as a prior but, for example, the emission parameters instead, it works correctly.
I'm using Anaconda 2.5 64-bit. The hmmlearn version is 0.2.0.
Any hints? Thanks
tl;dr: ensure every element of transmat_prior is >= 1.
The EM algorithm for hidden Markov models is derived using state indicator variables z, which hold the state of the Markov chain at each time step t. Conditioned on the previous state z[t - 1], z[t] follows a Categorical distribution whose parameters are given by the corresponding row of the transition probability matrix.
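As a toy sketch of the process described above (the transition matrix here is made up for illustration), each step of the hidden chain is a draw from the Categorical distribution given by the row of the transition matrix selected by the previous state:

```python
import numpy as np

# Hypothetical 2-state transition matrix; each row sums to 1.
transmat = np.array([[0.9, 0.1],
                     [0.2, 0.8]])

rng = np.random.default_rng(123)
z = [0]  # start the chain in state 0
for t in range(1, 100):
    # z[t] | z[t-1] ~ Categorical(transmat[z[t-1]])
    z.append(rng.choice(2, p=transmat[z[-1]]))
```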
hmmlearn implements MAP learning of hidden Markov models, which means that each model parameter has a prior distribution. Specifically, each row of the transition matrix is assumed to follow a symmetric Dirichlet distribution with parameter transmat_prior. The choice of prior is not arbitrary: the Dirichlet distribution is conjugate to the Categorical. This gives rise to a simple update rule in the M-step of the EM algorithm:
transmat[i, j] = (transmat_prior[i, j] - 1.0 + stats["trans"][i, j]) / normalizer
where stats["trans"][i, j] is the expected number of transitions between states i and j.
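The failure mode is easy to reproduce with the update rule alone, using made-up numbers: with a prior below 1 (here 0.9) and an expected transition count smaller than 1 - prior, the numerator goes negative.

```python
import numpy as np

# Hypothetical prior < 1 and E-step expected transition counts.
transmat_prior = np.full((2, 2), 0.9)
trans = np.array([[5.00, 0.05],   # very few expected 0 -> 1 transitions
                  [0.02, 8.00]])  # very few expected 1 -> 0 transitions

# M-step update from the rule above; normalizer makes rows sum to 1.
numer = transmat_prior - 1.0 + trans
transmat = numer / numer.sum(axis=1, keepdims=True)

print(transmat)
# Rows still sum to 1, but entries where trans[i, j] < 1 - prior
# (0.05 and 0.02 above) come out negative.
```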
From the update rule it's clear that transition probabilities can become negative if (a) transmat_prior is < 1 for some i and j, and (b) the expectation stats["trans"] is not big enough to compensate for this.
This is a known issue in MAP estimation of the Categorical distribution, and the general advice is to require that transmat_prior >= 1 for all states.
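One way to follow that advice for the example in the question (a sketch; the matrix below is the fitted transmat_ shown above) is to shift the old transition matrix by 1 before passing it as a prior. This preserves the relative preference for each transition while guaranteeing every pseudo-count is >= 1:

```python
import numpy as np

# Transition matrix of the already-fitted model (from the question).
old_transmat = np.array([[9.95946e-01, 2.06359e-21, 4.05378e-03],
                         [2.05185e-21, 9.98356e-01, 1.64447e-03],
                         [3.86689e-03, 1.96383e-03, 9.94169e-01]])

# Shift so transmat_prior >= 1 elementwise; the MAP update
# (prior - 1 + counts) / normalizer can then never go negative.
transmat_prior = 1.0 + old_transmat
```

The shifted matrix can then be passed as before, e.g. hmm.GaussianHMM(..., transmat_prior=transmat_prior).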