MCMC Changepoint model in R

I want to run an MCMC linear Gaussian multiple-changepoint model to detect changepoints in a time-series vector of continuous values.

To do so, I am thinking of using the MCMCregressChange function, but I have several questions:

(1) How can I obtain the log marginal likelihood for these models?

(2) What is the difference between the MCMCregressChange function and the MCMCresidualBreakAnalysis function?

The R script is shown below. I would be very pleased if you could help me solve this issue.

library(MCMCpack)
set.seed(1234)
n <- 100
x1 <- runif(n, min = 0, max  = 1)
x2 <- runif(n, min = 1, max  = 2)
X <- c(x1,x2)

B0 <- 0.1
sigma.mu  <- sd(X)
sigma.var <- var(X)

model0 <- MCMCregressChange(X ~ 1, m = 0, b0 = mean(X), mcmc = 100, burnin = 100, verbose = 1000,
                            sigma.mu = sigma.mu, sigma.var = sigma.var, marginal.likelihood = "Chib95")
model1 <- MCMCregressChange(X ~ 1, m = 1, b0 = mean(X), mcmc = 100, burnin = 100, verbose = 1000,
                            sigma.mu = sigma.mu, sigma.var = sigma.var, marginal.likelihood = "Chib95")
model2 <- MCMCregressChange(X ~ 1, m = 2, b0 = mean(X), mcmc = 100, burnin = 100, verbose = 1000,
                            sigma.mu = sigma.mu, sigma.var = sigma.var, marginal.likelihood = "Chib95")

print(BayesFactor(model0, model1, model2))

plotState(model0)
plotChangepoint(model0)

plotState(model1)
plotChangepoint(model1)

plotState(model2)
plotChangepoint(model2)


The "Value" subsection of the documentation describes what MCMCregressChange returns, stating that the log marginal likelihood of the model is stored in the attribute logmarglike. Hence, it can be accessed like this:

attr(model1, "logmarglike")

These attribute values are also reported when running this line from the code:

print(BayesFactor(model0, model1, model2))
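The attribute lookup above extends naturally to several models at once. The following is a minimal sketch, not part of MCMCpack: get_logmarglike is a hypothetical helper, and the stand-in objects carry made-up values purely to show the mechanics without running a sampler. With real fits you would pass list(m0 = model0, m1 = model1, m2 = model2) instead.

```r
# Hypothetical helper: collect the "logmarglike" attribute from a named
# list of fitted models. Returns NA for objects that lack the attribute.
get_logmarglike <- function(models) {
  vapply(models, function(m) {
    v <- attr(m, "logmarglike")
    if (is.null(v)) NA_real_ else as.numeric(v)
  }, numeric(1))
}

# Stand-in objects with made-up values (NOT real MCMCpack output).
fake0 <- structure(list(), logmarglike = -120.5)
fake1 <- structure(list(), logmarglike = -98.2)

lml <- get_logmarglike(list(m0 = fake0, m1 = fake1))
print(lml)

# Log Bayes factor of m1 over m0 is the difference of log marginal likelihoods.
print(lml["m1"] - lml["m0"])
```

A positive difference favors the first model in the subtraction; BayesFactor() reports the same kind of pairwise comparison in matrix form.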

As for the difference between the models, MCMCresidualBreakAnalysis is a special case of MCMCregressChange, namely the case where X is univariate. In fact, the code for MCMCregressChange checks whether the number of columns in X is one, and if so reformats the input arguments into a call to MCMCresidualBreakAnalysis. Since the latter has no additional parameters of its own, MCMCregressChange is the more general function and is all one should need to use.

Reinforcing this is a note in the MCMCresidualBreakAnalysis description:

"The code is written mainly for an internal use in testpanelSubjectBreak."

That is, while it is an exported function, it is primarily a convenience function arising from a specific use case.

In addition to MCMCpack, I think some Bayesian models designed specifically for changepoint detection may be useful. In R, three possible packages are bcp, mcp, and Rbeast. bcp and mcp are more versatile in terms of the models fitted and the data types handled. Rbeast is a method specifically for simultaneous Bayesian time series decomposition (similar to stl) and changepoint detection (similar to changepoint); Rbeast also reports a posterior log marginal likelihood that can be used to compare alternative hypotheses on changepoints (more precisely, alternative priors on the number of changepoints).

Below are some quick results for your sample data with bcp and Rbeast.

set.seed(1234)
n  = 100
x1 = runif(n, min = 0, max  = 1)
x2 = runif(n, min = 1, max  = 2)
X  = c(x1,x2)

library(bcp)
fit <- bcp(X)
plot(fit)   # posterior means and changepoint probabilities

[Figure: bcp posterior means and posterior changepoint probabilities]

On average, bcp finds roughly 2 changepoints; the most likely locations are indicated by the peaks in the probability curve. The expected number can also be obtained from sum(fit$posterior.prob, na.rm = TRUE). I believe bcp fitted piecewise constant models here; the mean curve plotted above is an average of the many MCMC-sampled piecewise constant models, which explains the irregularity around the detected changepoint(s).
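To read the changepoint locations off the fit rather than off the plot, one option is to rank the entries of fit$posterior.prob. This is a rough sketch under the assumption that each entry is the posterior probability of a change at that position; the choice of reporting the top three positions is arbitrary and only for illustration.

```r
library(bcp)

set.seed(1234)
X <- c(runif(100, min = 0, max = 1), runif(100, min = 1, max = 2))

fit <- bcp(X)
pp  <- fit$posterior.prob

# Expected number of changepoints: sum of the per-position probabilities.
expected_ncp <- sum(pp, na.rm = TRUE)
print(expected_ncp)

# Positions with the highest posterior probability of a change
# (na.last = NA drops missing entries before ranking).
top <- order(pp, decreasing = TRUE, na.last = NA)[1:3]
top_tbl <- data.frame(location = top, prob = pp[top])
print(top_tbl)
```

Because the result comes from MCMC, the exact probabilities will differ somewhat between runs.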

As a time series decomposition model, Rbeast fits a time series in the form "Y = seasonal/periodic (if present) + trend + error". The seasonal and trend components are modelled as piecewise harmonic curves and piecewise linear (polynomial) curves, respectively. Given that there is no periodic component in the sample data, season='none' is used in the following code to fit a trend-only model. Also, as a changepoint model, Rbeast allows users to specify the range of possible numbers of changepoints; if the minimum and maximum numbers of changepoints allowed are the same, Rbeast fixes the number of changepoints to that constant. For example, tcp.minmax=c(0,0) specifies that the trend has NO changepoint. For each piecewise trend segment, the polynomial order can be restricted to a range between a minimum and a maximum order. Below, we fix the min and max orders to zero so that each segment is fitted as a constant line (i.e., torder.minmax=c(0,0)).

library(Rbeast)

model0 = beast(X, season='none', tcp.minmax = c(0,0), torder.minmax = c(0,0) ) # no changepoint
model1 = beast(X, season='none', tcp.minmax = c(1,1), torder.minmax = c(0,0) ) # 1 changepoint
model2 = beast(X, season='none', tcp.minmax = c(2,2), torder.minmax = c(0,0) ) # 2 changepoints

plot(model0)
plot(model1)
plot(model2)

# These are the posterior log marginal likelihoods; the numbers will vary slightly
# across runs due to the MCMC nature.
model0$marg_lik    #   -460.6778
model1$marg_lik    #   -313.9160 (the most likely)
model2$marg_lik    #   -315.8801
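
The three values quoted in the comments above can be turned into a direct comparison. The sketch below assumes they can be treated as ordinary log marginal likelihoods and that the three models are equally likely a priori; both are assumptions for illustration, and the numbers are simply copied from that single run.

```r
# Log marginal likelihoods copied from the comments above (one MCMC run).
log_ml <- c(model0 = -460.6778, model1 = -313.9160, model2 = -315.8801)

# Log Bayes factor of each model against the best one (0 for the best model).
log_bf_vs_best <- log_ml - max(log_ml)
print(round(log_bf_vs_best, 4))

# Posterior model probabilities under equal prior model probabilities.
# Subtracting the maximum first keeps exp() from underflowing.
post_prob <- exp(log_bf_vs_best) / sum(exp(log_bf_vs_best))
print(round(post_prob, 4))
```

On these numbers model1 leads model2 by a log Bayes factor of about 1.96, while model0 is far behind both.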

Below is the plot of model0 with no changepoint assumed:

[Figure: plot of model0]

Below is the plot of model1 with only 1 changepoint specified. Note that the prior only fixes the number of changepoints to 1; Rbeast still has to find the most likely location. It estimates the changepoint occurrence probability over time (i.e., the green Pr(tcp) curve) and identifies the most likely location (i.e., the vertical dashed line). The order_t curve depicts the average order of the polynomial curves needed to adequately fit the trend; here it is a flat zero line because we fixed the order to zero (i.e., torder.minmax=c(0,0)).

[Figure: plot of model1]

Below is a plot of model2 with 2 changepoints assumed. The estimated changepoint probability is more or less the same as the bcp result. In practice, the most reasonable way to run Rbeast is not to impose a strong prior by fixing the number of changepoints to a known constant, but rather to specify a wide range and let the model figure out the number and locations.

[Figure: plot of model2]
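
The wide-range recommendation above could look roughly like the following. The upper bound of 10 changepoints is an arbitrary choice for illustration, and the field names ncp, cp and cpPr under fit_wide$trend are what I recall from the Rbeast documentation, so treat them as assumptions and confirm them with the str() call.

```r
library(Rbeast)

set.seed(1234)  # seeds the data generation only, not Rbeast's own sampler
X <- c(runif(100, min = 0, max = 1), runif(100, min = 1, max = 2))

# Allow anywhere from 0 to 10 trend changepoints; keep each segment constant.
fit_wide <- beast(X, season = 'none', tcp.minmax = c(0, 10), torder.minmax = c(0, 0))

# List what the trend component actually contains before relying on names.
str(fit_wide$trend, max.level = 1)

print(fit_wide$trend$ncp)    # assumed field: posterior mean number of changepoints
print(fit_wide$trend$cp)     # assumed field: most likely changepoint locations
print(fit_wide$trend$cpPr)   # assumed field: probabilities of those locations

plot(fit_wide)
```

With a range instead of a fixed count, the number of changepoints becomes something the model estimates rather than something imposed by the prior.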
