lsmeans for a piecewise linear mixed-effects model in R
I originally posted this question on Cross Validated (Stack Exchange) and got no answer, so I decided to give it a go here.
I am trying to figure out how to obtain lsmeans for a piecewise linear mixed-effects model (fitted with the nlme package) with random intercepts and slopes. My data represent math scores from a group of male and female students taking a test every week before and after the introduction of a daily routine of meditation. A minimal reproducible example to create the data frame and fit the model is as follows:
library(nlme)
library("lsmeans")
# Subject's ID
ID <- c(1,1,1,
2,2,2,
3,3,3,
4,4,4,
5,5,5,
6,6,6,
7,7,7,
8,8,8)
# Time (weeks) before introduction of routine
time1 <- c(-1,0,0,
-1,0,0,
-1,0,0,
-1,0,0,
-1,0,0,
-1,0,0,
-1,0,0,
-1,0,0)
# Time (weeks) after introduction of routine
time2 <- c(0,0,1,
0,0,1,
0,0,1,
0,0,1,
0,0,1,
0,0,1,
0,0,1,
0,0,1)
# weekly math test scores
mscore <- c(80,92,73,
75,80,85,
60,75,70,
75,80,75,
78,84,75,
78,91,95,
64,72,71,
84,92,70)
# create dataframe
longdata <-data.frame(ID, time1, time2, mscore)
head(longdata)
# fit model
pwmodel <- lme(mscore ~ time1+time2,
random =~ time1+time2|ID,
data=longdata,
method="ML")
# calculate marginal means:
#with both variables
lsmeans(pwmodel, ~(time1+time2),
at=list(time1=c(-1,0,1), time2=c(-1,0,1)) )
# only with one variable
lsmeans(pwmodel, ~time1,
at=list(time1=c(-1,0,1) ))
Here time1 and time2 represent the time before and after the start of the daily meditation routine.
The question is: what is the correct way to obtain lsmeans (or emmeans, if that is better) from this model at times -1, 0 and 1? Should I consider both time variables, or only one of them (either time1 or time2)?
The outputs of both approaches are shown below:
> #with both variables
> lsmeans(pwmodel, ~(time1+time2),
+ at=list(time1=c(-1,0,1), time2=c(-1,0,1)) )
time1 time2 lsmean SE df lower.CL upper.CL
-1 -1 80.8 5.46 7 67.8 93.7
0 -1 89.8 5.47 7 76.8 102.7
1 -1 98.8 5.81 7 85.0 112.5
-1 0 74.2 2.88 7 67.4 81.1
0 0 83.2 2.77 7 76.7 89.8
1 0 92.2 3.28 7 84.5 100.0
-1 1 67.8 3.34 7 59.9 75.6
0 1 76.8 3.12 7 69.4 84.1
1 1 85.8 3.47 7 77.5 94.0
Degrees-of-freedom method: containment
Confidence level used: 0.95
> # only with one variable
> lsmeans(pwmodel, ~time1,
+ at=list(time1=c(-1,0,1) ))
time1 lsmean SE df lower.CL upper.CL
-1 71 2.59 7 64.9 77.1
0 80 2.38 7 74.4 85.6
1 89 2.89 7 82.2 95.8
Results are averaged over the levels of: time2
Degrees-of-freedom method: containment
Confidence level used: 0.95
They clearly return different results, but shouldn't both ways give the same values?
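A quick arithmetic check on the two tables above hints at where the difference comes from. The one-variable output says "Results are averaged over the levels of: time2", and the numbers are consistent with an equal-weight average of the time2 = 0 and time2 = 1 rows of the two-variable table. The sketch below only re-uses the rounded values printed above; that the averaging happens because time2 has just two distinct values in the data (so it is handled like a factor) is an assumption about the package defaults, not something shown in the output.

```r
# Rounded lsmeans copied from the two-variable table above,
# in the order time1 = -1, 0, 1
at_time2_0 <- c(74.2, 83.2, 92.2)  # rows with time2 = 0
at_time2_1 <- c(67.8, 76.8, 85.8)  # rows with time2 = 1

# Equal-weight average over the two observed values of time2
avg_over_time2 <- (at_time2_0 + at_time2_1) / 2
print(avg_over_time2)

# Rounded lsmeans copied from the one-variable table above
one_variable <- c(71, 80, 89)

# Compare the two sets of values up to numerical tolerance
print(all.equal(avg_over_time2, one_variable))
```

In other words, the one-variable call appears to evaluate the model at time2 = 0.5, a point that does not correspond to any real time in the study.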
You have a very awkward parameterization, in that there is really only one concept of "time", not two. I suggest defining a dataset with only the real time variable, and fitting a model that accounts for the break point at 0:
library(nlme)
library(emmeans)
dat <- data.frame(
ID = rep(1:8, each = 3),
time = rep(-1:1, 8),
mscore = c(80,92,73,
75,80,85,
60,75,70,
75,80,75,
78,84,75,
78,91,95,
64,72,71,
84,92,70)
)
mod <- lme(mscore ~ time:(1 + (time > 0)), ~ time|ID, data = dat)
lsmeans(mod, "time", cov.reduce = FALSE)
## time lsmean SE df lower.CL upper.CL
## -1 74.2 3.20 7 66.7 81.8
## 0 83.2 2.68 7 76.9 89.6
## 1 76.8 2.88 7 70.0 83.5
##
## Degrees-of-freedom method: containment
## Confidence level used: 0.95
emmip(mod, ~ time, at = list(time = seq(-1, 1, by = .25)))
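To connect this back to the original two-variable parameterization, here is a minimal base-R sketch. It assumes that time1 and time2 are the negative and positive parts of a single time variable, which matches how they are coded in the question's data. The lsmean values are the rounded ones copied from the question's two-variable table, and the names `real_times`, `tab` and `valid` are made up for this illustration.

```r
# Each real time point maps to exactly one (time1, time2) pair
real_times <- data.frame(time = c(-1, 0, 1))
real_times$time1 <- pmin(real_times$time, 0)  # time before the break point
real_times$time2 <- pmax(real_times$time, 0)  # time after the break point

# Two-variable lsmeans table from the question (rounded values);
# expand.grid varies time1 fastest, matching the printed row order
tab <- expand.grid(time1 = c(-1, 0, 1), time2 = c(-1, 0, 1))
tab$lsmean <- c(80.8, 89.8, 98.8,
                74.2, 83.2, 92.2,
                67.8, 76.8, 85.8)

# Keep only the grid rows that correspond to a real time point
valid <- merge(real_times, tab, by = c("time1", "time2"))
valid <- valid[order(valid$time), c("time", "time1", "time2", "lsmean")]
print(valid)
```

Only three of the nine grid rows describe a time that can actually occur, and their rounded lsmeans agree with the estimates in the single-time table above (74.2, 83.2 and 76.8).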
Created on 2021-11-20 by the reprex package (v2.0.0)