
How are the regression models, confidence intervals and data plotted?

I have the following model. The data file is at: https://drive.google.com/open?id=1_H6YZbdesK7pk5H23mZtp5KhVRKz0Ozl

library(nlme)
library(lme4)
library(car)
library(carData)
library(emmeans)
library(ggplot2)
library(Matrix)
library(multcompView)
datos_weight <- read.csv2("D:/investigacion/publicaciones/articulos-escribiendo/pennisetum/pennisetum-agronomicas/data_weight.csv",header=T, sep = ";", dec = ",")

parte_fija_3 <- formula(weight_DM 
                    ~ Genotypes 
                    + Age 
                    + I(Age^2) 
                    + Genotypes*Age 
                    + Genotypes*I(Age^2))
heterocedasticidad_5 <- varComb(varExp(form = ~fitted(.)))
correlacion_4 <- corCompSymm(form = ~ 1|Block/Genotypes)

modelo_43 <- gls(parte_fija_3, 
             weights = heterocedasticidad_5, 
             correlation = correlacion_4, 
             na.action = na.omit, 
             data = datos_weight)
anova(modelo_43)

#response
Denom. DF: 48 
                   numDF  F-value p-value
(Intercept)            1 597.3828  <.0001
Genotypes              3   2.9416  0.0424
Age                    1 471.6933  <.0001
I(Age^2)               1  22.7748  <.0001
Genotypes:Age          3   5.9425  0.0016
Genotypes:I(Age^2)     3   0.7544  0.5253

Now I want to plot the fitted regression models with confidence intervals and the data, separately for each genotype. I have used ggplot2 and plotted the data, but I have not been able to add the regression models with confidence intervals.

library(ggplot2)
rango_X <- c(30,90) #x axis
rango_Y <- c(0,175) #y axis
ggplot(datos_weight, aes(x = Age, y = weight_DM)) +
  geom_point() + 
  xlab("Age") + 
  ylab("Dry matter") +
  xlim(rango_X) +
  ylim(rango_Y) +
  facet_wrap(~ Genotypes, ncol = 2)

The graph is as follows:

[image]

For the next analysis of the same data, which drops the interaction with quadratic age, Genotypes*I(Age^2), how would you add the regression models with confidence intervals to the graph?

parte_fija_3 <- formula(weight_DM 
                        ~ Genotypes 
                        + Age
                        + I(Age^2)
                        + Genotypes*Age) 
                        #+ Genotypes*I(Age^2))
> anova(modelo_44)
Denom. DF: 51 
              numDF  F-value p-value
(Intercept)       1 609.3684  <.0001
Genotypes         3   3.7264  0.0169
Age               1 479.0973  <.0001
I(Age^2)          1  21.9232  <.0001
Genotypes:Age     3   6.4184  0.0009

The linear slopes from modelo_44 are:

(tendencias_em_lin <- emtrends(modelo_44,
                                "Genotypes",
                                var = "Age"))
Genotypes Age.trend        SE df lower.CL upper.CL
C          1.613619 0.1723451 51 1.267622 1.959616
E          1.665132 0.2024104 51 1.258776 2.071488
K          1.888587 0.2001627 51 1.486744 2.290430
M          1.059897 0.1205392 51 0.817905 1.301890

Are these the quadratic slopes?

(tendencias_em_quad <- emtrends(modelo_44,
                                "Genotypes",
                                var = "I(Age^2)"))
 Genotypes I(Age^2).trend           SE df    lower.CL   upper.CL
 C            0.013379926 0.0014290639 51 0.010510961 0.01624889
 E            0.013807066 0.0016783618 51 0.010437614 0.01717652
 K            0.015659927 0.0016597235 51 0.012327893 0.01899196
 M            0.008788536 0.0009994958 51 0.006781965 0.01079511

Confidence level used: 0.95 

Or is it the estimate from summary, I(Age^2) = -0.01511? I believe that the quadratic slope is the same for all genotypes, because the Genotypes*I(Age^2) interaction is not included in modelo_44:

summary(modelo_44)
Generalized least squares fit by REML
Model: parte_fija_3 
....
Coefficients:
                   Value Std.Error   t-value p-value
(Intercept)    -73.32555 11.236777 -6.525496  0.0000
GenotypesE       7.22267  9.581979  0.753776  0.4544
GenotypesK      -9.83285  9.165962 -1.072757  0.2884
GenotypesM      17.87000  8.085229  2.210203  0.0316
Age              3.43593  0.450041  7.634687  0.0000
I(Age^2)        -0.01511  0.004065 -3.717475  0.0005
GenotypesE:Age   0.05151  0.246724  0.208788  0.8354
GenotypesK:Age   0.27497  0.241923  1.136595  0.2610
GenotypesM:Age  -0.55372  0.195398 -2.833808  0.0066
...

Questions

  1. How do I add the regression models with confidence intervals and the data for each genotype in separate panels like those shown above, with ggplot2 or another option, for both modelo_43 and modelo_44?
  2. Did I correctly calculate the estimate of the quadratic slope with emtrends for modelo_44? If not, what is the correct way?

Thank you very much for the reply.

This question looks vaguely familiar -- have you posted it before? Maybe I'm thinking of some similar question from somebody else.

It seems that you are trying to plot apples, oranges, and bananas all on the same scale. I'm not sure what units the response variable (dry matter) is in; let's say it's in kg. Then the results in tendencias_em_lin are in kg per year, and those in tendencias_em_quad are in kg per year^2. These are three different scales, and it makes no sense to "show the data" on plots of those.

What I think does make sense is something like this:

# The factor is named Genotypes in the model formula
emm <- emmeans(modelo_44, ~ Genotypes*Age,
    at = list(Age = seq(from = 40, to = 80, by = 5)))

This will obtain predictions at the given ages for each genotype. Now you can plot them as follows:

plotobj <- emmip(emm, Genotypes ~ Age, CIs = TRUE)
plotobj

The returned plotobj is a ggplot object that you can add the data to, using techniques as shown in an example at the end of the graphics section in https://cran.r-project.org/web/packages/emmeans/vignettes/basics.html#plots .

Or, you can use dat = as.data.frame(emm) to get a data frame containing the results you need, and plot them however you like. Again, you may use ggplot2 techniques to add the observed data to these plots.
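
As a rough sketch of that second route, the code below builds one panel per genotype with a fitted curve, a confidence band and the raw points. It is not run on the real data: sim is a simulated, hypothetical stand-in for datos_weight, and fit is a simplified gls with the same fixed part as modelo_44 but without the variance and correlation structures. The columns lo and hi are computed here from SE and df, so the sketch does not rely on how the interval columns happen to be named.

```r
library(nlme)
library(emmeans)
library(ggplot2)

# Hypothetical stand-in for datos_weight (simulated, not the real data)
set.seed(1)
sim <- expand.grid(Genotypes = c("C", "E", "K", "M"),
                   Age = c(30, 45, 60, 75, 90),
                   Block = factor(1:3))
sim$weight_DM <- -70 + 3.4 * sim$Age - 0.015 * sim$Age^2 +
  rnorm(nrow(sim), sd = 8)

# Simplified fit: fixed part of modelo_44 only (assumption for illustration)
fit <- gls(weight_DM ~ Genotypes + Age + I(Age^2) + Genotypes:Age,
           data = sim)

# Predictions on a grid of ages for every genotype
emm <- emmeans(fit, ~ Genotypes * Age,
               at = list(Age = seq(30, 90, by = 5)))
dat <- as.data.frame(emm)

# 95% limits from the estimate, its SE and df
tcrit <- qt(0.975, dat$df)
dat$lo <- dat$emmean - tcrit * dat$SE
dat$hi <- dat$emmean + tcrit * dat$SE

# Band + fitted curve from dat, observed points from sim, one panel per genotype
p <- ggplot(dat, aes(x = Age, y = emmean)) +
  geom_ribbon(aes(ymin = lo, ymax = hi), alpha = 0.3) +
  geom_line() +
  geom_point(data = sim, aes(x = Age, y = weight_DM)) +
  facet_wrap(~ Genotypes, ncol = 2) +
  xlab("Age") +
  ylab("Dry matter")
p
```

With the real objects, the same plotting code should apply after replacing fit by modelo_43 or modelo_44 and sim by datos_weight, and choosing the age grid to match the observed range.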

Either way, the linear trends will be visible as increases or decreases in the plotted EMMs, and the quadratic trends will be visible as curvature in these paths.
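
On the question about the quadratic slope, a small arithmetic sketch may help. It uses only the coefficients printed by summary(modelo_44) in the question and the ordinary derivative of a quadratic, slope = b1 + 2*b2*Age. The names slope_at and age_ref are hypothetical helpers, and the interpretation afterwards is my reading of the numbers rather than something taken from the emmeans documentation.

```r
# Coefficients copied from the summary(modelo_44) output in the question
b_age  <- 3.43593    # Age, for the reference genotype C
b_age2 <- -0.01511   # I(Age^2), shared by all genotypes in modelo_44
b_int  <- c(C = 0, E = 0.05151, K = 0.27497, M = -0.55372)  # Genotypes:Age

# Slope of the fitted curve with respect to Age, at a given age:
# d(weight_DM)/d(Age) = b1 + 2 * b2 * Age
slope_at <- function(age) b_age + b_int + 2 * b_age2 * age

# Slopes for the four genotypes at Age = 60
slope_at(60)

# Age at which the slope for genotype C equals the reported Age.trend (1.613619)
age_ref <- (b_age - 1.613619) / (-2 * b_age2)
age_ref

# Slopes at that age, to compare with the Age.trend column
slope_at(age_ref)
```

The value of age_ref comes out at about 60.3, and the slopes at that age agree with the Age.trend column for all four genotypes. That suggests the emtrends values are slopes of the curve at a single reference age, so they change with the age chosen. The values obtained with var = "I(Age^2)" equal those same slopes divided by 2 * 60.3, which makes them a rescaling of the linear trend and not the quadratic coefficient. In modelo_44 the quadratic coefficient is the single estimate -0.01511, common to all genotypes.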

