Adding regression line equation and R2 value

I am trying to add the regression line equation and the R-squared value to a plot of a dataset whose y-axis values are on a logarithmic scale, as in this Excel example:

[Image: what I want to obtain in R, made in Excel]

The data frame contains the following data, with 3 variables and 28 obs.:

          Method        void.ratio    permeability_m.s
1      Constant load      1.360         1.82e-05
2      Constant load      1.360         1.79e-05
3      Constant load      1.190         7.74e-06
4      Constant load      1.190         5.15e-06
5      Variable load      1.040         1.57e-06
6      Variable load      1.040         1.71e-06
7      Variable load      1.040         1.57e-06
8      Variable load      1.040         1.71e-06
9      Triaxial test      0.780         3.00e-07
10     Triaxial test      0.780         2.70e-07
11 Oedometric test 1      0.690         1.33e-07
12 Oedometric test 1      0.685         5.84e-08
13 Oedometric test 2      0.697         3.35e-07
14 Oedometric test 2      0.629         2.85e-07
15 Oedometric test 2      0.554         7.75e-08
16 Oedometric test 2      0.526         3.27e-09
17 Oedometric test 2      0.528         4.71e-09
18 Oedometric test 2      0.530         4.72e-09
19 Oedometric test 2      0.534         6.70e-09
20 Oedometric test 3      0.705         1.34e-07
21 Oedometric test 3      0.648         1.23e-07
22 Oedometric test 3      0.574         8.29e-08
23 Oedometric test 3      0.530         8.77e-08

After running the following code I obtain only the regression line; I am not able to obtain the regression equation and the R-squared value.

R code:

plot_lab_permeability2<- ggplot(Lab_permeability2,aes(void.ratio, permeability_m.s))+
geom_point(size=3,aes(shape = Method, colour = Method))+
geom_smooth(method="lm",formula= (y ~ x), se=FALSE, linetype = 8,color="grey") +
scale_shape_manual("",breaks = c("Constant load","Variable load","Triaxial test","Oedometric test 1","Oedometric test 2","Oedometric test 3"),
                    values=c("Constant load"=15,"Variable load"=17,"Triaxial test"=18,"Oedometric test 1"=16,"Oedometric test 2"=16,"Oedometric test 3"=16))+
scale_colour_manual("",breaks = c("Constant load","Variable load","Triaxial test","Oedometric test 1","Oedometric test 2","Oedometric test 3"),
                    values = c("Constant load"="darkblue","Variable load"="blue","Triaxial test"="darkgreen","Oedometric test 1"="darkred","Oedometric test 2"="red","Oedometric test 3"="orange"))+
scale_y_continuous(limits = c((1e-9),(1e-4)), trans="log10") +
labs(x=expression ("Void ratio (-)"),y = expression ("Saturated hydraulic conductivity (m/s)"),title="") +
theme_bw()

This is the generated plot:

[Image: plot generated with R]

I have been reading similar questions and trying different approaches, but after hours of trying I have not been able to find a solution.

Any help will be highly appreciated.

You can steal and slightly modify a function used by the asker of an earlier SO question:

library(ggplot2)

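lm_eqn <- function(df, model_fit){
    # Adapted from a past Stack Overflow question.
    # Builds a plotmath label with the fitted intercept, slope and R^2.
    # df is unused; it is kept so the existing calls still work.
    # unname() keeps the coefficient names out of the deparsed expression.
    eq <- substitute(italic(y) == a + b %.% italic(x)*","~~italic(r)^2~"="~r2,
                     list(a = format(unname(coef(model_fit)[1]), digits = 2),
                          b = format(unname(coef(model_fit)[2]), digits = 2),
                          r2 = format(summary(model_fit)$r.squared, digits = 3)))
    as.character(as.expression(eq))
}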

# I think you intend to log transform here?
model_fit <- lm(log10(permeability_m.s) ~ void.ratio, Lab_permeability2)

plot_lab_permeability2 <- ggplot(Lab_permeability2, aes(void.ratio, permeability_m.s)) +
    geom_point(size=3,aes(shape = Method, colour = Method))+
    geom_smooth(method="lm",formula= (y ~ x), se=FALSE, linetype = 8,color="grey") +
    scale_shape_manual("",breaks = c("Constant load","Variable load","Triaxial test","Oedometric test 1","Oedometric test 2","Oedometric test 3"),
                       values=c("Constant load"=15,"Variable load"=17,"Triaxial test"=18,"Oedometric test 1"=16,"Oedometric test 2"=16,"Oedometric test 3"=16))+
    scale_colour_manual("",breaks = c("Constant load","Variable load","Triaxial test","Oedometric test 1","Oedometric test 2","Oedometric test 3"),
                        values = c("Constant load"="darkblue","Variable load"="blue","Triaxial test"="darkgreen","Oedometric test 1"="darkred","Oedometric test 2"="red","Oedometric test 3"="orange"))+
    scale_y_continuous(limits = c((1e-9),(1e-4)), trans="log10") +
    labs(x=expression ("Void ratio (-)"),y = expression ("Saturated hydraulic conductivity (m/s)"),title="") +
    geom_text(aes(x = 0.55, y = 0.5e-4, label = lm_eqn(Lab_permeability2, model_fit)),
              size=5, hjust=0, parse = TRUE, check_overlap = TRUE) +
    theme_bw()

plot_lab_permeability2

Result:

[Image: resulting plot]
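The code above assumes that a data frame named Lab_permeability2 already exists. As a minimal sketch, the 23 rows listed in the question can be typed in by hand so the fit can be reproduced without the plot. The scale transformation in ggplot2 is applied before the smoothing statistic, so the grey geom_smooth line on the log10 axis corresponds to a fit of log10(permeability_m.s) on void.ratio, which is what lm() is asked for here. The printed label format is only illustrative.

```r
# The 23 observations shown in the question (the asker mentions 28 in total,
# so any numbers obtained here may differ from a fit on the full data).
Lab_permeability2 <- data.frame(
  Method = c(rep("Constant load", 4), rep("Variable load", 4),
             rep("Triaxial test", 2), rep("Oedometric test 1", 2),
             rep("Oedometric test 2", 7), rep("Oedometric test 3", 4)),
  void.ratio = c(1.360, 1.360, 1.190, 1.190, 1.040, 1.040, 1.040, 1.040,
                 0.780, 0.780, 0.690, 0.685, 0.697, 0.629, 0.554, 0.526,
                 0.528, 0.530, 0.534, 0.705, 0.648, 0.574, 0.530),
  permeability_m.s = c(1.82e-05, 1.79e-05, 7.74e-06, 5.15e-06, 1.57e-06,
                       1.71e-06, 1.57e-06, 1.71e-06, 3.00e-07, 2.70e-07,
                       1.33e-07, 5.84e-08, 3.35e-07, 2.85e-07, 7.75e-08,
                       3.27e-09, 4.71e-09, 4.72e-09, 6.70e-09, 1.34e-07,
                       1.23e-07, 8.29e-08, 8.77e-08)
)

# Same model as in the answer: a straight line in log10(permeability).
model_fit <- lm(log10(permeability_m.s) ~ void.ratio, data = Lab_permeability2)

# Pull out the pieces that end up in the plot label.
intercept <- unname(coef(model_fit)[1])
slope     <- unname(coef(model_fit)[2])
r2        <- summary(model_fit)$r.squared

cat(sprintf("log10(y) = %.2f + %.2f * x,  R^2 = %.3f\n", intercept, slope, r2))
```

Note that the y in the plotted label is log10 of the permeability, not the permeability itself, because that is the response the model was fitted to.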

Raul, @duckmayr has given you a nice solution to your original ask. The code he shared does all the labelling you requested. But now you're asking a different question. You originally asked for the y axis to be in "logarithmic scale". He gave you a solution that used log10; it is trivial to change the answer to ln(y) if you prefer. For example:

Lab_permeability2$lnperm <- log(Lab_permeability2$permeability_m.s)

Then you can fit the regression equation:

model_fit <- lm(lnperm ~ void.ratio, Lab_permeability2)

Your coefficients will of course not be the same:

Call:
lm(formula = lnperm ~ void.ratio, data = Lab_permeability2)

Coefficients:
(Intercept)   void.ratio  
    -21.993        8.406  
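As a short worked step on why the coefficients change: switching between ln and log10 only rescales the response by a constant, so both the intercept and the slope are divided by ln 10 (about 2.3026), while R² stays the same. Assuming the log10 model is fitted to the same rows as the ln model above, the coefficients printed above translate as follows:

```latex
\begin{align*}
\ln y &= a + b\,x
  \quad\Longleftrightarrow\quad
  \log_{10} y = \frac{a}{\ln 10} + \frac{b}{\ln 10}\,x \\[4pt]
a = -21.993,\; b = 8.406
  \quad&\Longrightarrow\quad
  \log_{10} y \approx -9.551 + 3.651\,x
\end{align*}
```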

You should also change the axis to show ln:

ggplot(Lab_permeability2, aes(void.ratio, lnperm)) +
  geom_point(size=3,aes(shape = Method, colour = Method))+
  geom_smooth(method="lm",formula= (y ~ x), se=FALSE, linetype = 8,color="grey") +
  scale_shape_manual("",breaks = c("Constant load","Variable load","Triaxial test","Oedometric test 1","Oedometric test 2","Oedometric test 3"),
                 values=c("Constant load"=15,"Variable load"=17,"Triaxial test"=18,"Oedometric test 1"=16,"Oedometric test 2"=16,"Oedometric test 3"=16)) +
  scale_colour_manual("",breaks = c("Constant load","Variable load","Triaxial test","Oedometric test 1","Oedometric test 2","Oedometric test 3"),
                values = c("Constant load"="darkblue","Variable load"="blue","Triaxial test"="darkgreen","Oedometric test 1"="darkred","Oedometric test 2"="red","Oedometric test 3"="orange")) +
#  scale_y_continuous(limits = c((1e-9),(1e-4)), trans="log") +
  labs(x=expression ("Void ratio (-)"),y = expression ("Natural Log Saturated hydraulic conductivity (m/s)"),title="") +
  geom_text(aes(x = 0.55, y = -12, label = lm_eqn(Lab_permeability2, model_fit)),
        size=5, hjust=0, parse = TRUE, check_overlap = TRUE) +
  theme_bw()

This produces:

[Image: sample R plot]

So for a void.ratio value of 1.0 we would expect 8.406 - 21.993 = -13.587, which appears to be what the graph shows. To convert back to the original scale:

exp(8.406 - 21.993)
[1] 1.256727e-06

I'm not clear on why you think it should be 1.34e-6, although for starters you only gave us the first 23 observations of what you said were 28.
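The back-transformation above can be wrapped in a small helper to check other void ratios. This is only a sketch: the coefficients are hard-coded from the printed lm() output for the natural-log model, and predict_permeability is a hypothetical name, not part of any package.

```r
# Coefficients copied from the printed lm() output above (natural-log model).
intercept <- -21.993
slope     <- 8.406

# Hypothetical helper: predicted permeability (m/s) on the original scale.
# The model is linear in ln(permeability), so exp() undoes the transform.
predict_permeability <- function(void_ratio) {
  exp(intercept + slope * void_ratio)
}

predict_permeability(1.0)               # same calculation as exp(8.406 - 21.993)
predict_permeability(c(0.55, 1.0, 1.36))
```

With a fitted model object at hand, exp(predict(model_fit, newdata = data.frame(void.ratio = 1.0))) does the same calculation without the rounding of the printed coefficients.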
