
How to create a confidence interval plot in R

(b)

DF1 = read.csv("Nerlove.3.csv",header=TRUE)

head(DF1, n=5)

split = round(nrow(DF1) * 0.60)


train = (DF1[1:split, ])

test = (DF1[(split + 1):nrow(DF1), ])

model = lm(output ~ ., train)

summary(model)

plot(train$cost, train$output, ylab = "Output", xlab = "Cost",main = "....")


abline(model, col=2)

(c)

plot(test$cost, test$output, ylab = "Output", xlab = "Cost",main = "....")

model1 = lm(output ~ ., test)

abline(model, col=2)

prediction = predict(model, test)

plot(prediction, main = "....")

abline(model1, col=2)

summary(model1)

(d)

library(stats)

X_0 = data.frame(cost = test$cost)

FI_mean = predict(model, newdata = X_0, interval="confidence", level = 0.95)

FI_ind =  predict(model,newdata = X_0, interval = "prediction")

plot(test$cost, test$output, ylab = "Output", xlab = "Cost",main = "....")

abline(model, col=2)

min = test$cost

max = test$cost

newx = seq(min,max)

matlines(newx, FI_mean[,2:3], col = "blue", lty=2)

I need to plot the confidence interval I computed around the regression line, but I'm getting an error. Can anybody please help me fix this? Thanks. This is the link for my data. I have edited it and am only using the cost and output columns in my data frame.

You are doing it wrong because you are using all the variables when fitting the linear model with the command model = lm(output ~ ., train). But when plotting, you are plotting cost vs. output (as in b and c), and in d you are trying to predict using only one variable, i.e. cost. A regression plot should be made between observed output and predicted output. For that, you can use the following code:

library(lattice)
library(mosaic)
#> Loading required package: dplyr
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union
#> Loading required package: ggformula
#> Loading required package: ggplot2
#> Loading required package: ggstance
#> 
#> Attaching package: 'ggstance'
#> The following objects are masked from 'package:ggplot2':
#> 
#>     geom_errorbarh, GeomErrorbarh
#> 
#> New to ggformula?  Try the tutorials: 
#>  learnr::run_tutorial("introduction", package = "ggformula")
#>  learnr::run_tutorial("refining", package = "ggformula")
#> Loading required package: mosaicData
#> Loading required package: Matrix
#> Registered S3 method overwritten by 'mosaic':
#>   method                           from   
#>   fortify.SpatialPolygonsDataFrame ggplot2
#> 
#> The 'mosaic' package masks several functions from core packages in order to add 
#> additional features.  The original behavior of these functions should not be affected by this.
#> 
#> Note: If you use the Matrix package, be sure to load it BEFORE loading mosaic.
#> 
#> Attaching package: 'mosaic'
#> The following object is masked from 'package:Matrix':
#> 
#>     mean
#> The following object is masked from 'package:ggplot2':
#> 
#>     stat
#> The following objects are masked from 'package:dplyr':
#> 
#>     count, do, tally
#> The following objects are masked from 'package:stats':
#> 
#>     binom.test, cor, cor.test, cov, fivenum, IQR, median, prop.test,
#>     quantile, sd, t.test, var
#> The following objects are masked from 'package:base':
#> 
#>     max, mean, min, prod, range, sample, sum
DF1 = read.csv("Nerlove.csv",header=TRUE)

head(DF1, n=5)
#>    cost output   pl     sl  pk     sk   pf     sf
#> 1 0.082      2 2.09 0.3164 183 0.4521 17.9 0.2315
#> 2 0.661      3 2.05 0.2073 174 0.6676 35.1 0.1251
#> 3 0.990      4 2.05 0.2349 171 0.5799 35.1 0.1852
#> 4 0.315      4 1.83 0.1152 166 0.7857 32.2 0.0990
#> 5 0.197      5 2.12 0.2300 233 0.3841 28.6 0.3859

split = round(nrow(DF1) * 0.60)

train = (DF1[1:split, ])

test = (DF1[(split + 1):nrow(DF1), ])

model = lm(output ~ ., train)

summary(model)
#> 
#> Call:
#> lm(formula = output ~ ., data = train)
#> 
#> Residuals:
#>     Min      1Q  Median      3Q     Max 
#> -409.65 -112.20   -4.61   94.20  430.76 
#> 
#> Coefficients:
#>              Estimate Std. Error t value Pr(>|t|)    
#> (Intercept)  1182.296   1861.876   0.635    0.527    
#> cost          146.231      6.127  23.866  < 2e-16 ***
#> pl             14.897     80.142   0.186    0.853    
#> sl          -2471.312   1930.034  -1.280    0.204    
#> pk              1.477      1.011   1.460    0.148    
#> sk           -869.468   1874.585  -0.464    0.644    
#> pf            -13.701      2.365  -5.794 1.08e-07 ***
#> sf           -947.958   1861.490  -0.509    0.612    
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> 
#> Residual standard error: 169.3 on 87 degrees of freedom
#> Multiple R-squared:  0.9111, Adjusted R-squared:  0.904 
#> F-statistic: 127.4 on 7 and 87 DF,  p-value: < 2.2e-16

#Calibration plotting
pred_cal <- predict(model, newdata=train)
df_cal <- data.frame(Observed=train$output, Predicted=pred_cal)

xyplot(Predicted ~ Observed, data = df_cal, pch = 19,  panel=panel.lmbands,
       band.lty = c(conf =2, pred = 1))


#Validation plotting
pred_val <- predict(model, newdata=test)
df_val <- data.frame(Observed=test$output, Predicted=pred_val)

xyplot(Predicted ~ Observed, data = df_val, pch = 19,  panel=panel.lmbands,
       band.lty = c(conf =2, pred = 1))

Created on 2020-01-07 by the reprex package (v0.3.0)
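
If the goal is still the base-R plot from part d (cost on the x-axis with a band around the fitted line), the error there most likely comes from `min = test$cost` and `max = test$cost`: these store whole vectors, so `seq(min, max)` does not receive scalar end points, and `FI_mean` was computed on the unsorted test costs rather than on `newx`. The following is only a sketch of one way to repair it, not the answerer's method. It assumes a model with `cost` as the single predictor, and it uses simulated stand-in data instead of the Nerlove file, so the numbers and the plot title are illustrative.

```r
# Stand-in data: simulated values, NOT the Nerlove data set from the question.
set.seed(1)
DF1 <- data.frame(cost = runif(100, 0, 50))
DF1$output <- 150 * DF1$cost + rnorm(100, sd = 300)

split <- round(nrow(DF1) * 0.60)
train <- DF1[1:split, ]
test  <- DF1[(split + 1):nrow(DF1), ]

# Name the predictor explicitly so predict() only needs a `cost` column.
model <- lm(output ~ cost, data = train)

# Sorted, evenly spaced grid over the test costs. min() and max() are called
# as functions here, so seq() gets scalar end points.
newx <- seq(min(test$cost), max(test$cost), length.out = 100)
X_0  <- data.frame(cost = newx)

# Intervals evaluated on the same grid that is used for drawing.
FI_mean <- predict(model, newdata = X_0, interval = "confidence", level = 0.95)
FI_ind  <- predict(model, newdata = X_0, interval = "prediction", level = 0.95)

plot(test$cost, test$output, xlab = "Cost", ylab = "Output",
     main = "Test data with 95% bands")
abline(model, col = 2)
# Dashed blue: confidence band for the mean response.
matlines(newx, FI_mean[, c("lwr", "upr")], col = "blue", lty = 2)
# Dotted green: wider prediction band for individual observations.
matlines(newx, FI_ind[, c("lwr", "upr")], col = "darkgreen", lty = 3)
```

Evaluating `predict()` on `newx` keeps the x values and the interval matrix the same length and in increasing order, which `matlines()` needs to draw smooth bands. With a model fitted on all columns (`output ~ .`), `newdata` would have to supply every predictor, which is the mismatch the answer above points out.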
