How to perform a nonlinear regression for my data
I have a set of temperature values and a discomfort index value for each temperature. When I plot temperature (x-axis) against the calculated discomfort index (y-axis), I get an inverted U-shaped curve. I want to fit a nonlinear regression to it and convert the result into a PMML model. My aim is to get the predicted discomfort value for a given temperature.
The dataset is below:
Temp <- c(0, 5, 10, 6, 9, 13, 15, 16, 20, 21, 24, 26, 29, 30, 32, 34, 36, 38, 40, 43, 44, 45, 50, 60)
Disc <- c(0.00, 0.10, 0.25, 0.15, 0.24, 0.26, 0.30, 0.31, 0.40, 0.41, 0.49, 0.50, 0.56, 0.80, 0.90, 1.00, 1.00, 1.00, 0.80, 0.50, 0.40, 0.20, 0.15, 0.00)
How can I do a nonlinear regression (possibly with nls?) for this dataset?
I took a look at this, and I think it is not as simple as using nls, as most of us first thought.
nls fits a parametric model, but from your data (the scatter plot) it is hard to propose a reasonable model assumption. I would suggest using non-parametric smoothing instead.
There are many scatter plot smoothing methods, such as kernel smoothing (ksmooth), smoothing splines (smooth.spline) and LOESS (loess). I prefer smooth.spline, and here is what we can do with it:
fit <- smooth.spline(Temp, Disc)
Please read ?smooth.spline for what it takes and what it returns. We can check the fitted spline curve with
plot(Temp, Disc)
lines(fit, col = 2)
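For comparison, LOESS (one of the alternatives listed above) can be sketched in much the same way. This is only an illustration, not part of the original answer: it uses the default span of loess, and the names lo, grid and lo_hat are introduced here for the example.

```r
Temp <- c(0, 5, 10, 6, 9, 13, 15, 16, 20, 21, 24, 26, 29, 30, 32, 34, 36, 38, 40, 43, 44, 45, 50, 60)
Disc <- c(0.00, 0.10, 0.25, 0.15, 0.24, 0.26, 0.30, 0.31, 0.40, 0.41, 0.49, 0.50, 0.56, 0.80, 0.90, 1.00, 1.00, 1.00, 0.80, 0.50, 0.40, 0.20, 0.15, 0.00)

# Local regression with the default span (an assumption; tune it for your data)
lo <- loess(Disc ~ Temp)

# Evaluate the smoother on an evenly spaced grid inside the observed range
grid <- seq(min(Temp), max(Temp), length.out = 100)
lo_hat <- predict(lo, newdata = data.frame(Temp = grid))

plot(Temp, Disc)
lines(grid, lo_hat, col = 4)
```

The span argument plays a role similar to the smoothing parameter of smooth.spline: smaller values follow the data more closely, larger values give a smoother curve.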
Should you want to make predictions elsewhere, use the predict function (predict.smooth.spline). For example, to predict at Temp = 20 and Temp = 44, we can use
predict(fit, c(20,44))$y
# [1] 0.3940963 0.3752191
Prediction outside range(Temp) is not recommended, because extrapolation can behave badly there.
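One way to guard against accidental extrapolation is to wrap the prediction in a small helper that returns NA outside the observed range. This is a minimal sketch: predict_in_range is a hypothetical helper written for this example, not a function from any package.

```r
Temp <- c(0, 5, 10, 6, 9, 13, 15, 16, 20, 21, 24, 26, 29, 30, 32, 34, 36, 38, 40, 43, 44, 45, 50, 60)
Disc <- c(0.00, 0.10, 0.25, 0.15, 0.24, 0.26, 0.30, 0.31, 0.40, 0.41, 0.49, 0.50, 0.56, 0.80, 0.90, 1.00, 1.00, 1.00, 0.80, 0.50, 0.40, 0.20, 0.15, 0.00)
fit <- smooth.spline(Temp, Disc)

# Hypothetical helper: predict only where lower <= x <= upper, NA elsewhere
predict_in_range <- function(fit, x, lower, upper) {
  out <- rep(NA_real_, length(x))
  inside <- x >= lower & x <= upper
  if (any(inside)) {
    out[inside] <- predict(fit, x[inside])$y
  }
  out
}

# -5 and 70 lie outside the observed temperatures, so they come back as NA
predict_in_range(fit, c(-5, 20, 44, 70), min(Temp), max(Temp))
```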
Before resorting to a non-parametric method, I also tried nonlinear regression with regression splines and an orthogonal polynomial basis, but they did not give satisfying results. The major reason is that there is no penalty on the smoothness. As an example, here are some attempts with poly:
try1 <- lm(Disc ~ poly(Temp, degree = 3))
try2 <- lm(Disc ~ poly(Temp, degree = 4))
try3 <- lm(Disc ~ poly(Temp, degree = 5))
plot(Temp, Disc, ylim = c(-0.3,1.0))
x <- seq(min(Temp), max(Temp), length.out = 50)
newdat <- list(Temp = x)
lines(x, predict(try1, newdat), col = 2)
lines(x, predict(try2, newdat), col = 3)
lines(x, predict(try3, newdat), col = 4)
We can see that the fitted curves look artificial.
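To illustrate what a smoothness penalty does, the sketch below fits smooth.spline twice with different equivalent degrees of freedom through its df argument. The values 4 and 12 are arbitrary choices for illustration, and smooth_fit, rough_fit and rss are names introduced here.

```r
Temp <- c(0, 5, 10, 6, 9, 13, 15, 16, 20, 21, 24, 26, 29, 30, 32, 34, 36, 38, 40, 43, 44, 45, 50, 60)
Disc <- c(0.00, 0.10, 0.25, 0.15, 0.24, 0.26, 0.30, 0.31, 0.40, 0.41, 0.49, 0.50, 0.56, 0.80, 0.90, 1.00, 1.00, 1.00, 0.80, 0.50, 0.40, 0.20, 0.15, 0.00)

# Heavier penalty (few effective degrees of freedom): a smooth curve
smooth_fit <- smooth.spline(Temp, Disc, df = 4)
# Lighter penalty (many effective degrees of freedom): a wigglier curve
rough_fit <- smooth.spline(Temp, Disc, df = 12)

# Residual sum of squares of a fitted spline at the observed temperatures
rss <- function(fit) sum((Disc - predict(fit, Temp)$y)^2)

plot(Temp, Disc)
lines(smooth_fit, col = 2)
lines(rough_fit, col = 4)
c(smooth = rss(smooth_fit), rough = rss(rough_fit))
```

The lighter penalty always tracks the observations more closely; when neither df nor spar is given, smooth.spline chooses the amount of smoothing by generalized cross-validation.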
We can fit polynomials as follows, but the fit will increasingly overfit the data as the degree grows:
# Temp is not sorted, so draw each curve in order of increasing temperature
ord <- order(Temp)
m <- nls(Disc ~ a + b*Temp + c*Temp^2 + d*Temp^3 + e*Temp^4, start = list(a=0, b=1, c=1, d=1, e=1))
plot(Temp, Disc, pch = 19)
lines(Temp[ord], predict(m)[ord], lty = 2, col = "red", lwd = 3)
m <- nls(Disc ~ a + b*Temp + c*Temp^2 + d*Temp^3 + e*Temp^4 + f*Temp^5, start = list(a=0, b=1, c=1, d=1, e=1, f=1))
lines(Temp[ord], predict(m)[ord], lty = 2, col = "blue", lwd = 3)
m <- nls(Disc ~ a + b*Temp + c*Temp^2 + d*Temp^3 + e*Temp^4 + f*Temp^5 + g*Temp^6, start = list(a=0, b=1, c=1, d=1, e=1, f=1, g=1))
lines(Temp[ord], predict(m)[ord], lty = 2, col = "green", lwd = 3)
# Degree 15, fitted with an orthogonal polynomial basis via lm
m.poly <- lm(Disc ~ poly(Temp, degree = 15))
lines(Temp[ord], predict(m.poly)[ord], lty = 2, col = "yellow", lwd = 3)
# Legend labels and colours follow the order in which the curves were drawn
legend(x = "topleft", legend = c("Deg 4", "Deg 5", "Deg 6", "Deg 15"),
       col = c("red", "blue", "green", "yellow"),
       lty = 2)