
Sine curve fit using lm and nls in R

I am a beginner in curve fitting, and several posts on Stack Overflow really helped me.

I tried to fit a sine curve to my data using lm and nls, but both methods give a strange fit, as shown below. Could anyone point out where I went wrong? I suspect it has something to do with time, but I could not get it right. My data can be accessed from here. [plot]

data <- read.table(file="900days.txt", header=TRUE, sep="")
time<-data$time
temperature<-data$temperature

#lm fitting
xc<-cos(2*pi*time/366)
xs<-sin(2*pi*time/366)
fit.lm<-lm(temperature~xc+xs)
summary(fit.lm)
plot(temperature~time, data=data, xlim=c(1, 900))
par(new=TRUE)
plot(fit.lm$fitted, type="l", col="red", xlim=c(1, 900), pch=19, ann=FALSE, xaxt="n",
yaxt="n")

#nls fitting
fit.nls<-nls(temperature~C+alpha*sin(W*time+phi),
   start=list(C=27.63415, alpha=27.886, W=0.0652, phi=14.9286))
summary(fit.nls)
plot(fit.nls$fitted, type="l", col="red", xlim=c(1, 900), pch=19, ann=FALSE, xaxt="n",
yaxt="n")

This is because the NA values are removed from the data to be fit (and your data has quite a few of them); hence, when you plot fit.lm$fitted the plot method is interpreting the index of that series as the 'x' values to plot it against.

Try this [note how I've changed variable names to prevent conflicts with the functions time and data (read this post)]:

Data <- read.table(file="900days.txt", header=TRUE, sep="")
Time <- Data$time 
temperature <- Data$temperature

xc<-cos(2*pi*Time/366)
xs<-sin(2*pi*Time/366)
fit.lm <- lm(temperature~xc+xs)

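# access the fitted series (for plotting); rows with NA temperature were dropped
fit <- fitted(fit.lm)

# find predictions for the original time series; the model terms are xc and xs,
# so newdata has to supply those columns
pred <- predict(fit.lm, newdata=data.frame(xc=xc, xs=xs))

plot(temperature ~ Time, data= Data, xlim=c(1, 900))
lines(fit, col="red")          # drawn against its index, hence misaligned
lines(Time, pred, col="blue")  # drawn against the actual times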

This gives me:

[plot of temperature against Time, with the fitted series in red and the predictions in blue]

Which is probably what you were hoping for.
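The NA behaviour described in this answer can be checked on made-up data. The sketch below does not use the asker's file; the simulated series, the number of gaps and the object names fit.omit and fit.excl are assumptions for illustration. It also shows na.action = na.exclude, a standard lm option that pads the fitted values back to full length so they line up with Time.

```r
# Made-up data, not the asker's file: a noisy yearly sine with some gaps.
set.seed(1)
Time <- 1:900
temperature <- 27 + 2 * sin(2 * pi * Time / 366) + rnorm(900, sd = 0.5)
temperature[sample(900, 200)] <- NA   # knock out 200 observations

xc <- cos(2 * pi * Time / 366)
xs <- sin(2 * pi * Time / 366)

# Default na.action (na.omit) silently drops the incomplete rows ...
fit.omit <- lm(temperature ~ xc + xs)
length(fitted(fit.omit))      # 700, shorter than length(Time)

# ... while na.exclude pads the fitted values back to full length with NA.
fit.excl <- lm(temperature ~ xc + xs, na.action = na.exclude)
length(fitted(fit.excl))      # 900, aligned with Time

plot(Time, temperature)
lines(Time, fitted(fit.excl), col = "red")   # line breaks where data are NA
```

With na.exclude the red line has gaps wherever the temperature is missing; use predict with new data if a continuous curve is wanted.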

How about supplying both an x and a y in your line plot, instead of only the y?

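# nls drops rows containing NA, so keep only the matching time points
ok <- !is.na(time) & !is.na(temperature)
plot(time[ok], predict(fit.nls), type="l", col="red", xlim=c(1, 900), pch=19, ann=FALSE, xaxt="n",
yaxt="n")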

Also, both lm and nls only give you fitted values at the observed points. To draw a smooth curve as a line plot you have to evaluate the model at the points in between as well. Since you are using nls and lm, the function predict may be useful.
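A minimal sketch of that idea, on simulated data rather than the asker's file. The data frame d, the evaluation grid and the start values are assumptions chosen for illustration. The transformations are written inside the lm formula so that predict can rebuild them from newdata.

```r
# Simulated, irregularly sampled data (an assumption, not the asker's file)
set.seed(2)
d <- data.frame(Time = sort(sample(900, 60)))
d$temperature <- 27 + 2 * sin(2 * pi * d$Time / 366 + 0.5) + rnorm(60, sd = 0.3)

# Linear fit: cos and sin terms live in the formula, so newdata only needs Time
fit.lm <- lm(temperature ~ cos(2 * pi * Time / 366) + sin(2 * pi * Time / 366),
             data = d)

# Nonlinear fit, started close to the values used in the simulation
fit.nls <- nls(temperature ~ C + alpha * sin(W * Time + phi), data = d,
               start = list(C = 27, alpha = 2, W = 2 * pi / 366, phi = 0.5))

# Evaluate both models on a dense, regular grid and draw them as curves
grid <- data.frame(Time = seq(1, 900, by = 1))
plot(temperature ~ Time, data = d, xlim = c(1, 900))
lines(grid$Time, predict(fit.lm, newdata = grid), col = "red")
lines(grid$Time, predict(fit.nls, newdata = grid), col = "blue", lty = 2)
```

On real data nls is sensitive to its start values, so the ones above would need to be replaced by rough estimates from the data.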

Not sure if this might help - I get a similar fit using sine only:

y = amplitude * sin(pi * (x - center) / width) + Offset

amplitude =  2.0009690806953033E+00
center = -2.5813588834888215E+01
width =  1.8077550471975817E+02
Offset =  2.6872265116104828E+01

Fitting target of lowest sum of squared absolute error = 3.6755174406241423E+01

Degrees of freedom (error): 90
Degrees of freedom (regression): 3
Chi-squared: 36.7551744062
R-squared: 0.816419142696
R-squared adjusted: 0.810299780786
Model F-statistic: 133.415731033
Model F-statistic p-value: 1.11022302463e-16
Model log-likelihood: -89.2464811027
AIC: 1.98396768304
BIC: 2.09219299292
Root Mean Squared Error (RMSE): 0.625309918107

amplitude = 2.0009690806953033E+00
       std err squared: 1.03828E-02
       t-stat: 1.96374E+01
       p-stat: 0.00000E+00
       95% confidence intervals: [1.79853E+00, 2.20340E+00]
center = -2.5813588834888215E+01
       std err squared: 2.98349E+01
       t-stat: -4.72592E+00
       p-stat: 8.41245E-06
       95% confidence intervals: [-3.66651E+01, -1.49621E+01]
width = 1.8077550471975817E+02
       std err squared: 3.54835E+00
       t-stat: 9.59680E+01
       p-stat: 0.00000E+00
       95% confidence intervals: [1.77033E+02, 1.84518E+02]
Offset = 2.6872265116104828E+01
       std err squared: 5.15458E-03
       t-stat: 3.74289E+02
       p-stat: 0.00000E+00
       95% confidence intervals: [2.67296E+01, 2.70149E+01]

Coefficient Covariance Matrix
[ 0.02542366 0.01786683 -0.05016085 -0.00652111]
[ 1.78668314e-02 7.30548346e+01 -2.18160818e+01 1.24965136e-01]
[ -5.01608451e-02 -2.18160818e+01 8.68860810e+00 -1.27401806e-02]
[-0.00652111 0.12496514 -0.01274018 0.0126217 ]

James Phillips zunzun@zunzun.com
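For readers who want to plot the sine-only model quoted in this answer, here is a small sketch in R using the reported parameter values. The function name sine_model and the conversion to the question's C + alpha*sin(W*time + phi) form are my own additions for illustration, not part of the original answer.

```r
# Parameter values copied from the fit report above
amplitude <- 2.0009690806953033
center    <- -25.813588834888215
width     <- 180.77550471975817
Offset    <- 26.872265116104828

# y = amplitude * sin(pi * (x - center) / width) + Offset
sine_model <- function(x) amplitude * sin(pi * (x - center) / width) + Offset

# sin(pi * x / width) repeats every 2 * width
period <- 2 * width
period                # about 361.55

# Same curve in the parametrisation used by the question's nls call
W   <- pi / width
phi <- -pi * center / width

curve(sine_model, from = 1, to = 900, xlab = "time", ylab = "temperature")
```

The implied period of roughly 361.55 days is close to the 366 used in the question's lm fit.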

Alternatively, you could have eliminated the NAs from your data after reading it in:

data <- subset(data, !is.na(temperature))

Then, when plotting, you could set the x-axis to the time points from the reduced data set:

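plot(temperature~time, data=data, xlim=c(1, 900))
# use the time column of the reduced data set, which matches the fitted values in length
lines(x=data$time, y=fit.lm$fitted, col="red")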

This curve won't be as smooth as the one produced by @andy-barbour but it will work in a pinch.
