So I would like to create a linear regression model, with rocket price (the Rocket column) against the date of launch (the Datum column). I believe I can do this with lm(Y ~ X). However, how would I convert the prices from chr to num, and likewise the dates?
Thank you!
Data: https://www.kaggle.com/agirlcoding/all-space-missions-from-1957
Effectively you are asking 3 different but very basic questions, which would be better learned by reading an introductory text than by posting a question on Stack Overflow.
How do I convert the Rocket column? Depending on what version of R you are using, the column spaceData$Rocket will be either a character vector or a factor vector. To cover both eventualities, you can do:
spaceData$Rocket <- as.numeric(as.character(spaceData$Rocket))
This will give you a warning that some NA values were produced. That's OK - there are some blank cells in the column, so you want these to be NA.
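To see why the as.character() step matters, here is a small sketch on a made-up vector rather than the real column. The values, the names rocket_chr and rocket_fct, and the comma-stripping step at the end are all assumptions for illustration; check whether your own column actually contains thousands separators before relying on that last line.

```r
# Toy stand-in for spaceData$Rocket: prices stored as text, with one blank cell.
# These values are made up for illustration.
rocket_chr <- c("50.0", "29.75", "", "1,160.0")
rocket_fct <- factor(rocket_chr)

# as.numeric() on a factor returns the internal level codes, not the prices
codes <- as.numeric(rocket_fct)

# Going through as.character() first recovers the printed values; blanks (and
# anything else that is not a plain number) become NA, normally with a warning
prices <- suppressWarnings(as.numeric(as.character(rocket_fct)))

# Assumption: if a value contains a thousands separator, strip it before converting
prices_clean <- suppressWarnings(
  as.numeric(gsub(",", "", as.character(rocket_fct)))
)

print(codes)
print(prices)
print(prices_clean)
```

The same two-step conversion is harmless when the column is already a character vector, which is why it covers both cases.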
How do I convert spaceData$Datum from text to actual date times? In this case, you can use strptime, and specify how the date string is formatted. We will also wrap this in as.POSIXct to ensure that the data is stored in a way that is easier to plot:
spaceData$Datum <- as.POSIXct(strptime(spaceData$Datum, "%a %b %d, %Y %H:%M"))
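As a rough illustration of what that format string does, the sketch below parses a few made-up strings laid out the way the format expects. The example strings, the variable names, and tz = "UTC" are assumptions, and %a and %b rely on English day and month names in your locale.

```r
# Example strings in the layout the format string expects (made up for illustration)
datum_chr <- c("Fri Aug 07, 2020 05:12 UTC", "Thu Jul 30, 2020 11:50 UTC")

# %a = abbreviated weekday, %b = abbreviated month, %d = day of month,
# %Y = four-digit year, %H:%M = hours and minutes.
# tz = "UTC" is an assumption; text after the minutes is ignored by strptime.
datum <- as.POSIXct(strptime(datum_chr, "%a %b %d, %Y %H:%M", tz = "UTC"))

# A string with no time part does not match this format and becomes NA
no_time <- as.POSIXct(strptime("Sat Mar 31, 2018", "%a %b %d, %Y %H:%M", tz = "UTC"))

print(format(datum, "%Y-%m-%d %H:%M"))
print(is.na(no_time))
```

If your column has rows without a time, it may be worth counting the NA values after conversion with sum(is.na(spaceData$Datum)) before fitting anything.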
Before you attempt a linear regression, it is a good idea to make sure it is sensible to do a linear regression. For a linear regression to make sense, you should know that there is an approximately linear relationship between the two variables, and that the residuals are approximately normally distributed. An easy way to examine these assumptions is to plot the two variables:
plot(spaceData$Datum, spaceData$Rocket)
You don't need to be a statistician to see that any straight line through these points is going to be pretty hopeless as a description of the relationship. If we try it, we can see that:
abline(lm(Rocket ~ Datum, data = spaceData), col = "red")
So, by running a linear regression on this data, we can predict that the price of rockets will fall to zero on the 13th May 2036. Clearly this is nonsense.
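A date like that comes from solving the fitted line for the point where the predicted price is zero. Here is a minimal sketch of the arithmetic on made-up data that lies exactly on a falling line; the toy data frame and the target date are assumptions chosen so the answer is known in advance, not values from the Kaggle data.

```r
# Made-up data lying exactly on a falling straight line (not the Kaggle data)
zero_target <- as.POSIXct("2030-01-01", tz = "UTC")
toy <- data.frame(
  Datum = as.POSIXct(
    c("1990-01-01", "2000-01-01", "2010-01-01", "2020-01-01"),
    tz = "UTC"
  )
)
toy$Rocket <- (as.numeric(zero_target) - as.numeric(toy$Datum)) / 1e7

# lm() treats the POSIXct column as seconds since 1970-01-01
fit <- lm(Rocket ~ Datum, data = toy)
b <- unname(coef(fit))  # b[1] = intercept, b[2] = slope per second

# The fitted line b[1] + b[2] * t equals zero at t = -b[1] / b[2]
zero_date <- as.POSIXct(-b[1] / b[2], origin = "1970-01-01", tz = "UTC")
print(zero_date)
```

Applying the same two lines to the model fitted on spaceData gives the date where the red line crosses zero, which is an extrapolation far outside anything the data supports.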