
How to perform least squares regression in R given training and testing data with class labels?

I have a 63*62 training set, which includes the class labels. The test data is 25*62 and also includes the class labels. Given this, how would I perform least squares regression? I am using the code:

res = lm(height~age)

What do height and age correspond to? With 61 features + 1 class (62 columns in the training data), how would I specify the parameters?

Also how do I apply the model on the testing data?

If you have 62 columns you may want to use the more general formula

res = lm(height ~ . , data = mydata)

Notice how the period '.' stands for all the remaining variables in the data frame. But the other answer is right in the sense that there are nearly as many variables as observations, so the fit (if one can be computed at all) is essentially useless.
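To apply the fitted model to the test data, predict() with the newdata argument is the usual route. The sketch below uses made-up data: the column name label, the feature names f1..f5, and the 0.5 cut-off for turning predictions back into classes are all assumptions for illustration, and only 5 features are simulated instead of 61 so the fit is well determined.

```r
set.seed(1)

# Hypothetical stand-in for the real data: p numeric features plus a
# numeric 0/1 class column called "label" (name is an assumption).
make_data <- function(n, p = 5) {
  x <- as.data.frame(matrix(rnorm(n * p), nrow = n))
  names(x) <- paste0("f", seq_len(p))
  x$label <- as.numeric(x$f1 + x$f2 + rnorm(n, sd = 0.5) > 0)
  x
}
train <- make_data(63)  # 63 training rows, as in the question
test  <- make_data(25)  # 25 test rows, same columns

# Fit on the training set: label against every other column.
fit <- lm(label ~ ., data = train)

# Apply the model to the test set; columns are matched by name.
pred <- predict(fit, newdata = test)

# One simple (assumed) way to get classes back: threshold at 0.5.
pred_class <- as.numeric(pred > 0.5)
accuracy <- mean(pred_class == test$label)
```

The test data frame must use the same column names as the training data, otherwise predict() cannot find the predictors.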

height and age are simply the names of columns in your data frame. height is the predicted (response) variable. You can have as many predictors on the right-hand side as you wish: res = lm(height~age+weight+gender)
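As a small illustration of how the formula maps to column names, here is a sketch with a made-up data frame. The name people and all the values in it are invented for the example.

```r
set.seed(2)

# Hypothetical data frame; the column names are what the formula refers to.
people <- data.frame(
  age    = runif(40, 5, 18),
  weight = runif(40, 20, 70),
  gender = factor(sample(c("f", "m"), 40, replace = TRUE))
)
people$height <- 80 + 5 * people$age + 0.3 * people$weight + rnorm(40, sd = 3)

# Response on the left of ~, predictors on the right, looked up in `data`.
res <- lm(height ~ age + weight + gender, data = people)

# One coefficient per numeric predictor, plus the intercept and a dummy
# variable for the non-reference level of the factor.
coef(res)
```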

However, I must say that the question seems a bit strange to me: if you fit a regression with roughly as many variables (62 columns) as points in the training set (63 rows), you will get an exact or near-exact fit to the training data, which says nothing about how the model will do on new data. The training set should always be (significantly) larger than the number of variables used.
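The exact-fit point can be checked on a tiny simulated example. In this sketch the response is pure noise, yet with 10 rows and 9 predictors plus an intercept the model reproduces it perfectly. The sizes and variable names are arbitrary choices for illustration.

```r
set.seed(3)

n <- 10
# 9 random predictors; with the intercept that makes 10 coefficients
# for 10 observations.
d <- as.data.frame(matrix(rnorm(n * (n - 1)), nrow = n))
d$y <- rnorm(n)  # response unrelated to the predictors

fit <- lm(y ~ ., data = d)

# No residual degrees of freedom are left, and the residuals are zero
# up to floating-point error even though y is just noise.
df.residual(fit)
max(abs(residuals(fit)))
```

A perfect training fit like this is a symptom of having too few observations, not evidence of a good model.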


 