
Regression model to predict student's grade in R

Please I need your help!

I have data for 2017 with the following variables:

Age : Numeric

Gender : Gender Value M=Male, F=Female, X=Indeterminate/Intersex/Unspecified

Postal Postcode : Numeric Code

Residential postcode : 1 = Major Cities, 2 = Inner Regional, 3 = Outer Regional, 4 = Remote and 5 = Very Remote

Socio-Economic : 0-99, where 0 is low socio-economic status and 99 is high

School Code : Numeric Code

Educational attainment of first parent : Numeric

Educational attainment of second parent : Numeric

Grade : Numeric between 0 and 100

I would like to train a model on the 2017 data to predict students' grades in 2018. For example, if a student in 2017 got a grade of 80 and a student in 2018 has the same or very similar values for the other variables, the predicted grade should be close to 80.

////////////////////////////////////////////////////////////////////////////////

Thank you, vitalious! I have used your script and I got the results! Here's the script I used and the data:

data<-read.csv("Olddata.csv")
newdata<-read.csv("Newdata.csv")

model <- lm(Age~., data=data)
nextYear <- data
nextYear$Age <- nextYear$Age + 1
results <- predict(model, newdata=nextYear, type='response')

Assume that we have only the following variables:

Age  Gender  Postal.Postcode  Grade
20   F       3191             89.6
20   M       3930             99
20   F       3126             99.2
21   M       3910             94.65

And the newdata could be anything with the same number of variables.

The output was something like:

       1        2        3        4
20.09547 20.48317 19.82224 20.55038

But actually, the output I want is the actual grade for each student out of 100!

What you're looking for is a linear regression model. In R, it's fitted with lm() (see ?lm for details). You'd want to fit a model with the grade as the response, and then run the model on the data with Age incremented by one, since presumably that is the only attribute that will change next year.

Assuming your data is in a dataframe called data, it would look something like this:

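# Grade is the response; all other columns are predictors
model <- lm(Grade ~ ., data = data)

# Copy the data and advance every student's age by one year
nextYear <- data
nextYear$Age <- nextYear$Age + 1

# Predicted grades, one per row of nextYear
results <- predict(model, newdata = nextYear, type = 'response')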
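If the 2018 students are in a separate data frame rather than being the 2017 students a year older, the same idea can be sketched as follows. This is only a minimal illustration: the data frames train and new_students, the SES column, and all values in them are made up, and the clamping step is an optional assumption about how out-of-range predictions should be handled.

```r
# Hypothetical 2017 training data (made-up values, for illustration only)
train <- data.frame(
  Age    = c(20, 20, 20, 21, 19, 22, 21, 19),
  Gender = factor(c("F", "M", "F", "M", "F", "M", "F", "M")),
  SES    = c(75, 90, 88, 80, 40, 55, 62, 35),
  Grade  = c(89.6, 99, 99.2, 94.65, 61, 70.5, 78, 58)
)

# Hypothetical 2018 students whose grades are unknown
new_students <- data.frame(
  Age    = c(20, 21),
  Gender = factor(c("F", "M"), levels = levels(train$Gender)),
  SES    = c(85, 50)
)

# The left-hand side of the formula decides what is predicted:
# Grade is the response, every other column is a predictor
model <- lm(Grade ~ ., data = train)

# One predicted grade per row of new_students
predicted <- predict(model, newdata = new_students)

# A linear model can stray outside 0-100, so clamp to the valid range
predicted <- pmin(pmax(predicted, 0), 100)
print(predicted)
```

The new data frame needs the same predictor column names as the training data; it does not need a Grade column.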

Make sure that all non-numeric columns are factors.
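One possible way to do that conversion is sketched below. The data frame and its values are hypothetical, and treating School.Code as categorical is an assumption: it is stored as a number, but it identifies a school rather than measuring a quantity.

```r
# Hypothetical data frame with a character column and a numeric code column
data <- data.frame(
  Age         = c(20, 20, 21),
  Gender      = c("F", "M", "F"),
  School.Code = c(101, 102, 101),
  Grade       = c(89.6, 99, 94.65),
  stringsAsFactors = FALSE
)

# Convert every character column to a factor
is_char <- vapply(data, is.character, logical(1))
data[is_char] <- lapply(data[is_char], factor)

# Assumption: School.Code is an identifier, so treat it as categorical
data$School.Code <- factor(data$School.Code)

str(data)
```

Note that predict() stops with an error if the new data contains a factor level that was not present when the model was fitted, so codes with many distinct values (such as postcodes) may need grouping before they are used as predictors.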


 