
Prediction error in R in case of new levels for variable

I'm using the gbm package for prediction in R. Training works well, with a reasonable error rate. However, when I ran the prediction on a data set that contains a factor variable with new levels, I got the following error:

    gbm1 <- gbm(SalePrice ~ ., data = bb, distribution = "gaussian",
                n.trees = 7000, cv.folds = 3, shrinkage = 0.001,
                interaction.depth = 4)

    f.predict <- exp(predict.gbm(gbm1, data.frame(bbv), n.trees = 7000))
    Error in predict.gbm(gbm1, data.frame(bbv), n.trees = 7000) : 
      New levels for variable <and the name of the levels are listed>

I searched for the error text but found only the gbm source code itself ;(

Any suggestion is appreciated!

I'm not familiar with the gbm package, but the error suggests that gbm cannot predict from a model when the prediction data contains a factor level it did not see during training. The rationale is that a model can only say something about the kind of data it was trained on. As an analogy with a simple linear model: you cannot expect the model a~b (a depends on b) to predict from data that involves an additional variable c, i.e. a~b+c. The model has learned a behavior only for b, not for b+c.
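To see which levels trigger the error, and to work around it, one option is to compare the factor levels of the two data frames before predicting. The following is only a sketch in base R, not something taken from the gbm package: find_new_levels and align_levels are hypothetical helper names, and the toy data frames train and valid stand in for bb and bbv from the question. Re-encoding unseen levels as NA is an assumption about what is acceptable for your data; it throws away the information in those rows for that variable.

```r
# Hypothetical helper: for each factor column of the training data, list the
# values that occur in newdata but are not among the training levels.
find_new_levels <- function(train, newdata) {
  factor_cols <- names(train)[vapply(train, is.factor, logical(1))]
  factor_cols <- intersect(factor_cols, names(newdata))
  out <- lapply(factor_cols, function(col) {
    values <- as.character(newdata[[col]])
    setdiff(unique(values[!is.na(values)]), levels(train[[col]]))
  })
  names(out) <- factor_cols
  out[lengths(out) > 0]  # keep only columns that actually have new levels
}

# Hypothetical helper: re-encode every factor column of newdata with the
# training levels. Values outside those levels become NA (an assumption:
# treating unseen levels as missing is acceptable for the analysis).
align_levels <- function(train, newdata) {
  for (col in intersect(names(train), names(newdata))) {
    if (is.factor(train[[col]])) {
      newdata[[col]] <- factor(as.character(newdata[[col]]),
                               levels = levels(train[[col]]))
    }
  }
  newdata
}

# Toy data standing in for bb (training) and bbv (prediction).
train <- data.frame(zone = factor(c("A", "B", "A")), area = c(10, 20, 30))
valid <- data.frame(zone = factor(c("A", "C")), area = c(15, 25))

new_levels <- find_new_levels(train, valid)  # zone has the unseen level "C"
aligned <- align_levels(train, valid)        # "C" is now NA, levels are A and B
```

With the data from the question, the same idea would be align_levels(bb, data.frame(bbv)) before the call to predict.gbm. Whether the resulting NA values are handled the way you want depends on the model, so it is worth checking the predictions for those rows. The alternative is to retrain on data that includes every level you expect to see at prediction time.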
