How do I create a SHAP plot in R for a GBM model?
I want to create a SHAP plot of feature importance for a GBM model:
library(caret)

# 10-fold CV repeated 5 times; classProbs = TRUE is needed for the ROC metric
ctrlCV = trainControl(method = 'repeatedcv', repeats = 5, number = 10,
                      classProbs = TRUE, savePredictions = TRUE,
                      summaryFunction = twoClassSummary)
gbmFit = train(CR ~ ., data = training_set,
               method = "gbm",
               metric = "ROC",
               trControl = ctrlCV,
               tuneGrid = gbmGRID,
               verbose = FALSE)
However, all the examples I have found are for xgboost models, and packages such as SHAPforxgboost and shapr do not work for me. For example:
shap_values <- shap.values(xgb_model = gbmFit, X_train = training_set)
produces this error:
error in `colnames<-`(`*tmp*`, value = c(colnames(x_train), "bias")) : attempt to set 'colnames' on an object with less than two dimensions
I need a plot like this:
How can I do this?
structure(list(CR = c("nonComplete", "nonComplete", "nonComplete",
"nonComplete", "nonComplete", "nonComplete", "nonComplete", "nonComplete",
"nonComplete", "nonComplete"), gender = c(1, 0, 0, 0, 1, 0, 0,
1, 0, 1), CD4.T.cells = c(-0.0741098696855045, -0.094401270881699,
0.0410284948786532, -0.163302950330185, -0.0942478217207681,
-0.167314411991775, -0.118272811489486, -0.0366277340916379,
-0.0809646843667242, -0.140727850456348), CD8.T.cells = c(-0.178835447722468,
-0.253897294559596, -0.0372301980787381, -0.230579110769457,
-0.224125346052727, -0.196933050675633, -0.344608041139497, -0.0550538743643369,
-0.276178546845023, -0.235047665605314), T.helpers = c(-0.0384421660291032,
-0.0275306107582565, 0.186447606591857, -0.124972070102036, -0.15348122673842,
-0.106812144494277, -0.104757782473888, 0.0686746776877563, -0.0729755869081981,
-0.0783448555726869), NK.cells = c(-0.0924083910597563, -0.172356328661097,
-0.0172673823614314, 0.0280649471541352, -0.128925304635747,
-0.0875076743713435, -0.188649323737844, -0.0518877213975413,
-0.184546079512101, -0.100562282085102), Monocytes = c(-0.0680848706469295,
-0.173427291586957, -0.0106773958944477, -0.0015805672257001,
-0.0751114943036091, -0.0737177243152751, -0.211297995211542,
-0.0674023045286274, -0.149380203815874, -0.0352058106388986),
Neutrophils = c(-0.0391833488213571, -0.0275279418713283,
0.0156454755097513, 0.0285160860867748, -0.0633367938488132,
0.0252778805872529, -0.0827920017974784, 0.0432343965225797,
-0.0693846217599099, -0.0249227307025501), gd.T.Cells = c(-0.162246594987039,
-0.297759223265742, -0.0814825699645205, -0.0688779846190755,
-0.222281334925374, -0.264420103679214, -0.251924422671008,
-0.162709306032616, -0.292342418053931, -0.246818199922858
), Non.plasma.B.cells = c(-0.0384755654971015, -0.114370815587458,
0.161268251261644, -0.0571463865006043, -0.112851511342984,
-0.0822058328898433, -0.118367014322845, 0.114155959200915,
-0.0923514068231641, -0.115614038543851)), row.names = c("Pt1",
"Pt10", "Pt101", "Pt103", "Pt106", "Pt11", "Pt17", "Pt18", "Pt26",
"Pt27"), class = "data.frame")
I ran into this problem before, and for me it only worked with xgboost models. This should work for you, using the shapviz package:
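library(shapviz)
# `model` is the fitted model; `data` holds the outcome in column 1 and the features after it
shp = shapviz(model, X_pred = data.matrix(data[,-1]), X = data)
sv_waterfall(shp, row_id = 1)              # contributions for a single observation
sv_importance(shp, kind = 'beeswarm')      # SHAP summary (beeswarm) plot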
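If `shapviz()` does not accept the caret `gbm` fit directly (as far as I can tell it has no method for caret `train` objects), one possible workaround is to compute model-agnostic SHAP values with the kernelshap package and pass the result to shapviz. The following is only a sketch under assumptions: it uses a simulated stand-in for `training_set` (the column values, the `Complete` class level and the tuning grid are made up for illustration), and it explains the predicted probability of the first class level.

```r
# Sketch: Kernel SHAP for a caret GBM, visualised with shapviz.
# Assumes the packages caret, gbm, kernelshap and shapviz are installed.
library(caret)
library(kernelshap)
library(shapviz)

set.seed(1)

# Hypothetical toy data with the same layout as the question (outcome CR first)
n <- 200
gender      <- rbinom(n, 1, 0.5)
CD4.T.cells <- rnorm(n)
CD8.T.cells <- rnorm(n)
NK.cells    <- rnorm(n)
score       <- 1.5 * CD4.T.cells - CD8.T.cells + rnorm(n)
training_set <- data.frame(
  CR = factor(ifelse(score > 0, "Complete", "nonComplete")),
  gender, CD4.T.cells, CD8.T.cells, NK.cells
)

# Small CV setup and a single-row grid to keep the example fast (illustrative values)
ctrl <- trainControl(method = "cv", number = 3, classProbs = TRUE,
                     summaryFunction = twoClassSummary)
grid <- expand.grid(n.trees = 50, interaction.depth = 2,
                    shrinkage = 0.1, n.minobsinnode = 10)
gbmFit <- train(CR ~ ., data = training_set, method = "gbm", metric = "ROC",
                trControl = ctrl, tuneGrid = grid, verbose = FALSE)

# Features only, plus a small background sample used to integrate out features
X    <- training_set[, -1]
bg_X <- X[sample(nrow(X), 50), ]

# Prediction function: probability of the first class level
pred_fun <- function(m, newdata) predict(m, newdata = newdata, type = "prob")[, 1]

ks <- kernelshap(gbmFit, X = X, bg_X = bg_X, pred_fun = pred_fun)
sv <- shapviz(ks)

sv_importance(sv, kind = "beeswarm")  # SHAP summary (beeswarm) plot
sv_waterfall(sv, row_id = 1)          # contributions for the first row
```

With the real data, replace the simulated `training_set` and `grid` with your own objects. Kernel SHAP is slower than the tree-specific algorithm used for xgboost, so with many features it may help to explain only a subset of rows.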