Implement varImp for variable importance and plot
I'm trying to plot the variable importance scores for the model below.
The scores are computed fine, but they are not plotting properly. Do I need to add another parameter to the code?
The code and the resulting scores are below.
library(caret)
# Generalised linear model (Gaussian family, identity link)
LR_swim <- glm(racetime_mins ~ event_date + event_month + year + event_id +
                 gender + distance_new + New_Condition +
                 raceNo_Updated +
                 handicap_mins + points +
                 Wind_Speed_knots +
                 Air_Temp_Celsius + Water_Temp_Celsius + Wave_Height_m,
               data = SwimmingTrain,
               family = gaussian(link = "identity"))
# Compute the importance scores and try to plot them
varImp2 <- varImp(object = LR_swim)
plot(varImp2, main = "Variable Importance")
                      Overall
event_date          24.463358
event_month         22.358448
year                24.399390
event_id            26.878342
genderfemale        30.422470
gendermale          13.273062
distance_new       248.727351
New_Condition       22.574999
raceNo_Updated       9.812053
handicap_mins      134.914137
points              40.443116
Wind_Speed_knots    14.492203
Air_Temp_Celsius    16.562194
Water_Temp_Celsius   2.861662
Wave_Height_m        8.592716
#ClassOutput
class(varImp2)
[1] "data.frame"
#HeadOutput
> head(varImp2)
Overall
event_date 24.46336
event_month 22.35845
year 24.39939
event_id 26.87834
genderfemale 30.42247
gendermale 13.27306
Mine looks like:
Supposed to look like:
Based on your desired outcome, your goal is to plot a numeric column from a data frame, ordered on the y-axis by the values in that column. I'll use the mtcars dataset as an example.
library(caret)
LR_mtcars <- glm(mpg ~ ., data = mtcars, family = gaussian)
varImp2 <- varImp(LR_mtcars)
varImp2 is a data frame. Now add a column named "labels". We'll make this column a factor and then order its levels by the values in "Overall".
varImp2$labels <- factor(rownames(varImp2))
varImp2$labels <- reorder(varImp2$labels, varImp2$Overall)
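To see what reorder() does to the factor, here is a small sketch with made-up labels and scores (the names `labels`, `scores` and `ordered_labels` are illustrative only and not part of the answer's code). The level order follows the scores, and that order is what later sets each variable's position on the y-axis.

```r
# Toy data, not from the question: three labels with arbitrary scores
labels <- factor(c("a", "b", "c"))
scores <- c(3, 1, 2)

# reorder() sorts the factor levels by the score attached to each label
ordered_labels <- reorder(labels, scores)
levels(ordered_labels)
# [1] "b" "c" "a"

# The underlying integer codes give the plotting position of each label
as.integer(ordered_labels)
# [1] 3 1 2
```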
We can then plot the values. For the first pass we leave the x- and y-axis titles blank and suppress the y-axis labels, then add them back afterwards.
plot(x = varImp2$Overall, y = varImp2$labels, main = "Variable Importance",
yaxt = "n", ylab = "", xlab = "")
axis(2, at = 1:nrow(varImp2), labels = levels(varImp2$labels), las = 2)
title(xlab = "Importance")
This gives us a plot with the variables ordered by importance on the y-axis.
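Base R also ships dotchart(), which draws this kind of labelled dot plot without building the axis by hand. The following is a minimal sketch rather than a drop-in replacement: it assumes that absolute t-statistics are an acceptable stand-in for the varImp() scores of a linear model, so it runs without caret, and the names `fit`, `t_vals` and `imp` are introduced here for illustration.

```r
# Fit a linear model on the built-in mtcars data
fit <- lm(mpg ~ ., data = mtcars)

# Assumption: absolute t-statistics stand in for the varImp() scores
t_vals <- coef(summary(fit))[-1, "t value"]   # drop the intercept row
imp <- sort(abs(t_vals))                      # ascending order

# dotchart() draws one labelled row per value, from bottom to top,
# so the largest score ends up at the top of the chart
dotchart(imp, labels = names(imp),
         main = "Variable Importance", xlab = "Importance")
```

Sorting before plotting plays the same role as the reorder() step above: the order of the vector decides the order of the rows.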
In the comments I asked whether the row names of varImp2 are the values you want on the x-axis of your plot, but you did not say. In any case, assuming the row names are the y values you want, the code below gives you the desired plot; you can rearrange x and y yourself.
library(ggplot2)
# Zero-width bars draw the stems, points mark the scores
ggplot(data = varImp2, aes(x = rownames(varImp2), y = Overall)) +
  geom_bar(position = "dodge", stat = "identity", width = 0, color = "black") +
  coord_flip() +
  geom_point(color = "skyblue") +
  # After coord_flip() the y aesthetic (Overall) is drawn horizontally
  labs(x = NULL, y = "Importance Score", title = "Variable Importance") +
  theme(plot.title = element_text(hjust = 0.5),
        panel.background = element_rect(fill = "white", colour = "black"))
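If the variables should also be sorted by score, one option is to reorder the labels before plotting. The sketch below is one possible arrangement, not the answer's own code: it again assumes absolute t-statistics on mtcars as stand-in scores, and the names `imp`, `variable` and `p` are hypothetical. It uses geom_segment() for the stems instead of zero-width bars.

```r
library(ggplot2)

# Stand-in importance scores (assumption: absolute t-statistics of a linear model)
fit <- lm(mpg ~ ., data = mtcars)
imp <- data.frame(Overall = abs(coef(summary(fit))[-1, "t value"]))

# Order the labels by score so the largest value is drawn at the top
imp$variable <- reorder(rownames(imp), imp$Overall)

p <- ggplot(imp, aes(x = Overall, y = variable)) +
  geom_segment(aes(x = 0, xend = Overall, yend = variable), colour = "black") +
  geom_point(colour = "skyblue", size = 3) +
  labs(x = "Importance Score", y = NULL, title = "Variable Importance") +
  theme_bw() +
  theme(plot.title = element_text(hjust = 0.5))
print(p)
```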