
Implement varImp for variable importance and plot

I'm trying to plot the variable importance scores for the model below.

The scores print fine, but they are not plotting properly. Do I need to add another parameter to the code?

The code and the score output are below.

library(caret)

# Generalised linear model (Gaussian family, identity link)
LR_swim <- glm(racetime_mins ~ event_date + event_month + year + event_id +
                 gender + distance_new + New_Condition +
                 raceNo_Updated +
                 handicap_mins + points +
                 Wind_Speed_knots +
                 Air_Temp_Celsius + Water_Temp_Celsius + Wave_Height_m,
               data = SwimmingTrain,
               family = gaussian(link = "identity"))

varImp2<-varImp(object=LR_swim)
plot(varImp2,main="Variable Importance")

                      Overall
event_date          24.463358
event_month         22.358448
year                24.399390
event_id            26.878342
genderfemale        30.422470
gendermale          13.273062
distance_new       248.727351
New_Condition       22.574999
raceNo_Updated       9.812053
handicap_mins      134.914137
points              40.443116
Wind_Speed_knots    14.492203
Air_Temp_Celsius    16.562194
Water_Temp_Celsius   2.861662
Wave_Height_m        8.592716

#ClassOutput
class(varImp2)
[1] "data.frame"
#HeadOutput
> head(varImp2)
          Overall
event_date   24.46336
event_month  22.35845
year         24.39939
event_id     26.87834
genderfemale 30.42247
gendermale   13.27306

Mine looks like:

Supposed to look like:

[image]

Based on your desired outcome, your goal is to plot a numeric column from a data frame, ordered on the y-axis by the values in the column. I'll use the mtcars dataset as an example.

library(caret)
LR_mtcars <- glm(mpg ~ ., data = mtcars, family = gaussian)
varImp2 <- varImp(LR_mtcars)
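
For context, caret documents that for linear models varImp uses the absolute value of each coefficient's t-statistic. The following is a minimal sketch that rebuilds the same kind of table by hand in base R, assuming that behaviour applies to this fit; `fit` and `manual_imp` are illustrative names, not part of caret.

```r
# Fit the same Gaussian GLM used in this answer
fit <- glm(mpg ~ ., data = mtcars, family = gaussian)

# Coefficient table: estimate, standard error, t value, p-value
coefs <- summary(fit)$coefficients

# Absolute t-statistic per predictor, dropping the intercept
# (manual_imp is an illustrative name, not a caret object)
manual_imp <- data.frame(
  Overall = abs(coefs[rownames(coefs) != "(Intercept)", "t value"])
)

# Show the largest scores first
head(manual_imp[order(-manual_imp$Overall), , drop = FALSE])
```

The result is a one-column data frame with the predictor names as row names, which is the shape the rest of this answer works with.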

varImp2 is a data frame. Now add a column named "labels". We'll make this column a factor, and then order it based on the values in "Overall".

varImp2$labels <- factor(rownames(varImp2))
varImp2$labels <- reorder(varImp2$labels, varImp2$Overall)
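
To see what reorder() does to the factor levels, here is a small sketch on a toy data frame. The scores are made up for illustration and `imp` is a hypothetical name.

```r
# Toy importance scores (made-up numbers, for illustration only)
imp <- data.frame(Overall = c(2.5, 9.1, 0.7),
                  row.names = c("wt", "hp", "qsec"))

# A plain factor sorts its levels alphabetically
imp$labels <- factor(rownames(imp))
levels(imp$labels)      # "hp" "qsec" "wt"

# reorder() re-sorts the levels by the matching value in Overall, ascending
imp$labels <- reorder(imp$labels, imp$Overall)
levels(imp$labels)      # "qsec" "wt" "hp"

# The integer codes now give each row's rank by score
as.integer(imp$labels)  # 2 3 1
```

Because plotting a factor uses its integer codes, the level order is what determines the vertical position of each variable.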

We can then plot the values. For the first iteration of the plot we leave the titles for the x- and y-axes blank, as well as the labels for the y-axis. We then add those back in afterwards.

plot(x = varImp2$Overall, y = varImp2$labels, main = "Variable Importance", 
  yaxt = "n", ylab = "", xlab = "")
axis(2, at = 1:nrow(varImp2), labels = levels(varImp2$labels), las = 2)
title(xlab = "Importance")
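
As an alternative to building the axis by hand, base R's dotchart() draws the labels itself. This is a different technique from the one above, sketched on a toy data frame with made-up scores; `imp` and `imp_sorted` are hypothetical names.

```r
# Toy importance scores (made-up numbers, for illustration only)
imp <- data.frame(Overall = c(2.5, 9.1, 0.7),
                  row.names = c("wt", "hp", "qsec"))

# dotchart() draws the first element at the bottom, so sort ascending
# to put the largest score at the top
imp_sorted <- imp[order(imp$Overall), , drop = FALSE]

dotchart(imp_sorted$Overall,
         labels = rownames(imp_sorted),
         main = "Variable Importance",
         xlab = "Importance")
```

With a data frame shaped like varImp2, the same two steps (sort, then dotchart) should apply unchanged.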

This gives us:

[image]

Well, in the comments I asked whether the row names of varImp2 are the desired x values in your plot, but you did not say. In any case, assuming the row names are the y values you want to assign, this code gives you the desired plot; you can arrange x and y yourself.

library(ggplot2)

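# Row names of varImp2 supply the variable labels; Overall supplies the scores
ggplot(data = varImp2, aes(x = rownames(varImp2), y = Overall)) +
  # Zero-width bars draw a thin line from 0 to each score
  geom_bar(position = "dodge", stat = "identity", width = 0, color = "black") +
  coord_flip() +  # put the variables on the vertical axis
  geom_point(color = "skyblue") +
  # After coord_flip(), the y aesthetic (Overall) is the horizontal axis
  labs(x = NULL, y = "Importance Score") +
  ggtitle("Variable Importance") +
  theme(plot.title = element_text(hjust = 0.5),
        panel.background = element_rect(fill = "white", colour = "black"))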
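
If the variables should also be ranked by score, one option is to wrap the label in reorder() inside aes(). This is a minimal sketch on a toy data frame with made-up scores, using geom_col() with a narrow width instead of zero-width bars; `imp` and `p` are hypothetical names.

```r
library(ggplot2)

# Toy importance scores (made-up numbers, for illustration only)
imp <- data.frame(variable = c("wt", "hp", "qsec"),
                  Overall  = c(2.5, 9.1, 0.7))

# reorder() sorts the variable names by score, so the rows appear ranked
p <- ggplot(imp, aes(x = reorder(variable, Overall), y = Overall)) +
  geom_col(width = 0.05, fill = "black") +       # thin stem for each score
  geom_point(colour = "skyblue", size = 3) +     # marker at the score
  coord_flip() +                                 # variables on the vertical axis
  labs(x = NULL, y = "Importance Score", title = "Variable Importance")

print(p)
```

For a data frame like varImp2, whose labels live in the row names, `reorder(rownames(varImp2), Overall)` would play the role of `reorder(variable, Overall)`.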

