
Using Python Packages in R (with "reticulate")

I am trying to follow this tutorial: https://hfshr.netlify.app/posts/2020-06-07-variable-inportance-with-fastshap/

This tutorial is about using "SHAP", a method that helps the user interpret the results of complicated "black box" machine learning algorithms.

Following the tutorial, I was able to get everything to work except the "force plots" at the end. I have provided the code I am using below. Could someone please help me figure out why these force plots are not working?

library(modeldata)
library(tidymodels)
library(tidyverse)
library(doParallel)
library(probably)
library(gt)

data("credit_data")

credit_data <- credit_data %>%
  drop_na()

set.seed(12)

# initial split
split <- initial_split(credit_data, prop = 0.75, strata = "Status")

# train/test sets
train <- training(split)
test <- testing(split)

rec <- recipe(Status ~ ., data = train) %>%
  step_bagimpute(Home, Marital, Job, Income, Assets, Debt) %>%
  step_dummy(Home, Marital, Records, Job, one_hot = TRUE)

# Just some sensible values, not optimised by any means!
mod <- boost_tree(trees = 500,
                  mtry = 6,
                  min_n = 10,
                  tree_depth = 5) %>%
  set_engine("xgboost") %>%
  set_mode("classification")

xgboost_wflow <- workflow() %>%
  add_recipe(rec) %>%
  add_model(mod) %>%
  fit(train)

xg_res <- last_fit(xgboost_wflow,
                   split,
                   metrics = metric_set(roc_auc, pr_auc, accuracy))

preds <- xg_res %>%
  collect_predictions()

xg_res %>%
  collect_metrics()

library(vip)

# Get our model object
xg_mod <- pull_workflow_fit(xgboost_wflow)

vip(xg_mod$fit)

library(fastshap)

# Apply the preprocessing steps with prep and juice to the training data
X <- prep(rec, train) %>%
  juice() %>%
  select(-Status) %>%
  as.matrix()

# Compute shapley values
shap <- explain(xg_mod$fit, X = X, exact = TRUE)

# Create a dataframe of our training data
feat <- prep(rec, train) %>%
  juice()

autoplot(shap,
         type = "dependence",
         feature = "Amount",
         X = feat,
         smooth = TRUE,
         color_by = "Status")

predict(xgboost_wflow, train, type = "prob") %>%
  rownames_to_column("rowid") %>%
  filter(.pred_bad == min(.pred_bad) | .pred_bad == max(.pred_bad)) %>%
  gt() %>%
  fmt_number(columns = 2:3,
             decimals = 3)

library(patchwork)
p1 <- autoplot(shap, type = "contribution", row_num = 1541) +
  ggtitle("Likely bad")

p2 <- autoplot(shap, type = "contribution", row_num = 1806) +
  ggtitle("Likely good")

p1+p2

# here is the error (prior to running this code, I ran "pip install shap" in conda)

force_plot(object = shap[1541,],
           feature_values = X[1541,],
           display = "html",
           link = "logit")

Error in py_call_impl(callable, dots$args, dots$keywords) :
  TypeError: save_html() got an unexpected keyword argument 'plot_html'

Thank you

force_plot() is rather experimental, and just happened to work. If you receive an error, make sure that you have the corresponding shap package (and its dependencies) installed. In any case, you should report this issue on the fastshap GitHub repo: https://github.com/bgreenwell/fastshap/issues

--BG
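
The TypeError above says the installed shap's save_html() was called with a keyword, plot_html, that it does not accept. One way to check this before filing an issue is to inspect the function's signature from the same Python environment that reticulate uses. The sketch below is only a diagnostic aid, not part of fastshap or shap: accepts_keyword() and installed_version() are hypothetical helper names introduced here, and it assumes that shap, when installed, exposes save_html at the top level of the package.

```python
import inspect
from importlib import metadata


def accepts_keyword(func, name):
    """Return True if `func` can be called with the keyword argument `name`."""
    params = inspect.signature(func).parameters
    keyword_kinds = (
        inspect.Parameter.POSITIONAL_OR_KEYWORD,
        inspect.Parameter.KEYWORD_ONLY,
    )
    if name in params and params[name].kind in keyword_kinds:
        return True
    # A **kwargs catch-all accepts any keyword name.
    return any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values())


def installed_version(package):
    """Return the installed version string of `package`, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_version("shap")
    print("shap version:", version)
    if version is not None:
        import shap

        # Assumption: save_html is available as shap.save_html.
        save_html = getattr(shap, "save_html", None)
        if save_html is None:
            print("shap.save_html not found in this version")
        else:
            print("save_html signature:", inspect.signature(save_html))
            print("accepts plot_html:", accepts_keyword(save_html, "plot_html"))
```

If the last line prints False, the installed shap and the version that force_plot() was written against disagree about that argument name. The printed version and signature are useful details to include in a report on the fastshap issue tracker.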
