Apply logistic regression in a function in R

I want to run logistic regression for multiple parameters and store different metrics, e.g. AUC. I wrote the function below, but I get an error when I call it: Error in eval(predvars, data, env) : object 'X0' not found, even though the variable exists in both my training and testing datasets. Any idea?

new.function <- function(a) {
  model = glm(extry~a,family=binomial("logit"),data = train_df)
  pred.prob <- predict(model,test_df, type='response')
  predictFull <- prediction(pred.prob, test_df$extry)
  auc_ROCR <- performance(predictFull, measure = "auc")

  my_list <- list("AUC" =  auc_ROCR)
  return(my_list) 
}

# Call new.function, supplying the column X0 as the argument.
les <- new.function(X0)

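The main reason your function didn't work is that you are trying to pass an object into a formula. Inside extry ~ a, the name a is not replaced by the column you supplied; glm cannot find a in train_df, so it evaluates the argument itself, which looks for an object called X0 outside the data frame and fails. You can fix it by building the formula from a string with paste and as.formula, but that is ultimately quite limiting.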
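As a minimal sketch of that string-based fix, the function below takes the predictor name as a character string and builds the formula with reformulate() from the stats package, which does the same job as paste plus as.formula. The helper name fit_by_name and the dummy data are assumptions made for illustration; they are not part of the original question.

```r
# Dummy data (an assumption; substitute your own train_df)
set.seed(1)
train_df <- data.frame(X0 = sample(100), X1 = sample(100))
train_df$extry <- rbinom(100, 1, (train_df$X0 + train_df$X1) / 200)

# fit_by_name() is a hypothetical helper: the predictor is passed as a string
fit_by_name <- function(var_name, data) {
  # reformulate("X0", response = "extry") builds the formula extry ~ X0
  form <- reformulate(var_name, response = "extry")
  glm(form, family = binomial("logit"), data = data)
}

model_x0 <- fit_by_name("X0", train_df)
formula(model_x0)
```

Passing the name as a string avoids non-standard evaluation entirely, at the cost of having to quote the variable in every call.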

I suggest instead that you consider using update. This gives you more flexibility to try different combinations of variables, or to change the training dataset, without breaking the function.

# Fit a base model once (intercept only here), then swap in predictors with update()
model <- glm(extry ~ 1, family = binomial("logit"), data = train_df)
new.model <- update(model, . ~ X0)


new.function <- function(model){
  pred.prob <- predict(model, test_df, type='response')
  predictFull <- prediction(pred.prob, test_df$extry)
  auc_ROCR <- performance(predictFull, measure = "auc")

  my_list <- list("AUC" =  auc_ROCR)
  return(my_list) 
}


les <- new.function(new.model)

The function can be further improved by passing test_df as a separate argument, so that you can evaluate the model on alternative testing data.
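A rough sketch of that improvement might look like the following. The function name evaluate_model, the outcome argument, and the dummy data are assumptions introduced here for illustration; the ROCR calls are the same ones used in the original function.

```r
library(ROCR)

# Dummy data (an assumption; substitute your own data frames)
set.seed(1)
train_df <- data.frame(X0 = sample(100), X1 = sample(100))
train_df$extry <- rbinom(100, 1, (train_df$X0 + train_df$X1) / 200)
test_df <- data.frame(X0 = sample(100), X1 = sample(100))
test_df$extry <- rbinom(100, 1, (test_df$X0 + test_df$X1) / 200)

# evaluate_model() is a hypothetical name; the test data is an explicit argument
evaluate_model <- function(model, test_data, outcome = "extry") {
  pred.prob <- predict(model, newdata = test_data, type = "response")
  predictFull <- prediction(pred.prob, test_data[[outcome]])
  auc_ROCR <- performance(predictFull, measure = "auc")
  list("AUC" = auc_ROCR)
}

# Fit once, then use update() to try another combination of variables
base.model <- glm(extry ~ X0, family = binomial("logit"), data = train_df)
two.var.model <- update(base.model, . ~ . + X1)

les_x0 <- evaluate_model(base.model, test_df)
les_both <- evaluate_model(two.var.model, test_df)
```

Because the model and the test data are both arguments, the same function can score any fitted glm against any data frame that contains the predictors and the outcome column.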

To run the function in the way you intended, you would need to use non-standard evaluation to capture the symbol and insert it into a formula. This can be done using match.call and as.formula. Here's a fully reproducible example using dummy data:

new.function <- function(a) {
  
  # Convert symbol to character
  a <- as.character(match.call()$a)
  
  # Build formula from character strings
  form <- as.formula(paste("extry", a, sep = "~"))
  
  model <- glm(form, family = binomial("logit"), data = train_df)
  pred.prob <- predict(model, test_df, type = 'response')
  predictFull <- ROCR::prediction(pred.prob, test_df$extry)
  auc_ROCR <- ROCR::performance(predictFull, "auc")

  list("AUC" =  auc_ROCR)
}

Now we can call the function in the way you intended:

new.function(X0)
#> $AUC
#> A performance instance
#>   'Area under the ROC curve'

new.function(X1)
#> $AUC
#> A performance instance
#>   'Area under the ROC curve'

If you want to see the actual area under the curve, you would need to do:

new.function(X0)$AUC@y.values[[1]]
#> [1] 0.6599759

So you may wish to modify your function so that the list contains auc_ROCR@y.values[[1]] rather than auc_ROCR.
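One possible way to combine that change with the original goal of storing the AUC for several parameters is sketched below. The helper auc_for and its string argument are assumptions made for illustration (a string is used so the helper can be mapped over a vector of names with sapply), and the dummy data is regenerated here only so the block runs on its own.

```r
library(ROCR)

# Dummy data (an assumption; substitute your own data frames)
set.seed(1)
train_df <- data.frame(X0 = sample(100), X1 = sample(100))
train_df$extry <- rbinom(100, 1, (train_df$X0 + train_df$X1) / 200)
test_df <- data.frame(X0 = sample(100), X1 = sample(100))
test_df$extry <- rbinom(100, 1, (test_df$X0 + test_df$X1) / 200)

# auc_for() is a hypothetical helper that returns the AUC as a plain number
auc_for <- function(var_name, train_data, test_data) {
  form <- as.formula(paste("extry", var_name, sep = "~"))
  model <- glm(form, family = binomial("logit"), data = train_data)
  pred.prob <- predict(model, test_data, type = "response")
  predictFull <- prediction(pred.prob, test_data$extry)
  # Extract the numeric value from the S4 performance object
  performance(predictFull, "auc")@y.values[[1]]
}

# One AUC per predictor, returned as a named numeric vector
aucs <- sapply(c("X0", "X1"), auc_for,
               train_data = train_df, test_data = test_df)
aucs
```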


Data used

set.seed(1)

train_df <- data.frame(X0 = sample(100), X1 = sample(100))
train_df$extry <- rbinom(100, 1, (train_df$X0 + train_df$X1)/200)

test_df  <- data.frame(X0 = sample(100), X1 = sample(100))
test_df$extry <- rbinom(100, 1, (test_df$X0 + test_df$X1)/200)

Created on 2022-06-29 by the reprex package (v2.0.1)
