Apply logistic regression in a function in R
I want to run logistic regression for multiple parameters and store different metrics, e.g. AUC. I wrote the function below, but I get an error when I call it:

Error in eval(predvars, data, env) : object 'X0' not found

This happens even though the variable exists in both my training and testing datasets. Any idea?
new.function <- function(a) {
  model = glm(extry~a,family=binomial("logit"),data = train_df)
  pred.prob <- predict(model,test_df, type='response')
  predictFull <- prediction(pred.prob, test_df$extry)
  auc_ROCR <- performance(predictFull, measure = "auc")
  my_list <- list("AUC" = auc_ROCR)
  return(my_list)
}
# Call the function new.function, supplying the column X0 as the argument.
les <- new.function(X0)
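The main reason your function doesn't work is that you are trying to pass an object into a formula. In extry ~ a, R first looks for a column named a in train_df; finding none, it evaluates the argument itself and searches for an object called X0 outside the data frame, where it does not exist. You can fix this by building the formula with paste and as.formula, but that is ultimately quite limiting.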
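For reference, the paste-based fix could look roughly like the sketch below. It assumes the predictor is passed as a character string ("X0") rather than a bare symbol; the helper name fit_one and the dummy data are invented here for illustration only.

```r
# Hypothetical helper (not from the original post): the predictor name
# arrives as a string, so it can be pasted into a formula.
fit_one <- function(var_name, data) {
  form <- as.formula(paste("extry ~", var_name))
  glm(form, family = binomial("logit"), data = data)
}

# Dummy data, for illustration only
set.seed(1)
train_df <- data.frame(X0 = sample(100), X1 = sample(100))
train_df$extry <- rbinom(100, 1, (train_df$X0 + train_df$X1) / 200)

m <- fit_one("X0", train_df)
names(coef(m))  # the fitted terms: intercept plus the chosen predictor
```

The limitation is that every variation (several predictors, interactions, a different response) needs more string handling inside the helper.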
I suggest instead that you consider using update. This gives you more flexibility to switch between combinations of variables, or to change the training dataset, without breaking the function.
library(ROCR)

# Fit a base model once (intercept only), then swap in predictors with update()
model <- glm(extry ~ 1, family = binomial("logit"), data = train_df)
new.model <- update(model, . ~ X0)

new.function <- function(model) {
  pred.prob <- predict(model, test_df, type = 'response')
  predictFull <- prediction(pred.prob, test_df$extry)
  auc_ROCR <- performance(predictFull, measure = "auc")
  my_list <- list("AUC" = auc_ROCR)
  return(my_list)
}

les <- new.function(new.model)
The function can be further improved by passing test_df as a separate argument, so that you can evaluate the model on alternative testing data.
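A minimal sketch of that improvement follows. It assumes the ROCR package is installed; the function name evaluate_auc, the response parameter, and the dummy data are assumptions made for this example, not part of the original code.

```r
# Sketch: the fitted model and the test data are both arguments.
# 'response' names the outcome column (an assumed parameter).
evaluate_auc <- function(model, test_data, response = "extry") {
  pred.prob <- predict(model, newdata = test_data, type = "response")
  pred_obj <- ROCR::prediction(pred.prob, test_data[[response]])
  auc <- ROCR::performance(pred_obj, measure = "auc")
  # Return the numeric AUC rather than the whole performance object
  list(AUC = auc@y.values[[1]])
}

# Dummy data, for illustration only
set.seed(1)
train_df <- data.frame(X0 = sample(100), X1 = sample(100))
train_df$extry <- rbinom(100, 1, (train_df$X0 + train_df$X1) / 200)
test_df <- data.frame(X0 = sample(100), X1 = sample(100))
test_df$extry <- rbinom(100, 1, (test_df$X0 + test_df$X1) / 200)

base_model <- glm(extry ~ X0, family = binomial("logit"), data = train_df)
model_x1 <- update(base_model, . ~ X1)

res_x0 <- evaluate_auc(base_model, test_df)
res_x1 <- evaluate_auc(model_x1, test_df)
```

Because nothing inside the function refers to a global data frame, the same call works for any held-out set that has the same columns.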
To run the function in the way you intended, you would need to use non-standard evaluation to capture the symbol and insert it into a formula. This can be done using match.call and as.formula. Here's a fully reproducible example using dummy data:
new.function <- function(a) {
  # Convert symbol to character
  a <- as.character(match.call()$a)
  # Build formula from character strings
  form <- as.formula(paste("extry", a, sep = "~"))
  model <- glm(form, family = binomial("logit"), data = train_df)
  pred.prob <- predict(model, test_df, type = 'response')
  predictFull <- ROCR::prediction(pred.prob, test_df$extry)
  auc_ROCR <- ROCR::performance(predictFull, "auc")
  list("AUC" = auc_ROCR)
}
Now we can call the function in the way you intended:
new.function(X0)
#> $AUC
#> A performance instance
#> 'Area under the ROC curve'
new.function(X1)
#> $AUC
#> A performance instance
#> 'Area under the ROC curve'
If you want to see the actual area under the curve, you would need to do:
new.function(X0)$AUC@y.values[[1]]
#> [1] 0.6599759
So you may wish to modify your function so that the list contains auc_ROCR@y.values[[1]] rather than auc_ROCR.
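Since the original goal was to store the metric for several parameters, one possible way to combine these ideas is sketched below. It takes predictor names as strings instead of bare symbols (a deliberate swap from the match.call approach, because strings are easier to loop over), assumes ROCR is installed, and uses a hypothetical helper name auc_for with dummy data invented for the example.

```r
# Hypothetical helper: fit on 'train', score on 'test', return a numeric AUC.
auc_for <- function(var_name, train, test) {
  form <- reformulate(var_name, response = "extry")
  model <- glm(form, family = binomial("logit"), data = train)
  prob <- predict(model, newdata = test, type = "response")
  pred <- ROCR::prediction(prob, test$extry)
  ROCR::performance(pred, "auc")@y.values[[1]]
}

# Dummy data, for illustration only
set.seed(1)
train_df <- data.frame(X0 = sample(100), X1 = sample(100))
train_df$extry <- rbinom(100, 1, (train_df$X0 + train_df$X1) / 200)
test_df <- data.frame(X0 = sample(100), X1 = sample(100))
test_df$extry <- rbinom(100, 1, (test_df$X0 + test_df$X1) / 200)

# One AUC per predictor, returned as a named numeric vector
aucs <- vapply(c("X0", "X1"), auc_for, numeric(1),
               train = train_df, test = test_df)
aucs
```

vapply names the result after the predictor strings, so the stored metrics stay labelled without extra bookkeeping.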
Data used
set.seed(1)
train_df <- data.frame(X0 = sample(100), X1 = sample(100))
train_df$extry <- rbinom(100, 1, (train_df$X0 + train_df$X1)/200)
test_df <- data.frame(X0 = sample(100), X1 = sample(100))
test_df$extry <- rbinom(100, 1, (test_df$X0 + test_df$X1)/200)
Created on 2022-06-29 by the reprex package (v2.0.1)