Error while using lapply() function to perform logistic regression (undefined columns selected)

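The data is taken from https://archive.ics.uci.edu/ml/machine-learning-databases/00497/divorce.rar

When I run the code to perform logistic regression, it throws an error, yet it runs perfectly in another R program. Is there anything I missed?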

set.seed(123)

divorce = read.csv("C://Users//User//Documents//Y2S3//Predictive Modelling//divorce//divorce.csv")

dim(divorce)

Result: [1] 170 1

summary(divorce)

Result: Atr1.Atr2.Atr3.Atr4.Atr5.Atr6.Atr7.Atr8.Atr9.Atr10.Atr11.Atr12.Atr13.Atr14.Atr15.Atr16.Atr17.Atr18.Atr19.Atr20.Atr21.Atr22.Atr23.Atr24.Atr25.Atr26.Atr27.Atr28.Atr29.Atr30.Atr31.Atr32.Atr33.Atr34.Atr35.Atr36.Atr37.Atr38.Atr39.Atr40.Atr41.Atr42.Atr43.Atr44.Atr45.Atr46.Atr47.Atr48.Atr49.Atr50.Atr51.Atr52.Atr53.Atr54.Class

Length:170

Class :character

Mode :character

colnames(divorce)

Result: [1] "Atr1.Atr2.Atr3.Atr4.Atr5.Atr6.Atr7.Atr8.Atr9.Atr10.Atr11.Atr12.Atr13.Atr14.Atr15.Atr16.Atr17.Atr18.Atr19.Atr20.Atr21.Atr22.Atr23.Atr24.Atr25.Atr26.Atr27.Atr28.Atr29.Atr30.Atr31.Atr32.Atr33.Atr34.Atr35.Atr36.Atr37.Atr38.Atr39.Atr40.Atr41.Atr42.Atr43.Atr44.Atr45.Atr46.Atr47.Atr48.Atr49.Atr50.Atr51.Atr52.Atr53.Atr54.Class"

sapply(divorce, class)

Result: Atr1.Atr2.Atr3.Atr4.Atr5.Atr6.Atr7.Atr8.Atr9.Atr10.Atr11.Atr12.Atr13.Atr14.Atr15.Atr16.Atr17.Atr18.Atr19.Atr20.Atr21.Atr22.Atr23.Atr24.Atr25.Atr26.Atr27.Atr28.Atr29.Atr30.Atr31.Atr32.Atr33.Atr34.Atr35.Atr36.Atr37.Atr38.Atr39.Atr40.Atr41.Atr42.Atr43.Atr44.Atr45.Atr46.Atr47.Atr48.Atr49.Atr50.Atr51.Atr52.Atr53.Atr54.Class "character"

col_fac = c("Atr1","Atr2","Atr3","Atr4","Atr5","Atr6","Atr7","Atr8","Atr9","Atr10",
            "Atr11","Atr12","Atr13","Atr14","Atr15","Atr16","Atr17","Atr18","Atr19","Atr20",
            "Atr21","Atr22","Atr23","Atr24","Atr25","Atr26","Atr27","Atr28","Atr29","Atr30",
            "Atr31","Atr32","Atr33","Atr34","Atr35","Atr36","Atr37","Atr38","Atr39","Atr40",
            "Atr41","Atr42","Atr43","Atr44","Atr45","Atr46","Atr47","Atr48","Atr49","Atr50",
            "Atr51","Atr52","Atr53","Atr54","Class")

divorce[col_fac] = lapply(divorce[col_fac], factor)

Result: Error in `[.data.frame`(divorce, col_fac) : undefined columns selected

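The only problem is that the file you are reading is delimited by ";" and not by ",". Because read.csv() splits on commas by default, every row is read as a single column, so none of the names in col_fac exist. Adding sep = ";" solves the problem.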

# downloaded and extracted from https://archive.ics.uci.edu/ml/machine-learning-databases/00497/divorce.rar
divorce <- read.csv("./divorce.csv", sep = ";")
dim(divorce)
summary(divorce)
colnames(divorce)
sapply(divorce,class)
col_fac = c("Atr1","Atr2","Atr3","Atr4","Atr5","Atr6","Atr7","Atr8","Atr9","Atr10",
            "Atr11","Atr12","Atr13","Atr14","Atr15","Atr16","Atr17","Atr18","Atr19","Atr20",
            "Atr21","Atr22","Atr23","Atr24","Atr25","Atr26","Atr27","Atr28","Atr29","Atr30", 
            "Atr31","Atr32","Atr33","Atr34","Atr35","Atr36","Atr37","Atr38","Atr39","Atr40", 
            "Atr41","Atr42","Atr43","Atr44","Atr45","Atr46","Atr47","Atr48","Atr49","Atr50", 
            "Atr51","Atr52","Atr53","Atr54","Class")

divorce[col_fac] = lapply(divorce[col_fac],factor)
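
As a quick check of the diagnosis above, the following minimal sketch reads the same text with and without sep = ";". The three-column sample in csv_text is made up for illustration and is not taken from the real divorce.csv; only the semicolon-delimited layout is assumed to match.

```r
# Hypothetical sample in the same semicolon-delimited layout (not real data)
csv_text <- "Atr1;Atr2;Class\n2;2;1\n4;4;1\n0;1;0"

# Default separator is ",": each line becomes one character column
wrong <- read.csv(text = csv_text)

# Correct separator: one column per field
right <- read.csv(text = csv_text, sep = ";")

dim(wrong)       # 3 rows, 1 column
colnames(wrong)  # the ";" in the header is turned into "." by name checking
dim(right)       # 3 rows, 3 columns
colnames(right)

# Selecting by name now works, so the factor conversion succeeds
right[c("Atr1", "Class")] <- lapply(right[c("Atr1", "Class")], factor)
```

A data frame with a single column whose name is all the expected names joined by dots, as in the question's dim() and colnames() output, is a typical sign of a wrong separator.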

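A less error-prone version using dplyr

The following will mutate your dataset by applying the function as.factor across those variables where the function is.numeric returns TRUE. Note that functions passed inside across and where are written without the usual parentheses.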

library(dplyr)
divorce <- read.csv("./divorce.csv", sep = ";") %>% 
  mutate(across(where(is.numeric), as.factor))
glimpse(divorce)

For details on using mutate with across, type ?across in the R console.
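
To illustrate what where(is.numeric) selects, here is a small sketch on a made-up data frame. The toy data and the label column are assumptions for illustration only, and the dplyr package is assumed to be installed. Only the numeric columns are converted; the character column is left unchanged.

```r
library(dplyr)

# Hypothetical toy data: two numeric columns and one character column
toy <- data.frame(
  Atr1  = c(2, 4, 0),
  Atr2  = c(2, 4, 1),
  label = c("a", "b", "c"),
  stringsAsFactors = FALSE
)

# as.factor is passed without parentheses; across() calls it on each selected column
toy_fac <- toy %>%
  mutate(across(where(is.numeric), as.factor))

sapply(toy_fac, class)
```

Because the selection is based on column type rather than a hand-typed list of names, there is no list of column names to mistype or to keep in sync with the file.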

