Copying factor levels from an existing data frame in R

Question

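I want to copy the factor levels from an existing data frame into a newly created data frame, rather than assigning the levels by hand.

To use predict, R requires the new data to be in a data frame whose factors match those of the model's training data. I would like to think the factors can simply be copied from the training data to the new data frame. I have got this working, as shown in the code below, but it is clunky.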

# Build the model
naive_model <- NaiveBayes(outcome ~ purpose_ + home_ + emp_len_, data = loan_data, na.action = na.omit)

# Create new data point to be tested
new_loan_frame <- data.frame(purpose_ = "small_business", home_ = "MORTGAGE", emp_len_ = "> 1 Year")

# Add the necessary factors to match the training data
new_loan_frame$purpose_ <- factor(new_loan_frame$purpose_, levels = c("credit_card","debt_consolidation", "home_improvement", "major_purchase", "medical","other","small_business"))
new_loan_frame$home_ <- factor(new_loan_frame$home_, levels = c("MORTGAGE", "OWN", "RENT"))
new_loan_frame$emp_len_ <- factor(new_loan_frame$emp_len_, levels = c("< 1 Year", "> 1 Year"))

# Run the prediction using the model and the new data
predict(naive_model, new_loan_frame)

Writing out the factor levels for every input seems more tedious than I expected. What is the best way to clean this up?

Answer 1

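You can automate the whole thing.

# Loop over the new frame's columns: loan_data also holds columns (such as outcome) that the new frame lacks
for (cn in colnames(new_loan_frame)) {
  new_loan_frame[, cn] <- factor(new_loan_frame[, cn], levels = levels(loan_data[, cn]))
}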
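The loop above can also be wrapped in a small helper. The following is only a sketch under stated assumptions: the toy `train` data frame stands in for the real `loan_data`, and `copy_factor_levels` is a hypothetical name, not a function from base R or any package. It copies levels only for columns that exist in both frames and are factors in the reference frame, and warns when a new value is not among the training levels (such a value would silently become NA).

```r
# Toy stand-in for the training data (hypothetical values, not the real loan_data)
train <- data.frame(
  purpose_ = factor(c("credit_card", "medical", "small_business")),
  home_    = factor(c("MORTGAGE", "OWN", "RENT")),
  outcome  = factor(c("paid", "default", "paid"))
)

# New observation created with plain character columns
new_obs <- data.frame(purpose_ = "small_business", home_ = "MORTGAGE",
                      stringsAsFactors = FALSE)

# Hypothetical helper: copy factor levels from `ref` onto the columns of `new`
# that also exist in `ref` and are factors there
copy_factor_levels <- function(new, ref) {
  shared <- intersect(names(new), names(ref))
  for (cn in shared) {
    if (is.factor(ref[[cn]])) {
      values <- as.character(new[[cn]])
      # Values outside the reference levels would turn into NA, so warn about them
      unseen <- setdiff(values[!is.na(values)], levels(ref[[cn]]))
      if (length(unseen) > 0) {
        warning("Values not in training levels for ", cn, ": ",
                paste(unseen, collapse = ", "))
      }
      new[[cn]] <- factor(values, levels = levels(ref[[cn]]))
    }
  }
  new
}

new_obs <- copy_factor_levels(new_obs, train)
levels(new_obs$purpose_)
```

Using `intersect()` keeps the helper from touching columns such as the outcome that appear only in the training data.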

Answer 2

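Hi, and welcome to Stack Overflow. You are right that, in order to predict, your data has to be well organized in a single data frame. Please try the following: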

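# One row per combination of the training levels; expand.grid() converts the
# character vectors to factors with the levels in the order given
new_loan_frame <- expand.grid(purpose_ = levels(loan_data$purpose_), home_ = levels(loan_data$home_), emp_len_ = levels(loan_data$emp_len_))

Preds1 <- predict(naive_model, newdata = new_loan_frame)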
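As a runnable illustration of building a frame that covers every level combination, here is a minimal sketch with base R's `expand.grid()`. The three level vectors are assumptions that stand in for `levels(loan_data$...)`; they are not the real training levels.

```r
# Hypothetical level sets standing in for the levels of the training columns
purpose_levels <- c("credit_card", "medical", "small_business")
home_levels    <- c("MORTGAGE", "OWN", "RENT")
emp_levels     <- c("< 1 Year", "> 1 Year")

# expand.grid() returns one row per combination; with the default
# stringsAsFactors = TRUE, character inputs become factors whose levels
# keep the order in which they were supplied
grid <- expand.grid(purpose_ = purpose_levels,
                    home_    = home_levels,
                    emp_len_ = emp_levels)

nrow(grid)             # number of combinations: 3 * 3 * 2
levels(grid$emp_len_)  # same order as emp_levels
```

Because each column already carries the full set of levels, no separate `factor()` call is needed before passing such a frame to `predict()`.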

Also, try not to use "_" in level names. Instead, you can simply use: , sep="_")

Good luck.

Answer 1
0 2019-07-18 07:24:17

Answer 2
-1 2019-07-18 07:02:44
