How to correctly convert factor to numeric in a new dataset

I convert a factor to numeric in my dataset as shown below:

library(plyr)  # revalue() is provided by plyr, not dplyr
df = data.frame(level= c( 'low', 'medium', 'high', 'very high'))

df$level = as.numeric(revalue(df$level, c('low' = 1, 'medium' =2, 'high'= 3, 'very high'=4)))
df

This works. The problem arises when I try to apply the same rule to a new dataset (I trained the model and want to predict on new data):

newdude = data.frame(level = c( 'high'))
newdude$level = as.numeric(revalue(newdude$level, c('low' = 1, 'medium' =2, 'high'= 3, 'very high'=4)))
Error
The following `from` values were not present in `x`: low, medium, very high 
> newdude
  level
1     1

I get 1 instead of 3. I cannot simply write, for example:

newdude$level = as.numeric(revalue(newdude$level, c( 'high'= 3)))

because I cannot know in advance which value it will take.

How can I fix this?

Try this instead:

newdude = data.frame(level = factor('high', levels = c('low', 'medium', 'high', 'very high')))

newdude$level
[1] high
Levels: low medium high very high
as.numeric(newdude$level)
[1] 3
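
This works because as.numeric() on a factor returns the position of each value in the factor's levels, so the result depends on the level order you declare and not on which values happen to appear in the data. One way to reuse the same rule for training and new data is to define the level order once and apply it everywhere. The sketch below assumes the order low < medium < high < very high from the question; the names level_order and level_to_numeric are hypothetical helpers, not part of any package.

```r
# Hypothetical shared definition: the level order used when the model was trained.
level_order <- c("low", "medium", "high", "very high")

# Hypothetical helper: convert labels to their position in level_order.
# Labels that are not in level_order become NA.
level_to_numeric <- function(x, levels = level_order) {
  as.numeric(factor(x, levels = levels))
}

# Training data: every level is present.
df <- data.frame(level = c("low", "medium", "high", "very high"))
df$level <- level_to_numeric(df$level)

# New data: only one level is present, but the mapping is unchanged.
newdude <- data.frame(level = "high")
newdude$level <- level_to_numeric(newdude$level)
newdude$level
# [1] 3
```

A label that was never seen during training comes back as NA, which is probably easier to catch before prediction than a silently wrong code.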
