I have a dataset that contains 79 explanatory variables, 43 of which are factors.
Some of the factor variables are just generic labels. For those I intend to use dummy variables as the numeric representation.
Another subset of the factor variables has ordered levels, for example:
BsmtQual: Evaluates the height of the basement
Ex Excellent (100+ inches)
Gd Good (90-99 inches)
TA Typical (80-89 inches)
Fa Fair (70-79 inches)
Po Poor (<70 inches)
NA No Basement
I want to convert each such factor variable to a numeric value that preserves the order of the levels from lowest to highest. After the operation I want to get something like:
BsmtQual: Evaluates the height of the basement
Ex records will be replaced with: 6
Gd records will be replaced with: 5
TA records will be replaced with: 4
Fa records will be replaced with: 3
Po records will be replaced with: 2
NA records will be replaced with: 1
(I am not sure whether I can replace NA with 0, since NA does not actually refer to missing data for this variable but to a record with the lowest basement score, i.e. no basement.)
How can I code this replacement?
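library(plyr)  # provides revalue()
req_var$ExterQual <- as.numeric(as.character(revalue(req_var$ExterQual, c("Ex"="5", "Gd"="4", "TA"="3", "Fa"="2", "Po"="1"))))
revalue() only relabels the levels, so the result is wrapped in as.numeric(as.character(...)) to get actual numbers rather than a factor. This does not handle NA in the dataset. If you want NA mapped to 0, add "NA"="0" to the replacement vector above; note that this only works when NA is stored as a literal "NA" level. If the values are true missing values, assign them separately after the conversion, for example with is.na().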
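As an alternative that needs only base R, here is a minimal sketch of the same ordered mapping. The vector bsmt_qual and the names quality_levels and bsmt_score are made up for illustration and are not part of the original data; the sketch assumes that "no basement" was read in as a true missing value and that scoring it as 0 is acceptable.

```r
# Hypothetical sample values standing in for the real BsmtQual column
bsmt_qual <- c("Ex", "Gd", "TA", "Fa", "Po", NA, "Gd")

# Levels listed from lowest to highest quality
quality_levels <- c("Po", "Fa", "TA", "Gd", "Ex")

# factor() with explicit levels fixes the order;
# as.integer() then returns each value's position in that order (Po = 1 ... Ex = 5)
bsmt_score <- as.integer(factor(bsmt_qual, levels = quality_levels))

# Assumption: a missing value means "no basement", scored below Po.
# Note that any value not listed in quality_levels also becomes NA and would get 0 here.
bsmt_score[is.na(bsmt_score)] <- 0L

print(bsmt_score)
```

Listing the levels explicitly avoids relying on R's default alphabetical level order, which would not match the quality ranking.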