I am very new to R.
I have the following dataset:
  age    sex    bmi children smoker    region   charges  sex_N
1  19 female 27.900        0    yes southwest 16884.924 female
2  18   male 33.770        1     no southeast  1725.552   male
3  28   male 33.000        3     no southeast  4449.462   male
4  33   male 22.705        0     no northwest 21984.471   male
5  32   male 28.880        0     no northwest  3866.855   male
6  31 female 25.740        0     no southeast  3756.622 female
I want to predict charges from the other columns. However, some of those columns are categorical. How do I change them to numeric variables?
I tried costs$sex_N <- as.factor(costs$sex), but that did not give me a numeric column, as you can see above.
Also, how do I convert columns that have more than two unique values? Please help!
Here are two base R options that may help:
> transform(
+ costs,
+ sex_N = as.integer(as.factor(sex_N))
+ )
  age    sex    bmi children smoker    region   charges sex_N
1  19 female 27.900        0    yes southwest 16884.924     1
2  18   male 33.770        1     no southeast  1725.552     2
3  28   male 33.000        3     no southeast  4449.462     2
4  33   male 22.705        0     no northwest 21984.471     2
5  32   male 28.880        0     no northwest  3866.855     2
6  31 female 25.740        0     no southeast  3756.622     1
or
> transform(
+ costs,
+ sex_N = match(sex_N, unique(sex_N))
+ )
  age    sex    bmi children smoker    region   charges sex_N
1  19 female 27.900        0    yes southwest 16884.924     1
2  18   male 33.770        1     no southeast  1725.552     2
3  28   male 33.000        3     no southeast  4449.462     2
4  33   male 22.705        0     no northwest 21984.471     2
5  32   male 28.880        0     no northwest  3866.855     2
6  31 female 25.740        0     no southeast  3756.622     1
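For columns with more than two categories, such as region, the same integer-code idea still works, but the codes imply an ordering that the categories may not have. A common alternative is one 0/1 indicator column per category. The sketch below is only an illustration: it rebuilds the six rows shown in the question as a stand-in for costs, and the names is_chr, region_code, dummies and fit are made up for this example.

```r
# Stand-in for the `costs` data frame, using the six rows from the question
costs <- data.frame(
  age      = c(19, 18, 28, 33, 32, 31),
  sex      = c("female", "male", "male", "male", "male", "female"),
  bmi      = c(27.900, 33.770, 33.000, 22.705, 28.880, 25.740),
  children = c(0, 1, 3, 0, 0, 0),
  smoker   = c("yes", "no", "no", "no", "no", "no"),
  region   = c("southwest", "southeast", "southeast",
               "northwest", "northwest", "southeast"),
  charges  = c(16884.924, 1725.552, 4449.462, 21984.471, 3866.855, 3756.622),
  stringsAsFactors = FALSE
)

# Convert every character column to a factor in one step
is_chr <- vapply(costs, is.character, logical(1))
costs[is_chr] <- lapply(costs[is_chr], factor)

# Integer codes follow the alphabetical order of the factor levels:
# northwest = 1, southeast = 2, southwest = 3
region_code <- as.integer(costs$region)

# Indicator (dummy) columns: one 0/1 column per level except the first,
# which acts as the reference level; [, -1] drops the intercept column
dummies <- model.matrix(~ region, data = costs)[, -1, drop = FALSE]

# lm() accepts factor columns directly and builds the same
# indicator columns internally, so no manual conversion is needed here
fit <- lm(charges ~ sex + region, data = costs)
```

If the goal is a regression with lm(), keeping the columns as factors is usually enough, and the manual conversion is only needed for functions that require a purely numeric matrix.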