Convert factor to date in R to create a dummy variable

I need to create a dummy variable for "before and after 04/11/2020" from the variable "date" in the dataset "counties". There are over a hundred dates in the dataset. I am trying to convert the dates from factor to date with the as.Date function, but I get NA. Could you please help me find where I am making an error? I have kept the other dummy variable I created in case it affects the overall outcome.

counties <- read.csv('C:/Users/matpo/Desktop/us-counties.csv')
str(counties)
as.Date(counties$date, format = '%m/%d/%y')
# create dummy variables for New York, New Jersey, California, and Illinois
counties$state = ifelse(counties$state == 'New Jersey' & 
               counties$state == 'New York'& counties$state == 'California' & 
               counties$state == 'Illinois', 1, 0)
counties$date = ifelse(counties$date >= "4/11/2020", 1, 0)

str output

 $ date  : logi  NA NA NA NA NA NA ...
 $ county: Factor w/ 1774 levels "Abbeville","Acadia",..: 1468 1468 1468 379 1468 1178 379 1468 979 942 ...
 $ state : num  0 0 0 0 0 0 0 0 0 0 ...
 $ fips  : int  53061 53061 53061 17031 53061 6059 17031 53061 4013 6037 ...
 $ cases : int  1 1 1 1 1 1 1 1 1 1 ...
 $ deaths: int  0 0 0 0 0 0 0 0 0 0 ...

Thank you!

  1. The format passed to as.Date is incorrect: use "%Y" for a four-digit year ("%y" matches a two-digit year).

  2. You need to assign the result back (<-) for the column to change.

  3. "4/11/2020" is just a string; to compare dates you need to convert it to a Date object. You can also avoid ifelse here, since the logical comparison converts directly to 0/1 with as.integer.
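
Points 2 and 3 can be checked on a small made-up vector. This is only a sketch: the values below are illustrative and are not taken from the dataset, and `d` is a hypothetical stand-in for `counties$date`.

```r
# Toy data standing in for counties$date (hypothetical values)
d <- factor(c("3/30/2020", "4/11/2020", "4/12/2020"))

# Comparing an unordered factor with >= is not meaningful:
# R issues a warning and returns NA for every element
suppressWarnings(d >= "4/11/2020")

# Without assignment the conversion is computed and then discarded
as.Date(d, format = "%m/%d/%Y")
class(d)   # still "factor"

# Assign the result, then compare against a Date object
d <- as.Date(d, format = "%m/%d/%Y")
d >= as.Date("2020-04-11")
```

The NA returned by the factor comparison is consistent with the `logi NA` column in the str output above.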

Try:

counties$date <- as.Date(counties$date, format = '%m/%d/%Y')
counties$dummy <- as.integer(counties$date >= as.Date('2020-04-11'))
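
The state dummy in the question has a separate problem that the points above do not cover: the four conditions are joined with `&`, so a row would have to match all four states at once, and the result is always 0 (as the str output shows). It also overwrites the original `state` column. Below is a minimal sketch of one possible fix using `%in%` and a new column. The toy data frame and the column name `state_dummy` are made up for illustration.

```r
# Toy stand-in for the counties data (hypothetical rows)
counties <- data.frame(
  date  = c("3/30/2020", "4/11/2020", "4/12/2020", "4/20/2020"),
  state = c("New York", "Washington", "California", "Texas"),
  stringsAsFactors = TRUE
)

# Date dummy, as in the two lines above
counties$date  <- as.Date(counties$date, format = "%m/%d/%Y")
counties$dummy <- as.integer(counties$date >= as.Date("2020-04-11"))

# State dummy: 1 if the state is any of the four, stored in a new column
# so the original state names are kept
target_states <- c("New Jersey", "New York", "California", "Illinois")
counties$state_dummy <- as.integer(counties$state %in% target_states)

counties
```

The format string has to match how the file actually stores dates. If `head(counties$date)` shows values like 2020-04-11 rather than 4/11/2020, `as.Date(counties$date)` with no format argument should be used instead.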

