I need to subset my data by a date range; the code is below.
I read in two .csv files (data2010, data2), change the date format to exclude the timestamp, rename the headers so they are the same for both files, then merge them (data2011).
The files do seem to merge, but when I subset by the date range, no observations are returned.
Also, the dates are grouped like 01/01/10 01/01/11 01/02/10 01/02/11, i.e. same month, same day, different year pairing.
data2010 <- read.csv(file="2010final.csv")
data2 <- read.csv(file="2011final.csv")
#change format of timestamp to date with mm/dd/yy for 2011
data2$newdate <- strptime(as.character(data2$Date), "%m/%d/%y")
data2$Date <- format(data2$newdate, "%m/%d/%y")
data2$newdate <- NULL
#rename and format 2010
names(data2010) <- c("Region", "District", "Age", "Gender", "Marital Status", "Date", "Reason")
data2010$newdate <- strptime(as.character(data2010$Date), "%m/%d/%y %H")
data2010$Date <- format(data2010$newdate, "%m/%d/%y")
data2010$newdate <- NULL
#merge
data2011 <- rbind(data2010, data2)
summary(data2011)
str(data2011)
#I see from the above commands that the files have merged
jan6Before <- subset(data2011, Date >= "12/22/10" & Date <= "01/06/11")
summary(jan6Before)
str(jan6Before)
#But this does not produce any observations
I suspect it's because your Date variable is a character, not a date, and it is being compared to another character constant, i.e. "12/22/10". I suggest you have a look at the lubridate package. You can then easily convert the character values (in this case month-day-year) before comparing, e.g. mdy(Date) >= mdy("12/22/10").
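To illustrate why the character comparison returns nothing, here is a minimal sketch using base R's as.Date in place of lubridate (a swapped-in equivalent, so it runs without extra packages). The data frame df and its values are made up for the example; only the two date bounds come from the question.

```r
# Toy data standing in for the merged data frame (values are made up)
df <- data.frame(
  Date = c("12/21/10", "12/25/10", "01/03/11", "01/07/11"),
  stringsAsFactors = FALSE
)

# Character comparison: the lower bound "12/22/10" sorts after the
# upper bound "01/06/11", so no string can satisfy both conditions
by_string <- subset(df, Date >= "12/22/10" & Date <= "01/06/11")

# Convert to Date class, then compare against Date bounds
df$Date <- as.Date(df$Date, format = "%m/%d/%y")
lower <- as.Date("12/22/10", format = "%m/%d/%y")
upper <- as.Date("01/06/11", format = "%m/%d/%y")
by_date <- subset(df, Date >= lower & Date <= upper)

nrow(by_string)  # 0 rows
nrow(by_date)    # 2 rows
```

The same character ordering probably explains the 01/01/10, 01/01/11 grouping seen in the question: strings sort by month and day first, with the year compared last.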
Include the newDate variable in the merged data, and use it for the subset.
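A rough sketch of that idea, assuming the parsed column is stored as a Date (via as.Date rather than strptime) so that it survives rbind cleanly. The data frames d2010 and d2011 and their values are hypothetical stand-ins for the two files.

```r
# Hypothetical stand-ins for the two files (values are made up)
d2010 <- data.frame(Date = c("12/20/10 09", "12/28/10 14"),
                    Reason = c("a", "b"),
                    stringsAsFactors = FALSE)
d2011 <- data.frame(Date = c("01/02/11", "01/09/11"),
                    Reason = c("c", "d"),
                    stringsAsFactors = FALSE)

# Parse into a Date column; trailing characters such as the hour are ignored
d2010$newDate <- as.Date(d2010$Date, format = "%m/%d/%y")
d2011$newDate <- as.Date(d2011$Date, format = "%m/%d/%y")

# Merge, then subset and sort on the Date column instead of the string
merged <- rbind(d2010, d2011)
window <- subset(merged,
                 newDate >= as.Date("2010-12-22") &
                   newDate <= as.Date("2011-01-06"))
window <- window[order(window$newDate), ]
```

Keeping the original Date string alongside newDate is optional; the point is that filtering and ordering happen on the Date-class column.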