
Subsetting data table in R by date

I need to subset my data by a date range; the code is below.

I read in two .csv files (data2010 and data2), change the date format to exclude the timestamp, rename the headers so they are the same for both files, then merge them into data2011.

The files seem to actually merge but when I subset by the date range, no observations are created.

However, the dates appear grouped like 01/01/10, 01/01/11, 01/02/10, 01/02/11, i.e. the same month and day paired across different years.

data2010 <- read.csv(file="2010final.csv")
data2 <- read.csv(file="2011final.csv") 


#change format of timestamp to date with mm/dd/yy for 2011
data2$newdate <- strptime(as.character(data2$Date), "%m/%d/%y")
data2$Date <- format(data2$newdate, "%m/%d/%y")
data2$newdate <- NULL

#rename and format 2010
names(data2010) <- c("Region", "District", "Age", "Gender", "Marital Status", "Date", "Reason")
data2010$newdate <-strptime(as.character(data2010$Date), "%m/%d/%y %H")
data2010$Date <- format(data2010$newdate, "%m/%d/%y")
data2010$newdate <- NULL

#merge
data2011 <- rbind(data2010, data2)

summary(data2011)
str(data2011) 
#I see from the above commands that the files have merged 

jan6Before <- subset(data2011, Date >= "12/22/10" & Date <= "01/06/11") 
summary(jan6Before)
str(jan6Before)
#But this does not produce any observations

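I suspect it's because your Date variable is a character, not a date, so it is being compared to another character constant, i.e. "12/22/10". Strings are compared character by character, and since "12/22/10" sorts after "01/06/11", no value can satisfy both conditions at once.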
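To make this concrete, here is a minimal sketch with made-up date strings (not the asker's data) showing that the character comparison never selects anything:

```r
# Made-up dates stored as character strings, in the same mm/dd/yy layout
dates <- c("12/25/10", "01/03/11", "06/15/10")

# Same condition as in the question: compared as text, not as dates
in_range <- dates >= "12/22/10" & dates <= "01/06/11"

# No element passes both tests, so a subset on this condition is empty
any(in_range)
```

The lower bound begins with "1" and the upper bound begins with "0", so as text the lower bound is larger than the upper bound and the range is empty.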

I suggest you have a look at the lubridate package. You can then easily convert the character values (in this case month-day-year) to dates before comparing, e.g. mdy(Date) >= mdy("12/22/10").
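A rough sketch of how that could look, assuming lubridate is installed. The small data frame below is a made-up stand-in for the merged data2011, not the real files:

```r
library(lubridate)

# Hypothetical stand-in for the merged data; values are for illustration only
data2011 <- data.frame(
  Date   = c("12/20/10", "12/25/10", "01/03/11", "01/10/11"),
  Reason = c("a", "b", "c", "d"),
  stringsAsFactors = FALSE
)

# Parse the strings as month/day/year dates before comparing
jan6Before <- subset(
  data2011,
  mdy(Date) >= mdy("12/22/10") & mdy(Date) <= mdy("01/06/11")
)

nrow(jan6Before)
```

Here only the rows dated 12/25/10 and 01/03/11 fall inside the range.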

Keep the newdate variable when you merge, and use it for subsetting.
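One way this might look in base R, as a sketch only. The two small data frames are made-up stand-ins for the CSV files, and wrapping strptime() in as.Date() is my assumption, so that the column is stored as a plain Date:

```r
# Hypothetical stand-ins for the two files (2010 has an hour, 2011 does not)
data2010 <- data.frame(Date = c("12/20/10 14", "12/25/10 09"),
                       stringsAsFactors = FALSE)
data2    <- data.frame(Date = c("01/03/11", "01/10/11"),
                       stringsAsFactors = FALSE)

# Keep the parsed date as its own column instead of setting it to NULL
data2010$newdate <- as.Date(strptime(data2010$Date, "%m/%d/%y %H"))
data2$newdate    <- as.Date(strptime(data2$Date, "%m/%d/%y"))

# Stack the two data frames; the column names already match
data2011 <- rbind(data2010, data2)

# Subset on the Date-class column rather than on the character column
jan6Before <- subset(
  data2011,
  newdate >= as.Date("2010-12-22") & newdate <= as.Date("2011-01-06")
)

nrow(jan6Before)
```

The Date column can still be reformatted for display; the point is only that the comparison runs on newdate.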
