
Warning message: the condition has length > 1 and only the first element will be used

Below is my R code. I use the dplyr package to arrange the data by ID and Date, and I try to create a new column SD with mutate(). The value of SD depends on several conditions, so I used if() and else if(), but this produces warning messages.

library(dplyr)

ID<-c("A01","A02","A03","A01","A01","A03","A02")
SA<-c(50,100,50,100,150,100,20)
a<-c("01/01/2012","01/01/2011","01/01/2012","01/01/2011","01/01/2013","01/01/2013","01/01/2012")
Date<-as.Date(a, format = "%d/%m/%Y")
df <- data.frame(ID,Date,SA)

start_date = as.Date("01/01/2012", format = "%d/%m/%Y")
end_date = as.Date("31/03/2012", format = "%d/%m/%Y")

df %>% 
  arrange(ID,Date) %>% 
  group_by(ID) %>% 
  mutate(start_date=start_date,
         end_date=end_date,
         period=as.numeric(end_date - start_date + 1),
         SD = if(Date <= start_date & Date + 365 >= end_date) {1} 
              else if(Date + 365 <= start_date | Date >= end_date) {0}
              else if(Date <= start_date & Date + 365 <= end_date) {(Date + 365 - start_date + 1)/period}
              else if(Date >= start_date & Date + 365 >= end_date) {(end_date - Date + 1)/period})

However, there are warning messages as below. How do I solve this?

"Warning messages:
1: In if (Date <= start_date & Date + 365 >= end_date) { :
  the condition has length > 1 and only the first element will be used
2: In if (Date + 365 <= start_date | Date >= end_date) { :
  the condition has length > 1 and only the first element will be used
3: In if (Date <= start_date & Date + 365 >= end_date) { :
  the condition has length > 1 and only the first element will be used
4: In if (Date + 365 <= start_date | Date >= end_date) { :
  the condition has length > 1 and only the first element will be used
5: In if (Date <= start_date & Date + 365 >= end_date) { :
  the condition has length > 1 and only the first element will be used"

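Here is a solution using ifelse(). The warning appears because if() expects a single TRUE or FALSE, whereas inside mutate() the condition is evaluated on the whole Date column; ifelse() is vectorised and tests every element.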

df %>% 
  arrange(ID,Date) %>% 
  group_by(ID) %>% 
  mutate(start_date=start_date,
         end_date=end_date,
         period=as.numeric(end_date - start_date + 1),
         SD = ifelse(Date <= start_date & Date + 365 >= end_date,
                     1, 
                     ifelse(Date + 365 <= start_date | Date >= end_date,
                            0, 
                            ifelse(Date <= start_date & Date + 365 <= end_date,
                                   (Date + 365 - start_date + 1)/period,
                                   (end_date - Date + 1)/period)))
  )

ifelse() takes three arguments: the condition, the value returned where the condition is TRUE, and the value returned where it is FALSE. You can nest ifelse() calls to check multiple conditions, as I have done here.
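As a minimal base-R sketch of the difference, the snippet below uses a made-up vector x (an assumption for illustration, not part of the question's data). It nests two ifelse() calls to classify each element, then shows one way to use if() safely by first reducing the condition to a single value.

```r
# Hypothetical toy vector, used only for illustration
x <- c(5, 15, 25)

# ifelse(test, yes, no) is vectorised: it returns one result per element of x
size <- ifelse(x < 10, "small",
               ifelse(x < 20, "medium", "large"))
print(size)
# "small" "medium" "large"

# if() needs a condition of length one, so collapse it first with any() or all()
if (any(x > 20)) {
  message("at least one value exceeds 20")
}
```

Note that in R 4.2.0 and later, passing a condition of length greater than one to if() is an error rather than a warning, so the original code would stop instead of silently using the first element.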

case_when() might be the more readable option, though.

A solution with case_when() (assuming that start_date is the minimum of Date and end_date is the maximum of Date within each ID group):

df %>% 
  arrange(ID,Date) %>% 
  group_by(ID) %>% 
  mutate(start_date = min(Date),
         end_date = max(Date),
         period = as.numeric(end_date - start_date + 1),
         SD = case_when(Date <= start_date & Date + 365 >= end_date ~ 1,
                        Date + 365 <= start_date | Date >= end_date ~ 0,
                        Date <= start_date & Date + 365 <= end_date ~ as.numeric((Date + 365 - start_date + 1) / period),
                        Date >= start_date & Date + 365 >= end_date ~ as.numeric((end_date - Date + 1) / period)))

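Note: your conditions do not cover every case, for example a Date strictly between start_date and end_date with Date + 365 < end_date. case_when() returns NA for rows that match no condition, so it is best to add a final TRUE ~ ... branch to handle such cases.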
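A minimal sketch of such a fallback branch, using a hypothetical toy data frame (the column x and the labels are made up for illustration, not taken from the question):

```r
library(dplyr)

# Hypothetical toy data, used only for illustration
toy <- data.frame(x = c(1, 5, 10))

toy <- toy %>%
  mutate(label = case_when(
    x <= 2 ~ "low",
    x >= 8 ~ "high",
    TRUE   ~ "other"   # fallback for rows matched by no earlier condition
  ))

print(toy$label)
# "low" "other" "high"
```

Without the TRUE ~ "other" line, the row with x = 5 would get NA, because case_when() evaluates the conditions in order and leaves unmatched rows missing.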

