
Fuzzy match based on a list of patterns in R

I need to generate a dummy variable based on a list of patterns.

df <- data.frame(
  med = c("sivastatin", "sisvatatin", "rusvastatin", "yes", "no", "don't remember", "true", "false", "omega 3", "atorvastatin", "no")
)

I need to create a second variable, a dummy that indicates whether or not the patient used any medication. I tried this:

yes <- c("yes", "vastatin", "true", "don't remember")

no <- c("no", "false")

df$med_cat <- ifelse(agrepl(yes, df$med, ignore.case = TRUE), 1, 
                  ifelse(agrepl(no, df$med, ignore.case = TRUE), 0, NA)) 

But I get a warning that only the first element of the pattern is used, followed by an error:

argument 'pattern' has length > 1 and only the first element will be used
Error in `$<-.data.frame`(`*tmp*`, med_cat, value = logical(0)) : replacement has 0 rows, data has 8381

Can someone help me with this?

SOLUTION:

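# agrepl() takes one pattern at a time: sapply() builds one logical column per pattern, any() collapses each row.
# df$med is the example column (written as med_cholstand in the original post); 0.1 is max.distance.
df$med_cat <- ifelse(apply(sapply(yes, agrepl, df$med, max.distance = 0.1), 1, any), 1,
                     ifelse(apply(sapply(no, agrepl, df$med, max.distance = 0.1), 1, any), 0, NA))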
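
For anyone who wants to run this end to end, here is a minimal sketch of the same idea on the sample data from the question. The helper name fuzzy_any is hypothetical (it is not from the original post), and passing ignore.case = TRUE is an assumption carried over from the first attempt. It loops over the patterns because agrepl() accepts only a single pattern, and it forces the result into a matrix so that a one-row input still works.

```r
# Sample data from the question.
df <- data.frame(
  med = c("sivastatin", "sisvatatin", "rusvastatin", "yes", "no",
          "don't remember", "true", "false", "omega 3", "atorvastatin", "no"),
  stringsAsFactors = FALSE
)

yes <- c("yes", "vastatin", "true", "don't remember")
no  <- c("no", "false")

# Hypothetical helper: TRUE where an element of x approximately matches
# at least one of the patterns.
fuzzy_any <- function(patterns, x, max.distance = 0.1) {
  # One logical column per pattern; agrepl() is called once per pattern.
  hits <- vapply(patterns, agrepl, logical(length(x)),
                 x = x, max.distance = max.distance, ignore.case = TRUE)
  # vapply() returns a plain vector when x has length 1, so force a matrix
  # with one row per element of x.
  hits <- matrix(hits, nrow = length(x))
  rowSums(hits) > 0
}

# 1 if any "yes" pattern matches, else 0 if any "no" pattern matches, else NA.
df$med_cat <- ifelse(fuzzy_any(yes, df$med), 1,
                     ifelse(fuzzy_any(no, df$med), 0, NA))
```

One thing to watch: a fractional max.distance is rounded up to a whole number of edits, so even a two-letter pattern such as "no" tolerates one edit and may match strings you did not intend (for example, "omega 3" may end up as 0). It is worth checking the result against the raw values, e.g. with table(df$med, df$med_cat, useNA = "ifany").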

Thank you all for your help!

