Regex in if else statement in R

I have a rather simple question. I am trying to get the if else statement below to work.

It is supposed to assign 1 if the condition is met and 0 otherwise. My problem is that I cannot get the regex in the if statement to work ('\\w*|\\W*'). The condition should be that the string is either "Registration Required" or "Registration Required" followed by any characters. I cannot list the exact cases, because whatever follows "Registration Required" (when something follows) is usually a date, which varies for each observation, plus a few words.

Registration_cleaned <- c()

for (i in 1:length(Registration)) {
  if (Registration[i] == ' Registration Required\\w*|\\W*') {
    Meta_Registration_cleaned <- 1
  } else {
    Meta_Registration_cleaned <- 0 
  }

  Registration_cleaned <- c(Registration_cleaned, Meta_Registration_cleaned)

}

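You can use transform together with ifelse to set Meta_Registration_cleaned. To match the regular expression, use the grepl function with the pattern "Registration Required\\w*". Since grepl matches anywhere in the string and \\w* also matches zero characters, the trailing \\w* does not change which strings match.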

Registration <- data.frame(reg = c("Registration Required", "Registration Required ddfdqf", "some str", "Regixxstration Required ddfdqf"), stringsAsFactors = FALSE)

transform(Registration, Meta_Registration_cleaned = ifelse(grepl("Registration Required\\w*", Registration[, "reg"]), 1, 0))

This gives the result:

                      reg Meta_Registration_cleaned
1          Registration Required                         1
2   Registration Required ddfdqf                         1
3                       some str                         0
4 Regixxstration Required ddfdqf                         0
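
For a plain character vector like the one in the question, the loop can be dropped altogether, because grepl is vectorized. The sketch below uses a made-up sample vector (the question's real data is not shown, so these values are an assumption). It also shows why the original `==` comparison never matched: `==` compares strings literally and does not treat the right-hand side as a pattern.

```r
# Hypothetical sample data; the question's real values are not shown.
Registration <- c("Registration Required",
                  "Registration Required 10/12/2000 in person",
                  "some str")

# `==` does a literal string comparison, so a regex on the right never matches.
literal <- Registration == "Registration Required\\w*"

# grepl() treats its first argument as a regular expression and is vectorized,
# so no loop is needed; as.integer() turns TRUE/FALSE into 1/0.
Registration_cleaned <- as.integer(grepl("Registration Required", Registration))

print(literal)
print(Registration_cleaned)
```

If the result is needed as a data frame column, the same expression can be assigned to a column instead of a standalone vector.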

I may have misunderstood the OP completely, because I read the question entirely differently from everyone else here.

My earlier comment suggested anchoring the regex at the end of the string.

Registration <- data.frame(reg = c("Registration Required", "Registration Required ddfdqf", "Registration Required 10/12/2000"), stringsAsFactors = FALSE)

# thanks @user1653941 for drafting the sample vector

Registration$Meta_Registration_cleaned <- grepl('Registration required$', Registration$reg, ignore.case = TRUE)

Registration

                               reg Meta_Registration_cleaned
1            Registration Required                      TRUE
2     Registration Required ddfdqf                     FALSE
3 Registration Required 10/12/2000                     FALSE

As I understand the OP, the condition is: either the string "Registration required" with nothing after it, or anything else. Looking forward to the OP's comment.
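
The two readings of the question differ only in how the pattern is anchored. As a rough sketch, reusing the sample vector from this answer (the variable names `anywhere`, `exact` and `prefix` are only illustrative), the variants can be compared side by side:

```r
# Sample vector taken from the answer above.
reg <- c("Registration Required",
         "Registration Required ddfdqf",
         "Registration Required 10/12/2000")

# Unanchored: TRUE wherever the phrase occurs anywhere in the string.
anywhere <- grepl("registration required", reg, ignore.case = TRUE)

# Anchored with ^ and $: TRUE only when the whole string is exactly the phrase.
exact <- grepl("^registration required$", reg, ignore.case = TRUE)

# Anchored at the start only: the phrase followed by anything (or nothing).
prefix <- grepl("^registration required", reg, ignore.case = TRUE)

print(data.frame(reg, anywhere, exact, prefix))
```

Which variant is right depends on whether rows with a trailing date should count as a match. The question's wording suggests they should, in which case the unanchored or start-anchored form applies.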
