Regex in if else statement in R

I have a rather simple question. I am trying to get the if else statement below to work.

It is supposed to assign 1 if the condition is met, and 0 otherwise. My problem is that I cannot get the regex in the if statement to work ('\\w*|\\W*'). It is supposed to express the condition that the string is either "Registration Required" or "Registration Required" followed by any characters. I cannot specify the exact cases, because whatever follows "Registration Required" (in the cases where something follows) is usually a date (varying for each observation) and a few words.

Registration_cleaned <- c()

for (i in 1:length(Registration)) {
  if (Registration[i] == ' Registration Required\\w*|\\W*') {
    Meta_Registration_cleaned <- 1
  } else {
    Meta_Registration_cleaned <- 0 
  }

  Registration_cleaned <- c(Registration_cleaned, Meta_Registration_cleaned)

}

You can use transform together with the ifelse function to set Meta_Registration_cleaned. To match the regular expression, the grepl function can be used with the pattern "Registration Required\\w*".

Registration <- data.frame(reg = c("Registration Required", "Registration Required ddfdqf", "some str", "Regixxstration Required ddfdqf"), stringsAsFactors = FALSE)

transform(Registration, Meta_Registration_cleaned = ifelse(grepl("Registration Required\\w*", Registration[, "reg"]), 1, 0))

This gives the result:

                             reg Meta_Registration_cleaned
1          Registration Required                         1
2   Registration Required ddfdqf                         1
3                       some str                         0
4 Regixxstration Required ddfdqf                         0
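As a side note on why the original loop never matches: `==` compares strings literally and does not interpret its right-hand side as a regex, whereas grepl does a regex search anywhere in the string, so the trailing `\\w*` is not strictly needed. A minimal sketch of a loop-free version, assuming Registration is a plain character vector as in the question (the sample values below are made up for illustration):

```r
# Hypothetical sample data; the real Registration vector is not shown in the question.
Registration <- c(" Registration Required",
                  " Registration Required 10/12/2000 and a few words",
                  "some str")

# `==` is a literal comparison, so the pattern text is never treated as a regex.
literal_match <- Registration == " Registration Required\\w*|\\W*"
literal_match
# [1] FALSE FALSE FALSE

# grepl() returns TRUE/FALSE per element; as.integer() converts that to 1/0.
Registration_cleaned <- as.integer(grepl("Registration Required", Registration))
Registration_cleaned
# [1] 1 1 0
```

Because grepl is vectorised, this replaces both the for loop and the repeated c() calls.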

I might have misunderstood the OP completely, because I understand the question entirely differently from everyone else here.

My earlier comment suggested anchoring the regex at the end of the string.

Registration <- data.frame(reg = c("Registration Required", "Registration Required ddfdqf", "Registration Required 10/12/2000"), stringsAsFactors = FALSE)

#thanks @user1653941 for drafting the sample vector

Registration$Meta_Registration_cleaned <- grepl('Registration required$', Registration$reg, ignore.case = TRUE)

Registration

                               reg Meta_Registration_cleaned
1            Registration Required                      TRUE
2     Registration Required ddfdqf                     FALSE
3 Registration Required 10/12/2000                     FALSE

I understand the OP to mean that the condition is: either the string "Registration required" with no characters following it, or... anything else. Looking forward to the OP's comment.
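The two readings discussed in this thread differ only in whether the pattern is anchored at the end of the string. A small sketch that puts them side by side, using the sample strings from the answers above as a plain character vector (the names reg, followed_by_anything and nothing_after are hypothetical, chosen for illustration):

```r
# Sample strings taken from the answers in this thread.
reg <- c("Registration Required",
         "Registration Required ddfdqf",
         "Registration Required 10/12/2000",
         "some str")

# Reading 1: the phrase, optionally followed by anything (unanchored search).
followed_by_anything <- grepl("Registration required", reg, ignore.case = TRUE)

# Reading 2: the phrase with nothing after it ($ anchors the match at the end).
nothing_after <- grepl("Registration required$", reg, ignore.case = TRUE)

# Show both results next to the input strings.
data.frame(reg, followed_by_anything, nothing_after)
```

Under the first reading the first three strings match; under the second only the first one does.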
