Regular expression negation in R
I am having trouble finding a way to implement negation in R regular expressions.
my_strings <- c("a non-rheumatic fever", "a nonrheumatic fever", "a rheumatic fever", "a not rheumatic fever")
my_strings
## [1] "a non-rheumatic fever" "a nonrheumatic fever" "a rheumatic fever" "a not rheumatic fever"
From the vector above, I'm trying to find a regular expression that will output just the following:
## [1] "a rheumatic fever"
I've tried the following, but I can't figure out how to negate the presence of "no(n|t)(\\s+|-)?" immediately preceding "rheumatic":
t_inc <- "\\b([^n][^o][^nt](\\s+|-)?(rheumatic))\\b"
grep(t_inc, my_strings, ignore.case = T, perl = T, value = T)
## character(0)
t_inc <- "\\b([^(no(n|t))](\\s+|-)?(rheumatic))\\b"
grep(t_inc, my_strings, ignore.case = T, perl = T, value = T)
## character(0)
Could someone please give me some pointers?
Maybe we can simplify the syntax by making use of invert, as mentioned by @IceCreamToucan in the comments:
grep("no[nt][- ]?rheumatic", my_strings, invert = TRUE, value = TRUE)
#[1] "a rheumatic fever"
The pattern matches 'no', followed by either the letter 'n' or 't', followed by an optional '-' or space, and then the word 'rheumatic'. With invert = TRUE, grep returns the elements that do not match the pattern.
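As a rough sketch of the same idea, the snippet below compares invert = TRUE with negating grepl(), and adds a PCRE negative lookbehind as one possible way to keep the negation inside the pattern itself. The lookbehind variant is an assumption on my part, not something from the answer above, and the variable names are only illustrative. PCRE lookbehinds must be fixed-length, so it only covers a single optional hyphen or space, not the \\s+ from the question.

```r
my_strings <- c("a non-rheumatic fever", "a nonrheumatic fever",
                "a rheumatic fever", "a not rheumatic fever")

# Pattern from the answer: "non" or "not", an optional hyphen or space,
# then "rheumatic".
negated <- "no[nt][- ]?rheumatic"

# 1. invert = TRUE keeps the elements that do NOT match the pattern.
via_invert <- grep(negated, my_strings, invert = TRUE, value = TRUE)

# 2. Same result by negating the logical vector returned by grepl().
via_grepl <- my_strings[!grepl(negated, my_strings)]

# 3. Assumption (not from the answer): negative lookbehinds under perl = TRUE.
#    One lookbehind per fixed-length prefix: "non"/"not" (3 characters) and
#    "non-"/"non "/"not-"/"not " (4 characters).
lookbehind <- "(?<!no[nt])(?<!no[nt][- ])rheumatic"
via_lookbehind <- grep(lookbehind, my_strings, perl = TRUE, value = TRUE)
```

One difference to keep in mind: the first two variants also keep strings that do not contain "rheumatic" at all, whereas the lookbehind variant only keeps strings where "rheumatic" is present and not preceded by "non" or "not".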