grepl in R: spurious match despite intra-word dash

Below is a minimal reproducible example:

v=c("\\<skill-saw\\>","\\<saw blade\\>")
text="xx placed his hand beneath skill-saw blade"
sapply(v,grepl,text)

The last command returns c(TRUE,TRUE), whereas I was expecting c(TRUE,FALSE). Any idea how to achieve that? The keyword "skill-saw" should be detected as present in the text, but the keyword "saw blade" should not, because "saw" occurs there only as the second half of the hyphenated word "skill-saw".

Thanks a lot in advance for your help!

You can try a regex lookbehind that requires a space before the keyword (lookbehinds need perl=TRUE):

v <- c('(?<= )\\bskill-saw\\b', '(?<= )\\bsaw blade\\b')
unname(sapply(v, grepl, text, perl=TRUE))
#[1]  TRUE FALSE
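
To make the reasoning behind the lookbehind explicit, here is a small sketch (not part of the original answer). It shows why the original pattern matched, and points out one limitation of `(?<= )`: it cannot match a keyword at the very start of the string. The `(?<!\\S)` variant is a suggested alternative, and the variable names are illustrative only.

```r
text <- "xx placed his hand beneath skill-saw blade"

# "-" is not a word character, so "\\<" finds a word start right after the dash.
original <- grepl("\\<saw blade\\>", text)

# Requiring a preceding space rejects that position (lookbehind needs perl = TRUE).
after_space <- grepl("(?<= )saw blade\\b", text, perl = TRUE)

# Caveat: "(?<= )" also rejects a keyword at the very start of the string.
at_start_space <- grepl("(?<= )skill-saw\\b", "skill-saw blade", perl = TRUE)

# "(?<!\\S)" means "not preceded by a non-space", which also holds at the start.
at_start_not_nonspace <- grepl("(?<!\\S)skill-saw\\b", "skill-saw blade", perl = TRUE)

print(c(original, after_space, at_start_space, at_start_not_nonspace))
#[1]  TRUE FALSE FALSE  TRUE
```

If keywords can appear at the beginning of the text, the `(?<!\\S)` form may be the safer choice.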

Update

Based on the new "text", this may work:

text1 <- "xx placed his hand beneath skill saw-blade"

v <- c('(?<= )\\bskill-saw\\b', '(?<= )\\bsaw-?blade\\b')
unname(sapply(v, grepl, text1, perl=TRUE))
#[1] FALSE  TRUE
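
If there are many keywords, hand-writing each pattern gets tedious. The sketch below wraps plain keywords in lookarounds that reject a match glued to a word character or a hyphen on either side. `keyword_in_text` is a hypothetical helper, not something from the original answer, and it assumes the keywords contain no regex metacharacters.

```r
# Hypothetical helper: TRUE for each keyword that occurs in `text` without a
# word character or hyphen directly before or after it.
# Assumes keywords contain no regex metacharacters (a literal "-" is fine).
keyword_in_text <- function(keywords, text) {
  patterns <- paste0("(?<![\\w-])", keywords, "(?![\\w-])")
  vapply(patterns, grepl, logical(1), x = text, perl = TRUE, USE.NAMES = FALSE)
}

res1 <- keyword_in_text(c("skill-saw", "saw blade"),
                        "xx placed his hand beneath skill-saw blade")
res2 <- keyword_in_text(c("skill-saw", "saw-blade"),
                        "xx placed his hand beneath skill saw-blade")
print(res1)
#[1]  TRUE FALSE
print(res2)
#[1] FALSE  TRUE
```

Unlike `(?<= )`, the negative lookbehind also allows a keyword at the start of the string, and the trailing lookahead guards against the mirror-image case, such as "skill-saw" inside "skill-saw-blade".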
