Grep for whole word that starts with X in R
I need to blank out certain words in various phrases, but because the words may be conjugated, plural, or possessive, I can only search for the first few letters. An example:
example = "You are the elephant's friend."
gsub("\\beleph.*\\b", " _____ " , example)
[1] "You are the _____ "
How can I match the entire word from the first few letters?
gsub("\\beleph[[:alpha:][:punct:]]+\\b", "_____" , example)
[1] "You are the _____ friend."
works in this instance.
The change replaces the greedy (and sometimes dangerous) ".*", which matches anything and everything, with the character class "[[:alpha:][:punct:]]+", which matches one or more alphabetical or punctuation characters. See
help(regex)
for additional ready-made character classes that may be useful, like [:alnum:] in case any strings contain digits as well.
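If several stems need blanking, the character-class pattern can be wrapped in a small function. This is only a sketch: the name blank_word and its arguments are made up for illustration, and it assumes the stem contains no regex metacharacters.

```r
# Hypothetical helper (not from the original answer): blank out any word
# that starts with `stem`, using the character-class pattern shown above.
# Assumption: `stem` is plain letters with no regex metacharacters.
blank_word <- function(text, stem, blank = "_____") {
  pattern <- paste0("\\b", stem, "[[:alpha:][:punct:]]+\\b")
  gsub(pattern, blank, text)
}

example <- "You are the elephant's friend."
blank_word(example, "eleph")
```

For the example sentence this builds the same pattern as the gsub() call above, so it should give the same result.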
To catch matches on the first word as well, the following should work. Here's an example.
exampleYoda = "elephant's friend you be."
gsub("(\\b|^)eleph[[:alpha:][:punct:]]+\\b", "_____" , exampleYoda)
[1] "_____ friend you be."
which also works with the original example:
gsub("(\\b|^)eleph[[:alpha:][:punct:]]+\\b", "_____" , example)
[1] "You are the _____ friend."
To make your original code work, you just have to make the quantifier non-greedy (lazy).
example = "You are the elephant's friend."
gsub("\\beleph.*?\\b", " _____ " , example)
[1] "You are the _____ 's friend."
This solution causes problems with the apostrophe, but you can match on whitespace instead, so you can try
example = "You are the elephant's friend."
gsub("\\seleph.*?\\s", " _____ " , example)
[1] "You are the _____ friend."
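One caveat with the whitespace version: it needs a whitespace character before the word, so a word at the very start of the string is not matched. The sketch below shows that, then tries one possible variant that also accepts the start of the string. The function name blank_ws is hypothetical, and perl = TRUE is an assumption made here; the answers above use the default regex engine.

```r
exampleYoda <- "elephant's friend you be."

# No whitespace precedes "eleph" here, so nothing matches and the
# string comes back unchanged.
unchanged <- gsub("\\seleph.*?\\s", " _____ ", exampleYoda)

# Hypothetical variant: accept start-of-string OR whitespace before the
# stem, keep whichever was matched via \\1, and consume the rest of the
# word with \\S* (any run of non-whitespace characters).
blank_ws <- function(text, stem, blank = "_____") {
  pattern <- paste0("(^|\\s)", stem, "\\S*")
  gsub(pattern, paste0("\\1", blank), text, perl = TRUE)
}

first_word <- blank_ws(exampleYoda, "eleph")
mid_word   <- blank_ws("You are the elephant's friend.", "eleph")
```

Because \\S* takes every non-whitespace character, punctuation attached to the word (a trailing period, say) is blanked along with it, so the character-class approach may suit sentence-final words better.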