
Grep for whole word that starts with X in R

I need to blank out certain words in various phrases, but because the words may be conjugated, plural, or possessive, I can only look for the first few letters. An example:

example = "You are the elephant's friend."
gsub("\\beleph.*\\b", " _____ " , example)
[1] "You are the  _____ "

How can I match the entire word from the first few letters?

gsub("\\beleph[[:alpha:][:punct:]]+\\b", "_____" , example)
[1] "You are the _____ friend."

works in this instance.

The change replaces the greedy (and sometimes dangerous) ".*", which matches anything and everything, with the character class "[[:alpha:][:punct:]]+", which matches only alphabetic and punctuation characters. See help(regex) for other ready-made character classes that may be useful, such as [:alnum:] in case any strings contain digits as well.
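How wide the character class is decides where the match stops. The sketch below compares the class from this answer with a narrower one; the narrower class "[[:alpha:]']" and the hyphenated test sentence are assumptions added for illustration, and perl = TRUE is set so that the behaviour is that of PCRE.

```r
hyphenated <- "An elephant-like creature."

# [:punct:] includes the hyphen, so the match runs on through "-like".
wide <- gsub("\\beleph[[:alpha:][:punct:]]+\\b", "_____", hyphenated, perl = TRUE)

# Letters plus the apostrophe only (an assumed alternative):
# the match stops at the hyphen.
narrow <- gsub("\\beleph[[:alpha:]']*", "_____", hyphenated, perl = TRUE)

print(wide)    # "An _____ creature."
print(narrow)  # "An _____-like creature."
```

Which of the two is preferable depends on whether hyphenated compounds should be blanked as a whole.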


To catch matches on the first word as well, the following should work. Here's an example.

exampleYoda = "elephant's friend you be."

gsub("(\\b|^)eleph[[:alpha:][:punct:]]+\\b", "_____" , exampleYoda)
[1] "_____ friend you be."

which also works with example:

gsub("(\\b|^)eleph[[:alpha:][:punct:]]+\\b", "_____" , example)
[1] "You are the _____ friend."
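If several word stems need blanking, the same idea can be wrapped in a small function. This is only a sketch: blank_words and its arguments are hypothetical names, the class "[[:alpha:]']" is an assumed variation on the one above, and the stems are pasted into the pattern unescaped, so they must not contain regex metacharacters. With perl = TRUE, "\\b" also matches at the start of the string when the first character is a letter, so no extra "^" alternative is used here.

```r
# Hypothetical helper: blank out every word that starts with one of the stems.
blank_words <- function(x, stems, blank = "_____") {
  # Build an alternation of the stems, e.g. "\\b(?:eleph|frien)[[:alpha:]']*"
  pattern <- paste0("\\b(?:", paste(stems, collapse = "|"), ")[[:alpha:]']*")
  # ignore.case = TRUE so that a capitalised first word is matched too
  gsub(pattern, blank, x, ignore.case = TRUE, perl = TRUE)
}

blank_words("Elephant's friend you be.", "eleph")
# "_____ friend you be."
blank_words("You are the elephant's friend.", c("eleph", "frien"))
# "You are the _____ _____."
```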

To make your original code work, you just have to make the quantifier non-greedy.

example = "You are the elephant's friend."
gsub("\\beleph.*?\\b", " _____ " , example)
[1] "You are the  _____ 's friend."

This solution causes problems with the ' but you can match on whitespace instead, so you can try

example = "You are the elephant's friend."
gsub("\\seleph.*?\\s", " _____ " , example)
[1] "You are the _____ friend."
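One caveat with the whitespace version: it needs whitespace on both sides of the word, so a word at the very start of the string, or one followed directly by sentence punctuation, is left alone. The sketch below shows this and one possible workaround using lookarounds; the test sentences and the lookaround pattern are assumptions added for illustration, and perl = TRUE is required for the lookarounds.

```r
first_word <- "elephant's friend you be."
last_word  <- "I like elephants."
pattern_ws <- "\\seleph.*?\\s"

# No whitespace before "elephant's" or after "elephants.", so nothing changes.
ws_first <- gsub(pattern_ws, " _____ ", first_word, perl = TRUE)
ws_last  <- gsub(pattern_ws, " _____ ", last_word, perl = TRUE)

# Assumed alternative: the word must not be preceded by a non-space character,
# and must be followed by optional punctuation and then whitespace or the end.
pattern_look <- "(?<!\\S)eleph\\S*?(?=[[:punct:]]*(\\s|$))"
look_first <- gsub(pattern_look, "_____", first_word, perl = TRUE)
look_last  <- gsub(pattern_look, "_____", last_word, perl = TRUE)

print(ws_first)    # "elephant's friend you be."
print(ws_last)     # "I like elephants."
print(look_first)  # "_____ friend you be."
print(look_last)   # "I like _____."
```

Because the lookarounds do not consume the surrounding whitespace, the replacement here is "_____" without the padding spaces.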
