Struggling to remove words based on a pattern (text analysis in R)

Question

I'm new to text analysis and have been struggling with a particular problem in R this past week. I am trying to figure out how to remove or replace all variations of a word in a character vector. For example, if the vector is:

test <- c("development", "develop", "developing", "developer", "apples", "kiwi")

I want the end output to be:

"apples", "kiwi"

So, basically, I'm trying to figure out how to remove or replace all words beginning with "develop" (that is, matching the pattern "^develop"). I have tried str_remove_all from the stringr package with this expression:

str_remove_all(test, "^dev")

But the end result was this:

"elopment", "elop", "eloping", "eloper", "apples", "kiwi"

It only removed the part of each word that matched the pattern "^dev", whereas I want to remove the entire word if it begins with "dev".

Thanks!

Answer 1

Filter(function(x) !any(grepl("develop", x)), test)

Answer 2

Use grep with invert:

grep("^develop", test, invert = TRUE, value = TRUE)
## [1] "apples" "kiwi"

or negate grepl:

ok <- !grepl("^develop", test)
test[ok]

or remove develop and then keep those elements that have not changed:

test[sub("^develop", "", test) == test]
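The approaches above drop whole elements of a vector. If the words instead sit inside a single sentence, which the question's "remove or replace" wording may also cover, a rough sketch is to match each whole word with a word-boundary pattern. The `sentence` input below is a made-up example, not from the question:

```r
# Hypothetical input: the words are part of one string, not separate elements.
sentence <- "the developer is developing apples and kiwi"

# \\b anchors at a word boundary, \\w* consumes the rest of the word,
# and \\s* also removes the whitespace that follows the removed word.
cleaned <- gsub("\\bdevelop\\w*\\s*", "", sentence, perl = TRUE)
cleaned
## [1] "the is apples and kiwi"
```

To replace rather than remove, put the replacement text in place of the empty string in the second argument.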

Answer 3

With stringr, you can do the following:

stringr::str_subset(test, "^dev", negate = TRUE)
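To see why subsetting is the right tool here rather than str_remove_all, here is a small sketch assuming stringr is installed. Extending the pattern so it covers the whole word does remove the text, but the matching elements stay in the vector as empty strings:

```r
library(stringr)

test <- c("development", "develop", "developing", "developer", "apples", "kiwi")

# Adding .* makes the match span the whole word, so the word is removed,
# but each matching element is left behind as an empty string.
blanked <- str_remove_all(test, "^dev.*")
blanked
## [1] ""       ""       ""       ""       "apples" "kiwi"

# Subsetting with a logical mask drops the elements themselves.
kept <- test[!str_detect(test, "^dev")]
kept
## [1] "apples" "kiwi"
```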

3 answers

Answer 1
1 2020-03-27 17:20:26

Answer 2
0 accepted 2020-03-27 15:05:41

Answer 3
0 2020-03-27 15:12:00
