R: Iterate over all elements of several columns to detect a phrase

Question

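I am trying to loop over several columns of a data frame that contains text files.

I want to check every entry in columns 7 through 16 to see whether any of the text files contains a certain phrase.

Each time the phrase is detected, I want to increase its count by 1.

This seems simple. I think I should loop over the columns and the rows, but I can't quite figure out how to do it.

Any suggestions? Thanks in advance for any insight.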

fc_count <- 0

for (col in profiles[7:16]){
  for (row in 1:nrow(profiles)){

    if(isTRUE(grepl("my name is jeff", row)) == TRUE){

      fc_count = fc_count + 1

    }

  }

}

fc_count

Answer 1

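We can loop over columns 7 to 16 with lapply and apply grepl with the pattern, which gives a list of logical vectors. Reduce with `+` adds those vectors element-wise into a single integer vector, and sum then gives the total count.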

sum(Reduce(`+`, lapply(profiles[7:16], grepl, pattern = "my name is jeff")))
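To make the intermediate results visible, here is a minimal sketch of the same pipeline split into steps. The toy `profiles` data frame and its three columns are assumptions for illustration only; in the real data the text sits in columns 7 to 16.

```r
# Hypothetical toy data (not from the question); real text is in columns 7:16.
profiles <- data.frame(
  a = c("hi, my name is jeff", "hello"),
  b = c("nothing here", "my name is jeff!"),
  c = c("my name is jeff", "my name is bob"),
  stringsAsFactors = FALSE
)

# Step 1: one logical vector per column (TRUE where the cell contains the phrase)
hits <- lapply(profiles[1:3], grepl, pattern = "my name is jeff")

# Step 2: add the logical vectors element-wise; TRUE counts as 1,
# so this is the number of matching cells in each row
per_row <- Reduce(`+`, hits)

# Step 3: total number of matching cells
total <- sum(per_row)
total
```

Note that this counts cells that contain the phrase at least once, not repeated occurrences inside a single cell.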

Since grepl is vectorized and works on a vector, the code becomes more compact if we convert the 'data.frame' to a matrix (a matrix is a vector with a dim attribute)

sum(grepl("my name is jeff", as.matrix(profiles[7:16])))
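A small sketch of why the matrix version works, again on an assumed toy `profiles` (hypothetical data, three text columns instead of columns 7 to 16): as.matrix turns the character columns into one character matrix, and grepl returns one logical value per cell.

```r
# Hypothetical toy data (not from the question).
profiles <- data.frame(
  a = c("hi, my name is jeff", "hello"),
  b = c("nothing here", "my name is jeff!"),
  c = c("my name is jeff", "my name is bob"),
  stringsAsFactors = FALSE
)

# All selected columns as a single character matrix (2 rows x 3 columns)
m <- as.matrix(profiles[1:3])

# grepl tests every cell of the matrix; the result has one flag per cell
flags <- grepl("my name is jeff", m)

# Summing the flags counts the matching cells
n_cells <- sum(flags)
n_cells
```

This assumes the selected columns are all character; with mixed column types as.matrix converts everything to character first, which may or may not be what you want.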

Also, with a for loop we don't need a nested loop, because grepl is vectorized

fc_count <- 0
for (prf in profiles[7:16]) {
  fc_count <- fc_count + sum(grepl("my name is jeff", prf))
}
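For comparison, the nested loop from the question can also be made to work. In that code, `row` is only the row number, so grepl was testing the index rather than the text. The sketch below tests the cell itself; the toy `profiles` is an assumption for illustration, with three text columns standing in for columns 7 to 16.

```r
# Hypothetical toy data (not from the question).
profiles <- data.frame(
  a = c("hi, my name is jeff", "hello"),
  b = c("nothing here", "my name is jeff!"),
  c = c("my name is jeff", "my name is bob"),
  stringsAsFactors = FALSE
)

fc_count <- 0
for (col in profiles[1:3]) {          # `col` is one whole column (a character vector)
  for (row in seq_along(col)) {       # `row` is just a position within that column
    if (grepl("my name is jeff", col[row])) {  # test the cell, not the index
      fc_count <- fc_count + 1
    }
  }
}
fc_count
```

The single-loop version above gives the same count with less code, since grepl already handles a whole column at once.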

Solution 1
1 Accepted 2020-03-27 19:14:51