R: loop through all elements of several columns to detect a phrase

Question

I am trying to loop through several columns in a data frame which contain text files.

I want to check every entry of columns 7 through 16 to see if any of the text files contain a certain phrase.

Each time the phrase is detected, I want to increase the count of times it appeared by 1.

This seems pretty straightforward. I think I should iterate through the columns and the rows, but I can't figure out exactly how to do this.

Any suggestions? Thank you in advance for any insight.

fc_count <- 0

for (col in profiles[7:16]){
  for (row in 1:nrow(profiles)){

    if(isTRUE(grepl("my name is jeff", row)) == TRUE){

      fc_count = fc_count + 1

    }

  }

}

fc_count

Answer 1

We can use lapply to loop over columns 7 to 16 and apply grepl with the pattern, which gives a list of logical vectors. Reduce with + collapses that list into a single integer vector (the number of matching columns in each row), and sum then gives the total.

sum(Reduce(`+`, lapply(profiles[7:16], grepl, pattern = "my name is jeff")))
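To make the intermediate steps visible, here is a minimal sketch on a toy data frame. The data and the names hits, per_row, and total are made up for illustration, and the toy profiles has only three columns, so the index is 1:3 rather than 7:16.

```r
# Hypothetical toy data standing in for columns 7 to 16 of 'profiles'
profiles <- data.frame(
  a = c("my name is jeff", "hello", "x"),
  b = c("hi, my name is jeff!", "my name is jeff", "y"),
  c = c("nothing", "nope", "my name is bob"),
  stringsAsFactors = FALSE
)

# Step 1: one logical vector per column (TRUE where the cell contains the phrase)
hits <- lapply(profiles[1:3], grepl, pattern = "my name is jeff")

# Step 2: add the logical vectors element-wise, giving matches per row
per_row <- Reduce(`+`, hits)
per_row

# Step 3: total number of matching cells
total <- sum(per_row)
total
```

If only the grand total is needed, the Reduce step could be replaced by sum(unlist(hits)), but keeping it exposes the per-row counts.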

Because grepl is vectorized over its input, converting the data.frame to a matrix (a matrix is a vector with a dim attribute) gives a more compact version:

sum(grepl("my name is jeff", as.matrix(profiles[7:16])))
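Note that grepl reports whether each cell matches, so a cell that contains the phrase twice still counts once. If every occurrence should be counted instead, one possible sketch uses gregexpr from base R. The toy data and the names phrase, cells, cell_count, and occurrences are assumptions for illustration; fixed = TRUE treats the phrase as a literal string rather than a regular expression.

```r
# Hypothetical toy data: the first cell contains the phrase twice
profiles <- data.frame(
  a = c("my name is jeff. my name is jeff", "hello"),
  b = c("my name is jeff", "bye"),
  stringsAsFactors = FALSE
)
phrase <- "my name is jeff"
cells <- as.vector(as.matrix(profiles))

# Number of cells that contain the phrase at least once
cell_count <- sum(grepl(phrase, cells, fixed = TRUE))

# Number of occurrences: gregexpr returns match positions per cell (-1 if none)
m <- gregexpr(phrase, cells, fixed = TRUE)
occurrences <- sum(vapply(m, function(pos) sum(pos > 0), integer(1)))

cell_count
occurrences
```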

With a for loop, the nested loop is not needed because grepl is vectorized; looping over the columns is enough:

fc_count <- 0
for (prf in profiles[7:16]) {
  fc_count <- fc_count + sum(grepl("my name is jeff", prf))
}
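For reference, the loop in the question always ends with a count of 0 because row is just an integer index, so grepl is testing the phrase against "1", "2", and so on, and col is never used. If the nested form is preferred, a minimal sketch that indexes into the column could look like this. The toy data is hypothetical, and its text columns are 2:3 rather than 7:16.

```r
# Hypothetical toy data: column 1 is an id, columns 2 and 3 hold text
profiles <- data.frame(
  id = 1:3,
  a = c("my name is jeff", "hello", "x"),
  b = c("well, my name is jeff", "my name is jeff", "bye"),
  stringsAsFactors = FALSE
)

fc_count <- 0
for (col in profiles[2:3]) {
  for (row in seq_along(col)) {
    # Test the cell itself (col[row]), not the row index
    if (isTRUE(grepl("my name is jeff", col[row]))) {
      fc_count <- fc_count + 1
    }
  }
}
fc_count

# The vectorized version should give the same total
vectorized <- sum(grepl("my name is jeff", as.matrix(profiles[2:3])))
vectorized
```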

Answer 1
1 accepted 2020-03-27 19:14:51
