Vectorization of a for-loop in R

I have two vectors:

  • a vector of texts: c('abc', 'asdf', 'werd', 'ffssd')
  • a vector of patterns: c('ab', 'd', 'w')

I want to vectorize the following for-loop:

count <- 0
for (p in patterns) {
    count <- count + str_count(texts, p)
}

I tried the following commands, but neither of them works.

> str_count(texts, patterns)
[1] 1 1 1 0
Warning message:
In stri_count_regex(string, pattern, opts_regex = attr(pattern,  :
  longer object length is not a multiple of shorter object length

> str_count(texts, t(patterns))
[1] 1 1 1 0
Warning message:
In stri_count_regex(string, pattern, opts_regex = attr(pattern,  :
  longer object length is not a multiple of shorter object length

I want a two-dimensional matrix like this:

       |  patterns
 ------+--------
       |   1 0 0
 texts |   0 1 0
       |   0 1 1
       |   0 1 0

You can use outer. I assume you are using str_count from the stringr package.

library(stringr)

texts <- c('abc', 'asdf', 'werd', 'ffssd')
patterns <- c('ab', 'd', 'w')

matches <- outer(texts, patterns, str_count)

# set dim names
colnames(matches) <- patterns
rownames(matches) <- texts
matches
      ab d w
abc    1 0 0
asdf   0 1 0
werd   0 1 1
ffssd  0 1 0

Edit

# or set names directly within 'outer' as noted by @RichardScriven
outer(setNames(nm = texts), setNames(nm = patterns), str_count)
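
A note on why this works, as a rough sketch rather than a description of outer's exact internals: str_count(texts, patterns) on its own pairs the elements positionally and recycles the shorter vector, which is why the attempts in the question return four values plus a warning. outer instead builds every (text, pattern) pair and calls the function once on the two expanded vectors, so it needs a function that is vectorized over both arguments, which str_count is. The names pairs_texts, pairs_patterns and by_hand below are only illustrative. Summing the rows then gives the count vector that the loop in the question accumulates.

```r
library(stringr)

texts <- c('abc', 'asdf', 'werd', 'ffssd')
patterns <- c('ab', 'd', 'w')

matches <- outer(texts, patterns, str_count)

# Roughly what outer() does: build every (text, pattern) pair,
# call the vectorized function once, then reshape into a matrix.
pairs_texts    <- rep(texts, times = length(patterns))
pairs_patterns <- rep(patterns, each = length(texts))
by_hand <- matrix(str_count(pairs_texts, pairs_patterns),
                  nrow = length(texts))

# Both routes give the same counts
stopifnot(all(matches == by_hand))

# Total number of pattern matches per text, i.e. the loop's 'count'
count <- rowSums(matches)
```

If you only need the totals and not the full matrix, rowSums(outer(texts, patterns, str_count)) is a one-line replacement for the loop.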

Using dplyr and tidyr (and stringr):

library(dplyr)
library(tidyr)
library(stringr)
# stringsAsFactors = FALSE keeps both columns as character for str_count
expand.grid(texts, patterns, stringsAsFactors = FALSE) %>%
   mutate(matches = str_count(Var1, Var2)) %>%
   spread(Var2, matches)
   Var1 ab d w
1   abc  1 0 0
2  asdf  0 1 0
3 ffssd  0 1 0
4  werd  0 1 1
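
For newer versions of tidyr, where spread has been superseded by pivot_wider, the same idea could be written roughly as below. This is a sketch that assumes tidyr 1.0.0 or later is installed; the column names text and pattern and the variable result are my own choices, not anything from the question.

```r
library(dplyr)
library(tidyr)
library(stringr)

texts <- c('abc', 'asdf', 'werd', 'ffssd')
patterns <- c('ab', 'd', 'w')

# expand_grid() builds all (text, pattern) combinations as character columns,
# so no factor-to-character conversion is needed.
result <- expand_grid(text = texts, pattern = patterns) %>%
  mutate(matches = str_count(text, pattern)) %>%
  pivot_wider(names_from = pattern, values_from = matches)

result
```

The result is a tibble with one row per text and one column per pattern, rather than a matrix; if you need a matrix, the outer approach is more direct.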
