Vectorization of a for-loop in R
I have two vectors:
texts <- c('abc', 'asdf', 'werd', 'ffssd')
patterns <- c('ab', 'd', 'w')
I want to vectorize the following for loop:
count <- 0
for (p in seq_along(patterns)) {
  # add the per-text match counts for the p-th pattern
  count <- count + str_count(texts, patterns[p])
}
I tried the following commands, but neither works.
> str_count(texts, patterns)
[1] 1 1 1 0
Warning message:
In stri_count_regex(string, pattern, opts_regex = attr(pattern, :
longer object length is not a multiple of shorter object length
> str_count(texts, t(patterns))
[1] 1 1 1 0
Warning message:
In stri_count_regex(string, pattern, opts_regex = attr(pattern, :
longer object length is not a multiple of shorter object length
I would like a 2-D matrix like this:
      | patterns
------+---------
      | 1 0 0
texts | 0 1 0
      | 0 1 1
      | 0 1 0
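As background on why both attempts print 1 1 1 0: str_count() is vectorised over string and pattern in parallel, so the shorter vector is recycled and each text is compared with only one pattern. The sketch below makes that pairing explicit. rep_len() and the names recycled and paired are used here only for illustration and are not part of the original code.

```r
library(stringr)

texts <- c('abc', 'asdf', 'werd', 'ffssd')
patterns <- c('ab', 'd', 'w')

# Recycling stretches patterns to the length of texts: 'ab', 'd', 'w', 'ab'
recycled <- rep_len(patterns, length(texts))

# Element i of texts is compared only with element i of the recycled patterns,
# so this yields one count per text, not one count per (text, pattern) pair
paired <- str_count(texts, recycled)
```

The warning in the question appears because 4 is not a multiple of 3. Wrapping patterns in t() does not help, since a 1 x 3 matrix is still handled as a length-3 vector.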
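You can use outer(). I assume you are using str_count() from the stringr package. outer() passes every (text, pattern) combination to the function in a single vectorized call and shapes the result into a length(texts) by length(patterns) matrix.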
library(stringr)
texts <- c('abc', 'asdf', 'werd', 'ffssd')
patterns <- c('ab', 'd', 'w')
matches <- outer(texts, patterns, str_count)
# set dim names
colnames(matches) <- patterns
rownames(matches) <- texts
matches
      ab d w
abc    1 0 0
asdf   0 1 0
werd   0 1 1
ffssd  0 1 0
Edit
# or set names directly within 'outer' as noted by @RichardScriven
outer(setNames(nm = texts), setNames(nm = patterns), str_count)
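To see why outer() works here when the direct call did not, it may help to reproduce its expansion by hand. This is a rough sketch of the idea, not the actual source of outer(); the names x, y, by_hand and totals are made up for the illustration.

```r
library(stringr)

texts <- c('abc', 'asdf', 'werd', 'ffssd')
patterns <- c('ab', 'd', 'w')

# Expand both vectors so that every text is paired with every pattern
x <- rep(texts, times = length(patterns))   # texts repeated as a block, once per pattern
y <- rep(patterns, each = length(texts))    # each pattern repeated once per text

# One vectorized call over all 12 pairs, filled column by column into a 4 x 3 matrix
by_hand <- matrix(str_count(x, y), nrow = length(texts))

# The per-text totals that the original loop was accumulating
totals <- rowSums(by_hand)
```

Because the function receives two equal-length vectors, it has to be vectorized over both arguments, which str_count() is.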
Using dplyr and tidyr (and stringr):
library(dplyr)
library(tidyr)
library(stringr)
# stringsAsFactors = FALSE keeps both columns as character vectors
expand.grid(texts, patterns, stringsAsFactors = FALSE) %>%
  mutate(matches = str_count(Var1, Var2)) %>%
  spread(Var2, matches)
   Var1 ab d w
1   abc  1 0 0
2  asdf  0 1 0
3 ffssd  0 1 0
4  werd  0 1 1
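In tidyr 1.0.0 and later, spread() is superseded by pivot_wider(). A possible equivalent is sketched below, assuming a recent tidyr; the column names text and pattern and the objects long and wide are my own choices, not from the answer above.

```r
library(dplyr)
library(tidyr)
library(stringr)

texts <- c('abc', 'asdf', 'werd', 'ffssd')
patterns <- c('ab', 'd', 'w')

# Long format: one row per (text, pattern) pair with its match count
long <- expand.grid(text = texts, pattern = patterns,
                    stringsAsFactors = FALSE) %>%
  mutate(matches = str_count(text, pattern))

# Wide format: one row per text, one column per pattern
wide <- pivot_wider(long, names_from = pattern, values_from = matches)
```

The result is a tibble rather than a base data frame, and the row order can differ from the spread() output shown above.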