I want to count the frequency of certain terms in every window of 10 words in a vector of single words:
An example is:
mywords <- sample(c("POS", "NNTD", "DD", "HG", "KKL"), 10000, replace = TRUE)
mywords <- data.frame(mywords)
names(mywords) <- c("TheTerms")
I want to get the frequency of each term in every block of 10 terms. I imagine this can be done in dplyr:
mywords %>% group_by(TheTerms) %>% summarise(n = n())
but how do I get this done every 10 words?
Here is an idea:
library(dplyr)
mywords %>%
  group_by(grp = rep(seq(n() / 10), each = 10)) %>%  # label each block of 10 rows with a group id
  count(TheTerms)
which gives,
# A tibble: 4,500 x 3
# Groups:   grp [1,000]
     grp TheTerms     n
   <int>   <fctr> <int>
 1     1       DD     3
 2     1       HG     4
 3     1      POS     3
 4     2       DD     1
 5     2       HG     1
 6     2      KKL     3
 7     2     NNTD     4
 8     2      POS     1
 9     3       HG     1
10     3      KKL     3
# ... with 4,490 more rows
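Note that `rep(seq(n()/10), each = 10)` assumes the number of rows is an exact multiple of 10. A sketch of a variant using `ceiling(row_number() / 10)` that also handles a ragged final window (the seed and size of 95 are my own, for illustration):

```r
library(dplyr)

set.seed(42)
mywords <- data.frame(TheTerms = sample(c("POS", "NNTD", "DD", "HG", "KKL"),
                                        95, replace = TRUE))

# ceiling(row_number()/10) labels rows 1-10 as group 1, 11-20 as group 2, ...
# and still works when the row count is not a multiple of 10:
# the last window is simply shorter.
res <- mywords %>%
  group_by(grp = ceiling(row_number() / 10)) %>%
  count(TheTerms)
```

With 95 rows this produces 10 windows, the last covering only rows 91-95.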
Another option is data.table
library(data.table)
setDT(mywords)[, .N, by = .(TheTerms, grp = as.integer(gl(nrow(mywords), 10, nrow(mywords))))]
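A quick sanity check on the data.table approach (a sketch; the seed and the smaller sample of 100 are my own): the per-term counts within each window of 10 should add back up to 10.

```r
library(data.table)

set.seed(1)
mywords <- data.frame(TheTerms = sample(c("POS", "NNTD", "DD", "HG", "KKL"),
                                        100, replace = TRUE))

# gl(nrow, 10, nrow) repeats each group label 10 times, then truncates
# to nrow, giving window ids 1, 1, ..., 2, 2, ... for blocks of 10 rows
res <- setDT(mywords)[, .N, by = .(TheTerms,
                                   grp = as.integer(gl(nrow(mywords), 10, nrow(mywords))))]

# every window's counts must sum to the window size of 10
stopifnot(all(res[, sum(N), by = grp]$V1 == 10))
```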
In base R, you could use table, like this:
table(rep(seq_along(mywords$TheTerms), each=10, length.out=nrow(mywords)), mywords$TheTerms)
DD HG KKL NNTD POS
1 2 0 2 2 4
2 3 2 4 0 1
3 3 1 1 3 2
4 4 3 1 1 1
5 0 6 3 1 0
6 1 2 1 3 3
7 2 3 1 2 2
8 4 2 1 1 2
9 2 1 4 1 2
10 3 1 2 2 2
I switched the sample size to 100 for display purposes.
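One difference between the answers: `count()` and `.N` drop term/window combinations with zero occurrences, while `table` keeps them. If you want the long format with explicit zeros, `as.data.frame()` on the table works in base R (a sketch; the seed and sample size of 100 are my own):

```r
set.seed(7)
mywords <- data.frame(TheTerms = sample(c("POS", "NNTD", "DD", "HG", "KKL"),
                                        100, replace = TRUE))

grp <- rep(seq_along(mywords$TheTerms), each = 10, length.out = nrow(mywords))
tab <- table(grp, mywords$TheTerms)

# as.data.frame() on a contingency table yields long format,
# retaining the zero counts that count()/.N would drop
long <- as.data.frame(tab, responseName = "n")
```

Here `long` has 10 windows x 5 terms = 50 rows, zeros included.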