Bin dataframe by row number within groups in R

Question

My data consist of word lists from different texts (the group variable), and I'm trying to bin the dataframe within each group by row count, so that every block of 2000 rows in a text gets its own bin number.

My data look like this:

index   text   word
1       H6     mællte
2       H6     fleiru
...
66265   H6     han
1       DG8    Son
2       DG8    hins
3       DG8    var
...
2001    DG8    faer
2002    DG8    hælga

I would like it to look like this:

index   text   word     bin
1       H6     mællte   1
2       H6     fleiru   1
...
66265   H6     han      34
1       DG8    Son      1
2       DG8    hins     1
3       DG8    var      1
...
2001    DG8    faer     2
2002    DG8    hælga    2

Answer 1

We can use rep() with dplyr:

library(dplyr)

df %>%
  group_by(text) %>%
  mutate(bin = rep(1:ceiling(n()/2000), each = 2000, length.out = n()))

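rep() with each = 2000 repeats every bin number 2000 times, and length.out = n() truncates the result to the number of rows in the group. So if n() is not divisible by 2000, the last bin value is repeated only up to the final row of that group.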
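As a cross-check on the idea above, here is a minimal sketch of an equivalent approach that computes each row's position within its group and uses integer division. This is not the code from the answer: the toy data frame, the `bin_size` of 3 (instead of 2000, to keep the output short), and the helper variable `pos` are all assumptions made for illustration, and it uses base R only.

```r
# Hypothetical toy data: 7 rows for H6 and 4 rows for DG8
df <- data.frame(
  text = rep(c("H6", "DG8"), times = c(7, 4)),
  word = paste0("w", 1:11)
)

# Assumed bin size for the demo; the question uses 2000
bin_size <- 3

# Position of each row within its text (1, 2, 3, ... restarting per group)
pos <- ave(seq_len(nrow(df)), df$text, FUN = seq_along)

# Rows 1..bin_size fall in bin 1, the next bin_size rows in bin 2, and so on
df$bin <- (pos - 1) %/% bin_size + 1

print(df)
```

Inside the dplyr pipeline the same idea would be mutate(bin = (row_number() - 1) %/% 2000 + 1) after group_by(text). Both approaches should give the same bins; the integer-division form just avoids having to compute the number of bins up front.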
