
Randomizing rows in a matrix but keeping groups together in R

I am trying to randomize the rows of a rather large matrix, but I need to keep rows that share the same value in a particular column together.

For example:

# Table A
Column A       Column B
     0.1              1
     0.6              1
     1.5              1
      23              2
      18              2
     0.5              2
     0.6              3
      19              3
     0.7              3

My goal is to randomize by group, in this example by Column B. I have tried sample.int(nrow(x)), which works fine for shuffling the whole matrix, but is there a way to do this by group?
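For reproducibility, here is a minimal sketch of the table above as a data frame, together with the whole-table shuffle already tried. The object name mydf and the column names Column_A and Column_B are assumptions, chosen to match the names used in the answers; the seed is arbitrary.

```r
# Rebuild Table A (mydf, Column_A and Column_B are assumed names)
mydf <- data.frame(
  Column_A = c(0.1, 0.6, 1.5, 23, 18, 0.5, 0.6, 19, 0.7),
  Column_B = rep(1:3, each = 3)
)

set.seed(1)  # arbitrary seed, only to make the shuffle repeatable

# Shuffling all rows at once keeps each row intact,
# but it ignores the groups in Column_B
shuffled <- mydf[sample.int(nrow(mydf)), ]
```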

A very straightforward approach would be to use "data.table", like this:

> library(data.table)
> as.data.table(mydf)[, .(Column_A = sample(Column_A)), by = Column_B]
   Column_B Column_A
1:        1      0.6
2:        1      1.5
3:        1      0.1
4:        2     23.0
5:        2     18.0
6:        2      0.5
7:        3      0.6
8:        3      0.7
9:        3     19.0

Or, more generally:

# shuffle whole rows within each group (.N is the group's row count)
as.data.table(mydf)[, .SD[sample(.N)], by = Column_B]

Similarly, with "dplyr":

library(dplyr)

mydf %>%
  group_by(Column_B) %>%
  mutate(Column_A = sample(Column_A))

Without converting to a data.frame/data.table and without external packages, you could use ?ave combined with ?sample:

mymat[ave(seq_along(mymat[, "Col_A"]), mymat[, "Col_B"], FUN = sample),]

Sample data:

set.seed(123)
mymat <- cbind(Col_A = rnorm(9), Col_B = rep(1:3, each = 3))
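One caveat with the ave/sample approach: sample(x) treats a single number x >= 1 as the range 1:x, so a group containing exactly one row may not behave as intended. A minimal sketch of a variant that sidesteps this is shown below; the helper name shuffle_within and the extra one-row group are illustrative assumptions, not part of the original answer.

```r
set.seed(123)
# Same layout as the sample data, plus a group (4) that has a single row
mymat <- cbind(Col_A = rnorm(10), Col_B = c(rep(1:3, each = 3), 4))

# Hypothetical helper: permute a vector of row indices by position,
# which stays correct even when the vector has length 1
shuffle_within <- function(i) i[sample.int(length(i))]

# For every row position, pick a row index from the same group
idx <- ave(seq_len(nrow(mymat)), mymat[, "Col_B"], FUN = shuffle_within)

# drop = FALSE keeps the result a matrix
shuffled <- mymat[idx, , drop = FALSE]
```

Because each index is only swapped with indices from its own group, the Col_B column of shuffled is identical to that of mymat, while Col_A is permuted within each group.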
