I am trying to randomize the rows of a rather large matrix, but I need rows that share the same value in a particular column to stay together.
For example:
# Table A
Column A Column B
0.1 1
0.6 1
1.5 1
23 2
18 2
0.5 2
0.6 3
19 3
0.7 3
My goal is to randomize by group, in this example by Column B. I have tried sample.int(nrow(x)), which worked fine to randomize the whole matrix, but is there a way to do this by group?
A very straightforward approach would be to use "data.table", like this:
> library(data.table)
> as.data.table(mydf)[, .(Column_A = sample(Column_A)), by = Column_B]
Column_B Column_A
1: 1 0.6
2: 1 1.5
3: 1 0.1
4: 2 23.0
5: 2 18.0
6: 2 0.5
7: 3 0.6
8: 3 0.7
9: 3 19.0
Or, more generally, to shuffle whole rows within each group:
as.data.table(mydf)[, .SD[sample(.N)], by = Column_B]
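The post never defines mydf, so here is a minimal sketch of a within-group row shuffle on a data frame reconstructed from Table A. The reconstruction, the seed, and the name shuffled are assumptions made for illustration. The final check only confirms that the group labels stay where they were and that each group keeps the same set of values.

```r
library(data.table)

# Assumed reconstruction of Table A; the post itself never defines `mydf`.
mydf <- data.frame(
  Column_A = c(0.1, 0.6, 1.5, 23, 18, 0.5, 0.6, 19, 0.7),
  Column_B = rep(1:3, each = 3)
)

set.seed(42)  # arbitrary seed, only for reproducibility

# Reorder the rows of each group; .N is the number of rows in the current group.
shuffled <- as.data.table(mydf)[, .SD[sample(.N)], by = Column_B]

# Group labels are unchanged, and each group holds the same values as before.
stopifnot(identical(shuffled$Column_B, mydf$Column_B))
stopifnot(identical(
  lapply(split(shuffled$Column_A, shuffled$Column_B), sort),
  lapply(split(mydf$Column_A, mydf$Column_B), sort)
))
```

Indexing .SD with sample(.N) permutes rows, so it should carry over unchanged to tables with more than one non-grouping column.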
Similarly, with "dplyr":
library(dplyr)
mydf %>%
  group_by(Column_B) %>%
  mutate(Column_A = sample(Column_A))
Without converting to a data.frame/data.table and without external packages, you could use ?ave combined with ?sample:
mymat[ave(seq_along(mymat[, "Col_A"]), mymat[, "Col_B"], FUN = sample),]
Sample data:
set.seed(123)
mymat <- cbind(Col_A = rnorm(9), Col_B = rep(1:3, each = 3))
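One caveat with passing sample directly to ave: when sample() receives a single number i >= 1, it draws from 1:i rather than returning i, so a group with only one row could be mapped to the wrong row index. A hedged base-R sketch that sidesteps this by permuting positions explicitly is shown below. The helper name shuffle_rows_by_group and the extra one-row group are hypothetical, added only to exercise that edge case.

```r
# Hypothetical helper, not part of the original answers.
shuffle_rows_by_group <- function(m, group) {
  idx <- seq_len(nrow(m))
  # Permute the row indices of each group by position, so a one-row group
  # always maps to itself (sample(i) for a single i >= 1 draws from 1:i).
  perm <- ave(idx, group, FUN = function(i) i[sample.int(length(i))])
  m[perm, , drop = FALSE]
}

set.seed(123)
mymat <- cbind(Col_A = rnorm(9), Col_B = rep(1:3, each = 3))
# Extra row forming a group of size one, added only to test the edge case.
mymat <- rbind(mymat, c(99, 4))

res <- shuffle_rows_by_group(mymat, mymat[, "Col_B"])

# The grouping column is unchanged and the single-row group stays put.
stopifnot(identical(res[, "Col_B"], mymat[, "Col_B"]))
stopifnot(res[10, "Col_A"] == 99)
```

Because ave() writes each group's result back to that group's own positions, the rows of a group do not need to be contiguous for this to work.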