
How to combine different dataframes randomly by rows in R

I have the dataframes Relaxed.swimming , Intense.swimming , Resting and Burst . They share the number of columns (4 columns) but they differ in the number of rows. As an example:

Relaxed.swimming <- data.frame(Behaviour= "Relaxed.swimming",
                               disurge=c(0.015,0.908,0.345,0.489),
                               diheave=c(0.398,0.782,0.198,0.634),
                               disway=c(0.491,0.398,0.189,0.592))

Intense.swimming <- data.frame(Behaviour= "Intense.swimming",
                               disurge=c(0.015,0.908,0.345),
                               diheave=c(0.398,0.782,0.198),
                               disway=c(0.491,0.398,0.189))


Burst <- data.frame(Behaviour= "Burst",
                    disurge=c(0.015,0.908),
                    diheave=c(0.398,0.782),
                    disway=c(0.491,0.398))

Resting <- data.frame(Behaviour= "Resting",
                      disurge=c(0.015,0.908,0.345),
                      diheave=c(0.398,0.782,0.198),
                      disway=c(0.491,0.398,0.189))

I just want to combine them by rows (keeping the 4 columns). I want to combine them hundreds or thousands of times, and in random order, so that the order changes constantly (i.e. rbind(Relaxed.swimming, Intense.swimming, Resting, Burst, Resting, Intense.swimming, Relaxed.swimming, Resting, etc.)). Although the order is random, I want to keep the proportions: the four dataframes should be replicated approximately the same number of times. The ratios don't have to be exactly 1:1:1:1, but they should be close.

I would like to get something like this:

> df
          Behaviour disurge diheave disway
1           Resting   0.015   0.398  0.491
2           Resting   0.908   0.782  0.398
3           Resting   0.345   0.198  0.189
4             Burst   0.015   0.398  0.491
5             Burst   0.908   0.782  0.398
6  Intense.swimming   0.015   0.398  0.491
7  Intense.swimming   0.908   0.782  0.398
8  Intense.swimming   0.345   0.198  0.189
9  Relaxed.swimming   0.015   0.398  0.491
10 Relaxed.swimming   0.908   0.782  0.398
11 Relaxed.swimming   0.345   0.198  0.189
12 Relaxed.swimming   0.489   0.634  0.592
13            Burst   0.015   0.398  0.491
14            Burst   0.908   0.782  0.398
15 Relaxed.swimming   0.015   0.398  0.491
16 Relaxed.swimming   0.908   0.782  0.398
17 Relaxed.swimming   0.345   0.198  0.189
18 Relaxed.swimming   0.489   0.634  0.592
.          .            .       .      .
.          .            .       .      .
.          .            .       .      .

How can I get a large data frame obtained from the random replication of the 4 mentioned dataframes?

Does anyone know how to do it?

Thanks in advance

If the proportions need not be 100% identical then this dplyr solution should work:

First row-bind the four dataframes together:

library(dplyr)
All <- rbind(Relaxed.swimming, Intense.swimming, Burst, Resting)

Then draw a random sample of rows of any size, with replacement. Note that this samples individual rows rather than whole dataframes, so each behaviour's share of the sample approximates its share of the rows in All:

All_s <- All %>% sample_n(1000, replace = TRUE)

All_s[1:10,]
          Behaviour disurge diheave disway
1  Intense.swimming   0.015   0.398  0.491
2           Resting   0.345   0.198  0.189
3             Burst   0.345   0.198  0.189
4  Relaxed.swimming   0.345   0.198  0.189
5  Intense.swimming   0.489   0.634  0.592
6             Burst   0.345   0.198  0.189
7  Relaxed.swimming   0.345   0.198  0.189
8           Resting   0.489   0.634  0.592
9           Resting   0.015   0.398  0.491
10 Intense.swimming   0.241   0.241  0.241 
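
To see how close the proportions actually come out, the sample can be tabulated by Behaviour. The following is a minimal base R sketch, not part of the original answer. It assumes the four dataframes from the question, and the seed and the weight vector w are illustrative choices. Plain row sampling follows the row counts (4:3:2:3 here), whereas weighting each row by the inverse of its dataframe's row count pulls the shares toward 1:1:1:1.

```r
# Data frames from the question
Relaxed.swimming <- data.frame(Behaviour = "Relaxed.swimming",
                               disurge = c(0.015, 0.908, 0.345, 0.489),
                               diheave = c(0.398, 0.782, 0.198, 0.634),
                               disway  = c(0.491, 0.398, 0.189, 0.592))
Intense.swimming <- data.frame(Behaviour = "Intense.swimming",
                               disurge = c(0.015, 0.908, 0.345),
                               diheave = c(0.398, 0.782, 0.198),
                               disway  = c(0.491, 0.398, 0.189))
Burst <- data.frame(Behaviour = "Burst",
                    disurge = c(0.015, 0.908),
                    diheave = c(0.398, 0.782),
                    disway  = c(0.491, 0.398))
Resting <- data.frame(Behaviour = "Resting",
                      disurge = c(0.015, 0.908, 0.345),
                      diheave = c(0.398, 0.782, 0.198),
                      disway  = c(0.491, 0.398, 0.189))

All <- rbind(Relaxed.swimming, Intense.swimming, Burst, Resting)
behaviour <- as.character(All$Behaviour)

set.seed(42)  # arbitrary seed, only so the sketch is reproducible

# Plain row sampling: shares follow the number of rows per behaviour
idx_plain <- sample(nrow(All), 1000, replace = TRUE)
prop_plain <- prop.table(table(behaviour[idx_plain]))

# Weighted row sampling (assumption: equal shares per behaviour are wanted).
# Each row gets weight 1 / (rows in its data frame), so every behaviour
# carries the same total weight.
rows_per_behaviour <- table(behaviour)
w <- 1 / as.numeric(rows_per_behaviour[behaviour])
idx_weighted <- sample(nrow(All), 1000, replace = TRUE, prob = w)
All_w <- All[idx_weighted, ]
prop_weighted <- prop.table(table(All_w$Behaviour))

print(round(prop_plain, 2))
print(round(prop_weighted, 2))
```

Both variants shuffle individual rows, so the dataframes do not stay together as blocks the way they do in the desired output of the question.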

Try this:

library(tidyverse)
df_list <- list(Relaxed.swimming, Intense.swimming, Burst, Resting)

# Draw 10 dataframes from the list with replacement, then stack them
sample(df_list, size = 10, replace = TRUE) %>% bind_rows()
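
The same idea can be sketched in base R by sampling list indices instead of the list itself, which also makes it easy to count how often each dataframe was drawn. This is an illustrative variant, not the answerer's code. The names picks and draws and the seed are assumptions introduced here.

```r
# Data frames from the question
Relaxed.swimming <- data.frame(Behaviour = "Relaxed.swimming",
                               disurge = c(0.015, 0.908, 0.345, 0.489),
                               diheave = c(0.398, 0.782, 0.198, 0.634),
                               disway  = c(0.491, 0.398, 0.189, 0.592))
Intense.swimming <- data.frame(Behaviour = "Intense.swimming",
                               disurge = c(0.015, 0.908, 0.345),
                               diheave = c(0.398, 0.782, 0.198),
                               disway  = c(0.491, 0.398, 0.189))
Burst <- data.frame(Behaviour = "Burst",
                    disurge = c(0.015, 0.908),
                    diheave = c(0.398, 0.782),
                    disway  = c(0.491, 0.398))
Resting <- data.frame(Behaviour = "Resting",
                      disurge = c(0.015, 0.908, 0.345),
                      diheave = c(0.398, 0.782, 0.198),
                      disway  = c(0.491, 0.398, 0.189))

df_list <- list(Relaxed.swimming, Intense.swimming, Burst, Resting)

set.seed(1)  # arbitrary seed, only so the sketch is reproducible

# Draw 10 list positions with replacement; each position is one whole data frame
picks <- sample(seq_along(df_list), size = 10, replace = TRUE)

# Stack the chosen data frames in the drawn order; rows within each stay together
combined <- do.call(rbind, df_list[picks])
rownames(combined) <- NULL

# How many times each of the four data frames was drawn
draws <- tabulate(picks, nbins = length(df_list))
print(draws)
```

Because whole dataframes are drawn independently, the counts in draws are only roughly equal, and they even out as size grows.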

The answers so far might not do as much shuffling as the question asked for. From the desired output in the question, it seems the final result should be shuffled a bit more. For example, the dataframe Burst has three rows, but in the example output there are only two rows with Burst next to each other. This function replicates the list of dataframes n times, combines each copy in a random order, and then optionally shuffles the rows one more time.

random_replicate <- function(list_of_dataframes, n = 2, extra_shuffle = TRUE){
  n_frames <- length(list_of_dataframes)
  # n times: put the dataframes in a random order and stack them
  replicated <- replicate(n, do.call(rbind, sample(list_of_dataframes, n_frames)), simplify = FALSE)
  combined <- do.call(rbind, replicated)
  # optionally shuffle the individual rows as well
  if (extra_shuffle) combined <- combined[sample.int(nrow(combined)),]
  return(combined)
}
list_of_dataframes <- list(Relaxed.swimming, Intense.swimming, Burst, Resting)

random_replicate(list_of_dataframes, 2)
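
If the dataframes should stay intact as blocks, as they do in the desired output of the question, one option is to shuffle only the order of the blocks. The sketch below uses a hypothetical helper, shuffle_blocks, that is not part of the answer above. It assumes the four dataframes from the question and repeats each one exactly n times, so the ratio is exactly 1:1:1:1.

```r
# Data frames from the question
Relaxed.swimming <- data.frame(Behaviour = "Relaxed.swimming",
                               disurge = c(0.015, 0.908, 0.345, 0.489),
                               diheave = c(0.398, 0.782, 0.198, 0.634),
                               disway  = c(0.491, 0.398, 0.189, 0.592))
Intense.swimming <- data.frame(Behaviour = "Intense.swimming",
                               disurge = c(0.015, 0.908, 0.345),
                               diheave = c(0.398, 0.782, 0.198),
                               disway  = c(0.491, 0.398, 0.189))
Burst <- data.frame(Behaviour = "Burst",
                    disurge = c(0.015, 0.908),
                    diheave = c(0.398, 0.782),
                    disway  = c(0.491, 0.398))
Resting <- data.frame(Behaviour = "Resting",
                      disurge = c(0.015, 0.908, 0.345),
                      diheave = c(0.398, 0.782, 0.198),
                      disway  = c(0.491, 0.398, 0.189))

# Hypothetical helper: every data frame appears exactly n times,
# in one globally shuffled order, with its rows kept together
shuffle_blocks <- function(list_of_dataframes, n = 2) {
  # positions 1..k, each repeated n times, then permuted
  block_order <- sample(rep(seq_along(list_of_dataframes), times = n))
  combined <- do.call(rbind, list_of_dataframes[block_order])
  rownames(combined) <- NULL
  combined
}

set.seed(7)  # arbitrary seed, only so the sketch is reproducible
blocks <- shuffle_blocks(list(Relaxed.swimming, Intense.swimming, Burst, Resting), n = 3)
head(blocks, 10)
```

Unlike random_replicate, which permutes the dataframes separately within each of the n rounds, this permutes all n copies at once, so the same dataframe can appear several times in a row.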
