R - create new data frames by random selection of existing data frames

Question

I'm an R beginner and I have a small problem. I want to create new data frames by randomly selecting from existing data frames.

I have 4 categories, each divided into 10 data frames, and I want to create 10 new data frames, each containing 1 data frame from each category.

For example, these are my dataframes:

Cat_1_Data_1   Cat_2_Data_1   Cat_3_Data_1   Cat_4_Data_1
Cat_1_Data_2   Cat_2_Data_2   Cat_3_Data_2   Cat_4_Data_2
Cat_1_Data_3   Cat_2_Data_3   Cat_3_Data_3   Cat_4_Data_3
Cat_1_Data_4   Cat_2_Data_4   Cat_3_Data_4   Cat_4_Data_4
Cat_1_Data_5   Cat_2_Data_5   Cat_3_Data_5   Cat_4_Data_5
Cat_1_Data_6   Cat_2_Data_6   Cat_3_Data_6   Cat_4_Data_6
Cat_1_Data_7   Cat_2_Data_7   Cat_3_Data_7   Cat_4_Data_7
Cat_1_Data_8   Cat_2_Data_8   Cat_3_Data_8   Cat_4_Data_8
Cat_1_Data_9   Cat_2_Data_9   Cat_3_Data_9   Cat_4_Data_9
Cat_1_Data_10  Cat_2_Data_10  Cat_3_Data_10  Cat_4_Data_10

Creating the new data frames (this is how I currently do it):

new_data_1 <- rbind(Cat_1_Data_1, Cat_2_Data_1, Cat_3_Data_1, Cat_4_Data_1)
...
new_data_10 <- rbind(Cat_1_Data_10, Cat_2_Data_10, Cat_3_Data_10, Cat_4_Data_10)

But I want a random pick of the data sets, something like this (pseudocode):

new_data_1 <- rbind(Cat_1_Data_[Random 1-10], Cat_2_Data_[Random 1-10], ... and so on)
...
new_data_10 <- rbind(Cat_1_Data_[Random 1-10], Cat_2_Data_[Random 1-10], ... and so on)

Is there a way to do this? I don't really know how to approach the problem :(

Answer 1

Here is one sampling strategy that would work.

Create lists of your data.frames, one per category, shuffling them as you go:

dflist.cat1 <- sample(list(Cat_1_Data_1, Cat_1_Data_2, ...))
dflist.cat2 <- sample(list(Cat_2_Data_1, Cat_2_Data_2, ...))
...
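
As a minimal sketch of this step, assuming the data frames follow the Cat_<k>_Data_<i> naming from the question, mget() can collect them by name instead of typing each one out. The toy data frames, their category/dataset/value columns, and the seed are illustrative assumptions, not part of the original data.

```r
set.seed(42)  # assumption: fixed seed only so the shuffle is reproducible

# Hypothetical toy data: 10 small data frames standing in for category 1.
for (i in 1:10) {
  assign(paste0("Cat_1_Data_", i),
         data.frame(category = 1, dataset = i, value = rnorm(3)))
}

# Collect the data frames by name into a named list, then shuffle the list.
# sample() on a list returns its elements in random order, without replacement.
dflist.cat1 <- sample(mget(paste0("Cat_1_Data_", 1:10)))

length(dflist.cat1)  # 10
names(dflist.cat1)   # the original names, in shuffled order
```

Because sample() only permutes the list here, every original data frame ends up in exactly one position of dflist.cat1.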

Run lapply to rbind the corresponding element of each list. This will result in a list of length 10:

dflist.new <- lapply(1:10, function(i) {
  rbind(dflist.cat1[[i]],
        dflist.cat2[[i]],
        dflist.cat3[[i]],
        dflist.cat4[[i]])
})

You can access your data.frames using dflist.new[[1]] for the first one, dflist.new[[2]] for the second, and so on.

I am sure there is a more elegant way to do this with 2-dimensional list indices, but this works well for a small number of categories.
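
One possible version of that idea, sketched here with a nested list so that cats[[k]][[i]] plays the role of Cat_k_Data_i. The names cats, shuffled, n_cat and n_data, the toy columns, and the seed are assumptions made for illustration.

```r
set.seed(1)  # assumption: only for reproducibility

n_cat  <- 4
n_data <- 10

# Hypothetical stand-in data: cats[[k]][[i]] corresponds to Cat_k_Data_i.
cats <- lapply(seq_len(n_cat), function(k) {
  lapply(seq_len(n_data), function(i) {
    data.frame(category = k, dataset = i, value = rnorm(2))
  })
})

# Shuffle each category independently (a permutation, so no data frame repeats).
shuffled <- lapply(cats, sample)

# For each i, take the i-th element of every shuffled category and stack them.
dflist.new <- lapply(seq_len(n_data), function(i) {
  do.call(rbind, lapply(shuffled, `[[`, i))
})

length(dflist.new)                # 10
unique(dflist.new[[1]]$category)  # 1 2 3 4
```

With the real data, the nested list could probably be built with something like lapply(1:4, function(k) mget(paste0("Cat_", k, "_Data_", 1:10), envir = globalenv())), assuming the data frames live in the global environment. If the same data frame may be picked more than once across the new data frames, lapply(cats, sample, replace = TRUE) draws independently instead of permuting.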

solution1
2 ACCEPTED 2014-04-30 08:25:14
