
Splitting a dataset in R by percentage

I have a dataset that I need to split into different subsets. How can I compute the split for a given case?

The data should be split as follows: 1) First case: 15% as training data and 5% as test data.

How do I write this correctly?

Without createDataPartition, an easy way is as follows.

Suppose you want a fraction train_prop of the dataset my_dataset as the training set and a fraction test_prop as the test set. Ideally these would sum to 1, or to 1 - val_prop if you also hold out a validation set, but here you want 15% and 5% for some reason. So set train_prop to 0.15 and test_prop to 0.05.

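# Label each row: 0 = unused, 1 = train, 2 = test
n <- nrow(my_dataset)
n_train <- round(n * train_prop)
n_test <- round(n * test_prop)
# Take the unused count as the remainder so the labels always total n rows
indices <- sample(rep.int(c(0, 1, 2), times = c(n - n_train - n_test, n_train, n_test)))
train_set <- my_dataset[indices == 1, ]
test_set <- my_dataset[indices == 2, ]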
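
An alternative that avoids building a label vector is to shuffle the row numbers once and take disjoint slices. The following is only a minimal sketch: it assumes the built-in iris data frame as a stand-in for my_dataset, and the names n_train, n_test, shuffled, train_idx and test_idx are illustrative, not part of the original answer.

```r
set.seed(42)  # fixed seed so the split is reproducible; any value works

my_dataset <- iris   # stand-in data (assumption); replace with your own data frame
train_prop <- 0.15
test_prop  <- 0.05

n <- nrow(my_dataset)
n_train <- floor(n * train_prop)  # number of training rows
n_test  <- floor(n * test_prop)   # number of test rows

# Shuffle all row numbers once, then take two non-overlapping slices
shuffled  <- sample.int(n)
train_idx <- shuffled[seq_len(n_train)]
test_idx  <- shuffled[n_train + seq_len(n_test)]

train_set <- my_dataset[train_idx, ]
test_set  <- my_dataset[test_idx, ]
```

Because both slices come from the same permutation, no row can land in both sets, and the remaining 80% of rows are simply left unused.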
