
Is there an R function to select N random columns from the dataframe?

Is there an R function to select N random columns from a data frame? I am trying to check the time complexity of the sparsebn package for structure learning of Bayesian networks.

I've tried the code below, but it ends up selecting N rows as well, not just N columns. How can I fix that?

library(sparsebn)
library(igraph)
library(graph)

df <- read.csv("data/arth150.csv", header = TRUE, sep = ",", check.names = FALSE)

df <- as.data.frame(unclass(df), stringsAsFactors = TRUE)

experiment_range <- list(10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 106)

timelist <- list()

for (i in experiment_range) {
  rand_df <- df[sample(ncol(df), size=i), ]
  start_time <- Sys.time()
  dat <- sparsebnData(rand_df, type = 'c')
  dags <- estimate.dag(data = dat)
  end_time <- Sys.time()
  ctime <- end_time - start_time
  otime <- list(ctime)
  timelist <- append(timelist, otime)
}

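In your code the sampled indices sit before the comma, so they select rows. If df is a data frame, you can sample i columns at random by putting the indices after the comma: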

df[,sample(1:ncol(df),i)]
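To make the row-versus-column distinction concrete, here is a minimal sketch on a made-up data frame. The names toy, n, idx, rows_picked and cols_picked are hypothetical and not from the question. Adding drop = FALSE is an extra precaution, assumed useful here, so that a single sampled column still comes back as a data frame rather than a vector.

```r
# Toy data standing in for the real data set (hypothetical: 10 rows, 8 columns).
set.seed(1)
toy <- as.data.frame(matrix(rnorm(80), nrow = 10, ncol = 8))

n <- 3
idx <- sample(ncol(toy), size = n)        # n distinct column indices

rows_picked <- toy[idx, ]                 # indices before the comma: selects rows
cols_picked <- toy[, idx, drop = FALSE]   # indices after the comma: selects columns

dim(rows_picked)   # n rows, all 8 columns
dim(cols_picked)   # all 10 rows, n columns
```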

Or using dplyr:

dplyr::select(df, sample(seq_len(ncol(df)), size = i))

In a pipe:

df %>% dplyr::select(sample(seq_len(ncol(.)), size = i))
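Applied to the timing experiment, the column sampling could be arranged roughly as follows. This is only a sketch under stated assumptions: toy is simulated data, and fit_fn is a hypothetical stand-in workload (a plain correlation matrix) used so the example runs without sparsebn. In the real experiment the sparsebnData() and estimate.dag() calls from the question would go in its place.

```r
# Simulated data in place of the real CSV (assumption: 200 rows, 30 columns).
set.seed(42)
toy <- as.data.frame(matrix(rnorm(200 * 30), nrow = 200, ncol = 30))

sizes <- c(5, 10, 20, 30)

# Hypothetical stand-in for the estimation step being timed.
fit_fn <- function(d) cor(d)

timings <- vapply(sizes, function(n) {
  # Sample n columns and keep every row.
  rand_df <- toy[, sample(ncol(toy), size = n), drop = FALSE]
  stopifnot(nrow(rand_df) == nrow(toy), ncol(rand_df) == n)
  # Elapsed wall-clock time in seconds for this subset.
  system.time(fit_fn(rand_df))[["elapsed"]]
}, numeric(1))

data.frame(n_cols = sizes, seconds = timings)
```

Collecting the results with vapply() gives a plain numeric vector of seconds, which is easier to plot or tabulate than a list of difftime objects.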
