How do I assign a random seed to the dplyr sample_n function?

This is about "sample_n" from dplyr in R.
https://dplyr.tidyverse.org/reference/sample.html

For reproducibility, I should set a seed so that someone else can get my exact results.

Is there a built-in way to set the seed for "sample_n"? Or is it something I set in the environment that "sample_n" then responds to?

No, "sample_n" has no built-in seed argument. There are two options:

  • The base R "set.seed" function, which sets the seed for the session [1]
  • The 'withr' package, which wraps a piece of code so that it runs with a given seed [2]
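To illustrate the second option, here is a minimal sketch using withr::with_seed, which runs the wrapped expression with the given seed and then restores the previous random state. The seed value 42 and the names a and b are arbitrary choices for illustration, and the example assumes dplyr and withr are both installed.

```r
library(dplyr)
library(withr)

# Each call runs sample_n with the seed fixed to 42 for that expression only.
a <- with_seed(42, sample_n(mtcars, 5))
b <- with_seed(42, sample_n(mtcars, 5))

# The two draws should contain exactly the same rows.
identical(a, b)
```

This keeps the seed scoped to the sampling call rather than changing it for everything that runs afterwards.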

The dplyr::sample_n documentation states:

This is a wrapper around sample.int() to make it easy to select random rows from a table. It currently only works for local tbls.

So sample_n calls sample.int behind the scenes, which means it uses R's standard random number generator and you can use set.seed for reproducibility.
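The mechanism can be seen with base R alone. This is only a rough sketch of the idea, not dplyr's actual internal code: it draws row indices with sample.int after setting the same seed twice. The seed 123 and the variable names are arbitrary.

```r
# Draw 5 row indices from mtcars, twice, with the same seed each time.
set.seed(123)
first <- sample.int(nrow(mtcars), 5)

set.seed(123)
second <- sample.int(nrow(mtcars), 5)

# Same seed, same call, so the same indices come back.
identical(first, second)

# Using the indices to pick rows is roughly what a row sampler does.
mtcars[first, ]
```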

Does this example help? In it, I am using set.seed and the mtcars dataset.

library(dplyr)  # provides sample_n

set.seed(1)
x <- mtcars
sample_n(x, 10)

sample_n(x, 10)  # without set.seed(), so a different sample is generally drawn

set.seed(1)
x <- mtcars
sample_n(x, 10)  # same seed again, so the same rows as the first call
