Inconsistent behaviour between sample_n() and slice_sample()
I have run into a simple but tricky problem when trying to use slice_sample() to replace its predecessor sample_n() in a map() call.
I am trying to replicate an example^ which samples 1, 2, and 3 rows from the mtcars dataset.
Running the example code with sample_n():
map(c(1, 2, 3), sample_n, tbl = mtcars)
I get:
[[1]]
mpg cyl disp hp drat wt qsec vs am gear carb
Fiat 128 32.4 4 78.7 66 4.08 2.2 19.47 1 1 4 1
[[2]]
mpg cyl disp hp drat wt qsec vs am gear carb
Cadillac Fleetwood 10.4 8 472 205 2.93 5.250 17.98 0 0 3 4
Mazda RX4 Wag 21.0 6 160 110 3.90 2.875 17.02 0 1 4 4
[[3]]
mpg cyl disp hp drat wt qsec vs am gear carb
Merc 280 19.2 6 167.6 123 3.92 3.440 18.30 1 0 4 4
Merc 280C 17.8 6 167.6 123 3.92 3.440 18.90 1 0 4 4
Toyota Corona 21.5 4 120.1 97 3.70 2.465 20.01 1 0 3 1
But when I try the slice_sample() function:
map(c(1, 2, 3), slice_sample, .data = mtcars)
I get:
Error in `map()`:
ℹ In index: 1.
Caused by error in `.f()`:
! `n` must be explicitly named.
ℹ Did you mean `slice_sample(n = 1)`?
Run `rlang::last_error()` to see where the error occurred.
sample_n() and sample_frac() have been superseded in favour of slice_sample().
... These functions were superseded because we realised it was more convenient to have two mutually exclusive arguments to one function, rather than two separate functions.
I have read through both help pages and run a series of experiments, but didn't get very far. Deep down I know it's definitely possible. Could anyone give me a hint?
^: Page 217, Chapter 8, Beyond Spreadsheets with R: A beginner's guide to R and RStudio
This is because, as the error message says, `n` must be explicitly named.
In slice_sample, n and prop have to be passed by name. map() hands each element to the function as an unnamed argument, so slice_sample throws an error, like here. In your case, you can use an anonymous function to get the expected output:
map(c(1, 2, 3), ~ slice_sample(n = .x, mtcars))
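To see why the unnamed value is rejected, here is a minimal sketch using a hypothetical stand-in function, toy(), whose arguments are ordered like those of slice_sample() (.data, then ..., then n). toy() is not part of dplyr and only illustrates R's argument matching: once .data is matched by name, any unnamed value is collected by ... and never reaches n.

```r
# toy() is a hypothetical stand-in, not a dplyr function. It only mimics the
# argument order of slice_sample(): .data, then ..., then n.
toy <- function(.data, ..., n) {
  if (missing(n)) {
    "n is missing: the unnamed value was absorbed by ..."
  } else {
    paste("n =", n)
  }
}

# Roughly what map(c(1, 2, 3), toy, .data = "df") does for the element 2:
unnamed <- toy(2, .data = "df")

# What the anonymous function does instead: n is passed by name.
named <- toy(.data = "df", n = 2)

print(unnamed)
print(named)
```

sample_n() did not have this problem because its size argument can be matched by position, which is why the original example worked.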
In general, it is better to use an anonymous function than to pass extra arguments through ... in map functions. As mentioned in the purrr documentation, this avoids confusing situations:
We also recommend using an anonymous function instead of passing additional arguments to map. This avoids a certain class of moderately esoteric argument matching woes and, we believe, is generally easier to read.
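As a sketch of that recommendation applied to this question, the same pattern works for both n and prop. It assumes dplyr 1.0.0 or later and R 4.1 or later for the \(x) shorthand; the object names samples and fractions are only illustrative.

```r
library(dplyr)
library(purrr)

# Draw 1, 2 and 3 rows; `n` is named explicitly inside the anonymous function.
samples <- map(c(1, 2, 3), \(n) slice_sample(mtcars, n = n))

# The same pattern with proportions of the rows instead of counts.
fractions <- map(c(0.25, 0.5), \(p) slice_sample(mtcars, prop = p))

# Number of rows in each sampled data frame.
print(map_int(samples, nrow))
print(map_int(fractions, nrow))
```

On older versions of R, function(n) slice_sample(mtcars, n = n) or the ~ formula shorthand shown above should behave the same way.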