How to get percentiles using mutate_at?
Consider this simple example:
> tibble(var1 = c(1,2,3,4,5),
+ boo1 = c(1,2,3,4,5))
# A tibble: 5 x 2
var1 boo1
<dbl> <dbl>
1 1 1
2 2 2
3 3 3
4 4 4
5 5 5
Here I want to express each value as a percentile of its column using ecdf. I expected the following to work, but it does not:
> tibble(var1 = c(1,2,3,4,5),
+ boo1 = c(1,2,3,4,5)) %>%
+ mutate_at(vars(contains('boo')),
+ .funs = funs(ecdf(.)(.)))
Error in mutate_impl(.data, dots) :
Evaluation error: 'x' and 'y' lengths differ.
Instead, this works:
> tibble(var1 = c(1,2,3,4,5),
+ boo1 = c(1,2,3,4,5)) %>% mutate(percentile = ecdf(boo1)(boo1))
# A tibble: 5 x 3
var1 boo1 percentile
<dbl> <dbl> <dbl>
1 1 1 0.2
2 2 2 0.4
3 3 3 0.6
4 4 4 0.8
5 5 5 1
What is the issue here? Thanks!
Two alternatives would be (taking tbl to be the tibble from the question):
ecdfEval <- function(x) ecdf(x)(x)
tbl %>% mutate_at(vars(contains('boo')), ecdfEval)
# A tibble: 5 x 2
# var1 boo1
# <dbl> <dbl>
# 1 1 0.2
# 2 2 0.4
# 3 3 0.6
# 4 4 0.8
# 5 5 1
and
tbl %>% mutate_at(vars(contains('boo')), funs(do.call(ecdf(.), list(.))))
# A tibble: 5 x 2
# var1 boo1
# <dbl> <dbl>
# 1 1 0.2
# 2 2 0.4
# 3 3 0.6
# 4 4 0.8
# 5 5 1
It is indeed strange that your approach didn't work; it seems that using . in nested function calls is a problem.
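To make the nested call easier to see, here is a minimal base R sketch of the two steps hidden inside ecdf(.)(.): ecdf(x) first returns a function, and that function is then called on x. The names x, Fn and manual are only illustrative, and the hand-rolled comparison is a check added here, not part of the original answer.

```r
x <- c(1, 2, 3, 4, 5)

# Step 1: ecdf() returns a function (the empirical CDF of x)
Fn <- ecdf(x)

# Step 2: calling that function on x gives, for each value,
# the share of observations less than or equal to it
pct <- Fn(x)

# The same quantity computed by hand, for comparison
manual <- sapply(x, function(v) mean(x <= v))

print(pct)
print(isTRUE(all.equal(pct, manual)))
```

Writing ecdf(x)(x) just chains the two steps, so the "function" in the outer call is itself a call rather than a plain name.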
Edit (better option): It seems like it may be more about funs than mutate_at since, as @Nate noticed, we can indeed simply use
tbl %>% mutate_at(vars(contains('boo')), .funs = ~ecdf(.)(.))
(See @Nate's comment below for more details.)
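For completeness, here is a self-contained sketch of the formula approach, with tbl defined as the tibble from the question. The second variant uses across(), which assumes dplyr 1.0.0 or later (where mutate_at is superseded and funs is deprecated); the names res_at and res_across are only illustrative.

```r
library(dplyr)

# Same data as in the question
tbl <- tibble(var1 = c(1, 2, 3, 4, 5),
              boo1 = c(1, 2, 3, 4, 5))

# Formula (lambda) style with mutate_at, as in the edit above
res_at <- tbl %>%
  mutate_at(vars(contains("boo")), ~ ecdf(.)(.))

# Equivalent using across(); assumes dplyr >= 1.0.0
res_across <- tbl %>%
  mutate(across(contains("boo"), ~ ecdf(.x)(.x)))

print(res_at)
print(res_across)
```

Both variants replace boo1 in place and leave var1 untouched, since only columns whose names contain "boo" are selected.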