R's tapply with null function
I'm having trouble understanding what the tapply function does when the FUN argument is NULL.
The documentation says:
If FUN is NULL, tapply returns a vector which can be used to subscript the multi-way array tapply normally produces.
For example, what does the following example from the documentation do?
ind <- list(c(1, 2, 2), c("A", "A", "B"))
tapply(1:3, ind) #-> the split vector
I don't understand the result:
[1] 1 2 4
Thanks.
If you run tapply with a specified function (not NULL), say sum, as in the help page, you'll see that the result is a 2-dimensional array with NA in one cell:
res <- tapply(1:3, ind, sum)
res
   A  B
1  1 NA
2  2  3
It means that one combination of factors, namely (1, B), is absent. When FUN is NULL, tapply returns, for each element of the input, the linear index of the cell of that array into which the element falls. Here every element lands in a different cell, so the result is the same as the indices of the non-empty cells. To check this:
which(!is.na(res))
[1] 1 2 4
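To see that the result holds one cell index per input element, rather than just a list of non-empty cells, it helps to try data where two elements fall into the same group. The following is a minimal sketch; the vectors `x` and `g` are made up for illustration and are not part of the documentation example.

```r
# Hypothetical data: elements 2 and 3 both belong to the group (2, "A")
x <- c(10, 20, 30, 40)
g <- list(c(1, 2, 2, 2), c("A", "A", "A", "B"))

idx <- tapply(x, g)        # FUN is omitted, so it defaults to NULL
idx                        # 1 2 2 4: one linear cell index per element of x

res <- tapply(x, g, sum)   # the usual 2 x 2 array of group sums
res[idx]                   # 10 50 50 40: each element's group sum
```

Subscripting `res` with `idx` maps every element of `x` back to the summary of its own group, which is the "subscript the multi-way array" use that the documentation mentions.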
One thing to mention: the specified function can itself return NA, as in the following toy example:
f <- function(x) {
  if (x[[1]] == 1) return(NA)
  return(sum(x))
}
tapply(1:3, ind, f)
   A  B
1 NA NA
2  2  3
So, in general, NA doesn't mean that a factor combination is absent.
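One possible way to tell the two kinds of NA apart is to compare the array against the index vector returned when FUN is NULL. This is only a sketch of that idea, reusing the toy function from above; the variable names `present`, `absent` and `na_from_f` are made up for illustration.

```r
ind <- list(c(1, 2, 2), c("A", "A", "B"))

# Toy function from above: returns NA for the group whose first value is 1
f <- function(x) {
  if (x[[1]] == 1) return(NA)
  sum(x)
}

res <- tapply(1:3, ind, f)

# Cells that actually received data, taken from the FUN = NULL index vector
present <- sort(unique(tapply(1:3, ind)))

# Cells that are NA because no data fell into them
absent <- setdiff(seq_along(res), present)
absent       # 3, the (1, "B") cell

# Cells that are NA even though data was present (f itself returned NA)
na_from_f <- intersect(which(is.na(res)), present)
na_from_f    # 1, the (1, "A") cell
```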