Computing multiple variance of a dataset in R
My problem is somewhat related to this question.
I have data as below:
V1 V2
.. 1
.. 2
.. 1
.. 3
I need to calculate the variance of the data in V1 for each value of V2 cumulatively. (This means that for a particular value of V2, say n, all the rows of V1 having a corresponding V2 less than n need to be included.)
Will ddply help in such a case?
I don't think ddply will help, since it is built on the concept of taking non-overlapping subsets of a data frame.
d <- data.frame(V1 = runif(1000), V2 = sample(1:10, size = 1000, replace = TRUE))
u <- sort(unique(d$V2))
# For each distinct V2 value x, take the variance of all V1 whose V2 is <= x
ans <- sapply(u, function(x) {
  with(d, var(V1[V2 <= x]))
})
names(ans) <- u
I don't know if there's a more efficient way to do this ...
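One possible way to avoid rescanning the data for every value of V2 is to keep running totals per group and derive the variance from them. The following is only a sketch, not a benchmarked solution. The names n, s, ss and cum_var are made up for illustration, set.seed is added only for reproducibility, and it follows the same V2 <= x convention as the sapply code. It assumes the smallest V2 group has at least two rows, and the sum-of-squares formula can be less numerically stable than var() when the values are large relative to their spread.

```r
set.seed(1)  # assumption: fixed seed so the example is reproducible
d <- data.frame(V1 = runif(1000), V2 = sample(1:10, size = 1000, replace = TRUE))

# Per-group count, sum and sum of squares (tapply orders groups by V2),
# accumulated so that entry k covers all rows with V2 <= the k-th value
n  <- cumsum(tapply(d$V1, d$V2, length))
s  <- cumsum(tapply(d$V1, d$V2, sum))
ss <- cumsum(tapply(d$V1, d$V2, function(v) sum(v^2)))

# Sample variance from running totals: (sum(x^2) - sum(x)^2 / n) / (n - 1)
cum_var <- (ss - s^2 / n) / (n - 1)
cum_var
```

Each row is visited once by tapply, so the work no longer grows with the number of distinct V2 values times the number of rows. For strictly "less than n", the result for each value would be the entry for the previous V2 value.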