"variable lengths differ" error in aggregate
I have some data that I would like to summarize:
studentid friend Gfriend
214 30401006 0 0
236 30401006 0 0
208 30401006 1 0
229 30401006 0 0
207 30401006 0 0
278 30401007 1 0
250 30401007 1 0
266 30401007 1 0
254 30401007 1 1
277 30401007 1 1
243 30401007 1 1
The result should look something like this:
studentid friend Gfriend
30401006 1 0
30401007 6 3
When I try:
agg=aggregate(c(friend)~studentid,data=df,FUN=sum)
I get the required result (but only for the friend variable).
But when I try:
agg=aggregate(c(friend,Gfriend)~studentid,data=df,FUN=sum)
I get:
Error in model.frame.default(formula = c(friend, Gfriend) ~ studentid, : variable lengths differ (found for 'studentid')
I checked the lengths of the variables ( length(var) ) and they are all the same, and there are no NAs, so I have no idea where this error is coming from.
Why is this happening?
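One likely explanation, sketched against the example data above: c(friend, Gfriend) concatenates the two columns end to end into a single vector of 22 values, while studentid still has 11, so model.frame sees variables of different lengths. Using cbind() on the left-hand side keeps the columns side by side, one row per observation. The data frame below is rebuilt from the table in the question; the rep() calls are just a shorthand for the repeated ids.

```r
# Rebuild the example data from the question
df <- data.frame(
  studentid = c(rep(30401006, 5), rep(30401007, 6)),
  friend    = c(0, 0, 1, 0, 0, 1, 1, 1, 1, 1, 1),
  Gfriend   = c(0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1)
)

# c() concatenates: 22 values on the left vs. 11 studentid values
length(c(df$friend, df$Gfriend))
length(df$studentid)

# cbind() keeps both columns aligned row by row, so the formula works
agg <- aggregate(cbind(friend, Gfriend) ~ studentid, data = df, FUN = sum)
agg
```

If every remaining column should be summed, aggregate(. ~ studentid, data = df, FUN = sum) is an equivalent shorthand.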
You could also try "by":
studentid <- c(30401006,30401006,30401006,30401006,30401006,30401007,
               30401007,30401007,30401007,30401007,30401007)
friend <- c(0,0,1,0,0,1,1,1,1,1,1)
Gfriend <- c(0,0,0,0,0,0,0,0,1,1,1)
df <- data.frame(studentid,friend,Gfriend)
df
result <- by(df[c(2:3)], df$studentid, FUN=colSums)
result
df$studentid: 30401006
friend Gfriend
1 0
df$studentid: 30401007
friend Gfriend
6 3
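Note that by() returns a list-like object with one element per student rather than a table. If a single table is wanted, one common idiom (a sketch, not the only option) is to bind the per-group results together with do.call(rbind, ...):

```r
# Same example data as above
df <- data.frame(
  studentid = c(rep(30401006, 5), rep(30401007, 6)),
  friend    = c(0, 0, 1, 0, 0, 1, 1, 1, 1, 1, 1),
  Gfriend   = c(0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1)
)

# One named vector of column sums per studentid
result <- by(df[c("friend", "Gfriend")], df$studentid, FUN = colSums)

# Stack the per-group vectors into a matrix, one row per studentid
summed <- do.call(rbind, result)
summed
```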
EDIT: added na.rm = T to address the comment about excluding NAs.
Check out the "plyr" package.
library(plyr)
# split by "studentid" and sum all numeric columns
ddply(df, .(studentid), numcolwise(sum, na.rm=T))
studentid friend Gfriend
1 30401006 1 0
2 30401007 6 3