
"variable lengths differ" error in aggregate

I have some data that I would like to summarize:

    studentid friend Gfriend
214  30401006      0       0
236  30401006      0       0
208  30401006      1       0
229  30401006      0       0
207  30401006      0       0
278  30401007      1       0
250  30401007      1       0
266  30401007      1       0
254  30401007      1       1
277  30401007      1       1
243  30401007      1       1

The result should look something like this:

studentid friend Gfriend
30401006   1      0
30401007   6      3

When I try:

    agg = aggregate(c(friend) ~ studentid, data = df, FUN = sum)

I get the required result (but only for the friend variable). But when I try:

    agg = aggregate(c(friend, Gfriend) ~ studentid, data = df, FUN = sum)

I get:

    Error in model.frame.default(formula = c(friend, Gfriend) ~ studentid, :
      variable lengths differ (found for 'studentid')

I checked the lengths of the variables (length(var)) and they are all the same, and there are no NAs, so I have no idea where this error is coming from.

Why is this happening?
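A likely cause, sketched here on the assumption that df holds exactly the data shown above: c(friend, Gfriend) concatenates the two columns into one vector that is twice as long as studentid, so the formula ends up with variables of different lengths. Binding the columns side by side with cbind() keeps one row per observation.

```r
# Rebuild the example data from the question
df <- data.frame(
  studentid = c(rep(30401006, 5), rep(30401007, 6)),
  friend    = c(0, 0, 1, 0, 0, 1, 1, 1, 1, 1, 1),
  Gfriend   = c(0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1)
)

# c() glues the two columns end to end: one vector of 22 values
length(c(df$friend, df$Gfriend))  # 22
# ...while studentid still has one value per row
length(df$studentid)              # 11

# cbind() keeps the columns side by side, so every variable has 11 rows
agg <- aggregate(cbind(friend, Gfriend) ~ studentid, data = df, FUN = sum)
print(agg)
```

With cbind() on the left-hand side, aggregate applies sum to each column separately within each studentid.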

You could also try "by":

 studentid <- c(30401006,30401006,30401006,30401006,30401006,30401007,
                30401007,30401007,30401007,30401007,30401007)
 friend <- c(0,0,1,0,0,1,1,1,1,1,1)
 Gfriend <- c(0,0,0,0,0,0,0,0,1,1,1)
 df <- data.frame(studentid,friend,Gfriend)
 df

 result <- by(df[c(2:3)], df$studentid, FUN=colSums)

 result
 df$studentid: 30401006
  friend Gfriend 
       1       0 
 df$studentid: 30401007
  friend Gfriend 
       6       3 
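If a single table is more convenient than the list-like object that by() returns, the per-group results can be stacked with rbind. This is a minimal sketch that rebuilds the same df; the name stacked is only illustrative and does not appear in the original answer.

```r
# Same example data as above
df <- data.frame(
  studentid = c(rep(30401006, 5), rep(30401007, 6)),
  friend    = c(0, 0, 1, 0, 0, 1, 1, 1, 1, 1, 1),
  Gfriend   = c(0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1)
)

# One named vector of column sums per studentid
result <- by(df[c("friend", "Gfriend")], df$studentid, FUN = colSums)

# Stack the per-group vectors into one matrix, one row per studentid
stacked <- do.call(rbind, result)
print(stacked)
```

The result is a numeric matrix rather than a data frame; as.data.frame(stacked) converts it if needed.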

EDIT: added na.rm = TRUE to address the comment about excluding NAs.

Check out the "plyr" package.

library(plyr)

# split by "studentid" and sum all numeric columns

ddply(df, .(studentid), numcolwise(sum, na.rm=TRUE))

  studentid friend Gfriend
1  30401006      1       0
2  30401007      6       3
