I'm working in R on a data frame (data) that looks like this:
ID t P1 P2 P3 P4
<dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 100003 0 5 4 3 2
2 100003 0 6 2 1 3
3 100013 0 6 5 7 3
4 100013 0 4 5 4 1
5 100014 0 1 1 1 1
6 100014 0 1 1 1 1
7 100015 0 6 6 1 1
8 100015 0 6 6 1 1
9 100044 0 6 2 5 1
10 100044 0 6 3 1 1
11 100051 0 NA NA NA NA
12 100051 0 4 4 2 2
13 100074 0 4 6 4 3
14 100074 0 5 6 3 2
15 100075 0 2 2 1 1
AIM: I need to aggregate by ID (t is always equal to 0) for each variable from P1,P2,P3,P4 like this:
new_data<-aggregate(P1~ID+t,data,mean,na.rm=T)
new_data<-aggregate(P2~ID+t,data,mean,na.rm=T)
new_data<-aggregate(P3~ID+t,data,mean,na.rm=T)
new_data<-aggregate(P4~ID+t,data,mean,na.rm=T)
PROBLEM: Is there a loop I can run, or a function from the apply family, instead of going through each variable (P1-P4) manually? Thanks a lot!
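Haven't tested it, but this should do the loop, returning a list with one aggregated data frame per column:
cols <- c("P1", "P2", "P3", "P4")
# reformulate() builds the formula P1 ~ ID + t (then P2 ~ ID + t, ...),
# so each result keeps its own column name instead of a generic "x"
dat2 <- lapply(cols, function(v) {
  aggregate(reformulate(c("ID", "t"), response = v), data, mean, na.rm = TRUE)
})
names(dat2) <- cols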
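If a single data frame is wanted rather than a list, the per-column results can be merged back together on ID and t. The following is only a sketch: the sample data is a subset of the rows from the question, and the names per_col and new_data are chosen here for illustration.

```r
# Subset of the question's rows (IDs 100003, 100013, 100051)
data <- data.frame(
  ID = c(100003, 100003, 100013, 100013, 100051, 100051),
  t  = 0,
  P1 = c(5, 6, 6, 4, NA, 4),
  P2 = c(4, 2, 5, 5, NA, 4),
  P3 = c(3, 1, 7, 4, NA, 2),
  P4 = c(2, 3, 3, 1, NA, 2)
)

cols <- c("P1", "P2", "P3", "P4")

# One aggregate() call per column; reformulate() builds e.g. P1 ~ ID + t
per_col <- lapply(cols, function(v) {
  aggregate(reformulate(c("ID", "t"), response = v), data = data, FUN = mean)
})

# Merge the per-column results into one data frame keyed by ID and t
new_data <- Reduce(function(a, b) merge(a, b, by = c("ID", "t")), per_col)
print(new_data)
```

The formula interface of aggregate() drops rows with missing values by default (na.action = na.omit), which is why the all-NA row for ID 100051 does not affect the means here.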
You can aggregate multiple variables at once with cbind(P1, P2, P3, P4) ~ ID + t, or equivalently by using a dot in place of cbind(P1, P2, P3, P4). The dot means every remaining variable in the data frame.
> aggregate(. ~ ID + t, data, mean, na.rm = TRUE)
ID t P1 P2 P3 P4
1 100003 0 5.5 3.0 2.0 2.5
2 100013 0 5.0 5.0 5.5 2.0
3 100014 0 1.0 1.0 1.0 1.0
4 100015 0 6.0 6.0 1.0 1.0
5 100044 0 6.0 2.5 3.0 1.0
6 100051 0 4.0 4.0 2.0 2.0
7 100074 0 4.5 6.0 3.5 2.5
8 100075 0 2.0 2.0 1.0 1.0
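One caveat with the multi-column formula: by default aggregate() uses na.action = na.omit, which removes any row containing an NA in any column before mean() is called, so na.rm = TRUE never sees a missing value. In the question's data the only NA row is missing in all of P1-P4, so nothing is lost, but with partially missing rows the results would differ. A small sketch with made-up data (df, dropped and kept are hypothetical names) shows the difference and how na.action = na.pass changes it:

```r
# Hypothetical data: the NA sits in P1 only
df <- data.frame(
  ID = c(1, 1, 2, 2),
  t  = 0,
  P1 = c(NA, 4, 2, 6),
  P2 = c(10, 20, 30, 50)
)

# Default na.action = na.omit drops row 1 entirely,
# so P2 for ID 1 is computed from the single value 20
dropped <- aggregate(. ~ ID + t, df, mean)

# na.pass keeps the row; na.rm = TRUE then skips NAs column by column,
# so P2 for ID 1 is the mean of 10 and 20
kept <- aggregate(. ~ ID + t, df, mean, na.rm = TRUE, na.action = na.pass)

print(dropped)
print(kept)
```

Which behaviour is appropriate depends on whether a row with one missing score should still contribute its other scores.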