My question is about how to use ddply in a for loop. For example:
x<-ddply(data, "variable_name", summarize, event= sum(x)/count(x))
This is a normal ddply call, but what if I want the loop index i in place of variable_name, as in the following example:
**data**

col1 col2 col3 col4
a    x    10   1
a    x    02   2
a    x    40   3
b    x    05   8
b    y    01   10
b    y    08   6
b    y    10   8
b    y    50   6

for(i in 1:2){ result[i]<-ddply(data, name(data[,i]), summarize, event=sum(col3)/count(col4)) }
**output desired:**

result[col3]
col1 event
a    17.33
b    14.80

result[col4]
col2 event
x    14.25
y    17.25

You can always do this with this sort of method (which some may consider hacky):
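for(i in 1:ncol(data)) {
  # Build the ddply call as a string, substituting the i-th column name
  q <- sprintf("x <- ddply(data, .(%s), summarize, event=sum(x)/count(x))",
               names(data)[i])
  # parse() turns the string into an expression, eval() then runs it
  eval(parse(text = q))
}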
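As a possible alternative to building and evaluating strings, ddply also accepts the grouping variable as a character string, so a plain for loop can pass the column name directly. The following is a minimal sketch, assuming the plyr package is installed; it uses the built-in mtcars data set, and the choice of grouping columns and the names group_cols and result are illustrative only.

```r
library(plyr)  # assumption: the plyr package is installed

# Illustrative choice of grouping columns from the built-in mtcars data
group_cols <- c("cyl", "gear")

# Pre-allocate a named list so each result can be looked up by column name
result <- vector("list", length(group_cols))
names(result) <- group_cols

for (i in seq_along(group_cols)) {
  # The grouping variable is passed as a string; no parse/eval is needed
  result[[i]] <- ddply(mtcars, group_cols[i], summarize, event = mean(mpg))
}

result$cyl   # mean mpg per number of cylinders
result$gear  # mean mpg per number of gears
```

Storing each result with `result[[i]]` (double brackets) keeps every data frame as one list element, whereas `result[i] <- ...` would try to replace a sub-list.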
By sum(.) / count(.), do you mean the average? I think summarise will not work with count. If you just want the average, I suggest you use mean; what you want can then be achieved like this:
lapply(c("cyl", "gear"), function(var) ddply(mtcars, var, summarize, mean(mpg)))
#[[1]]
# cyl ..1
#1 4 26.66364
#2 6 19.74286
#3 8 15.10000
#
#[[2]]
# gear ..1
#1 3 16.10667
#2 4 24.53333
#3 5 21.38000
Or equivalently, if you want to make use of names and indices, you can replace the first argument with
lapply(names(mtcars)[c(2,10)], ...)
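Applied to the sample data from the question, the same idea might look like the sketch below. It assumes plyr is installed, that the fused values such as "ax" are really two columns (col1 = a, col2 = x), and that the intended statistic is the mean of col3 per group, which is what the desired output in the question matches. The names group_cols and result are illustrative.

```r
library(plyr)  # assumption: the plyr package is installed

# The question's sample data, with col1 and col2 separated (assumed layout)
data <- data.frame(
  col1 = c("a", "a", "a", "b", "b", "b", "b", "b"),
  col2 = c("x", "x", "x", "x", "y", "y", "y", "y"),
  col3 = c(10, 2, 40, 5, 1, 8, 10, 50),
  col4 = c(1, 2, 3, 8, 10, 6, 8, 6)
)

# Group by each of the first two columns in turn, selected by index
group_cols <- names(data)[1:2]

result <- lapply(group_cols, function(var) {
  # var is a column name given as a string, which ddply accepts directly
  ddply(data, var, summarize, event = mean(col3))
})
names(result) <- group_cols

result$col1  # event is about 17.33 for a and 14.80 for b
result$col2  # event is 14.25 for x and 17.25 for y
```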