Using the ddply() function within a for loop in R

My question is about how to use ddply inside a for loop. For example:

x<-ddply(data, "variable_name", summarize, event= sum(x)/count(x))

This is a normal ddply call, but what if I want to use the loop index i in place of variable_name, as in the following example?

data:

col1 col2 col3 col4
a    x    10   1
a    x     2   2
a    x    40   3
b    x     5   8
b    y     1   10
b    y     8   6
b    y    10   8
b    y    50   6

for(i in 1:2){ result[i]<-ddply(data, name(data[,i]), summarize, event=sum(col3)/count(col4)) }

Desired output:

result[col3]

col1 event
a    17.33
b    14.80

result[col4]

col2 event
x    14.25
y    17.25

You can always do this by building the call as a string and evaluating it (which some may consider hacky):

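for(i in 1:ncol(data)) {
     # Build the ddply call as text, substituting in the i-th column name
     q <- sprintf("x <- ddply(data, .(%s), summarize, event=sum(x)/count(x))", 
              names(data)[i]) 
     # parse() turns the text into an expression, which eval() then runs
     eval(parse(text = q)) 
}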

By sum(.) / count(.), do you mean the average? I think summarise will not work with count. If you just want the average, I suggest you use mean; what you want can then be achieved like this:

lapply(c("cyl", "gear"), function(var) ddply(mtcars, var, summarize, mean(mpg)))
#[[1]]
#  cyl      ..1
#1   4 26.66364
#2   6 19.74286
#3   8 15.10000
#
#[[2]]
#  gear      ..1
#1    3 16.10667
#2    4 24.53333
#3    5 21.38000

Or, equivalently, if you want to make use of names and indices, you can replace the first argument with:

lapply(names(mtcars)[c(2,10)], ...)
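Applied to the data from the question, the same lapply pattern might look like the sketch below. It assumes the fused values such as "ax" and "by" are really two columns (col1 and col2), that the desired event is the mean of col3 within each group, and that plyr is installed; the names group_vars and result are only illustrative.

```r
library(plyr)

# The question's sample data, with "ax"/"bx"/"by" split into col1 and col2
# (an assumption based on the desired output shown in the question)
data <- data.frame(
  col1 = c("a", "a", "a", "b", "b", "b", "b", "b"),
  col2 = c("x", "x", "x", "x", "y", "y", "y", "y"),
  col3 = c(10, 2, 40, 5, 1, 8, 10, 50),
  col4 = c(1, 2, 3, 8, 10, 6, 8, 6)
)

# Columns to group by, selected by index as in the question's for loop
group_vars <- names(data)[1:2]

# One ddply call per grouping column; each returns a data frame
result <- lapply(group_vars, function(var) {
  ddply(data, var, summarize, event = mean(col3))
})
names(result) <- group_vars

result$col1
result$col2
```

Under these assumptions, result$col1 gives about 17.33 for a and 14.80 for b, and result$col2 gives 14.25 for x and 17.25 for y, which matches the desired output in the question. Storing the results in a named list avoids having to assign data frames into result[i] inside a for loop.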
