
Using custom function with ddply

For some reason I cannot use a custom function with ddply. The call returns the input data frame unchanged.

Basically, I don't want to count the duplicates of id; I want to create a variable that says whether a row is the first, second, or third occurrence of its id. I wrote a function for that, create_guide, which works on a whole data frame but has no effect when applied to the id groups.

df<-data.frame(id=c(1,1,2,2,3,4))

create_guide <- function(dt) {

  guide <- rep(0,times=nrow(dt))

  for (i in 1:nrow(dt)) {
    guide[i] <- length(dt[1:i,1])
  }

  a <- cbind(guide,dt)

}

bi <- plyr::ddply(df,.(id),fun=create_guide)

What is happening? Thank you

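You misspelled the argument name: it's .fun, not fun. Since fun matches none of ddply's named parameters, it is absorbed by ... and .fun keeps its default of NULL, in which case ddply hands the data back unchanged. You can also omit the name entirely: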

library(plyr)  # attaches ddply() and the .() quoting helper

bi <- ddply(df, .(id), .fun = create_guide)
# or
bi <- ddply(df, .(id), create_guide)
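
The argument-matching behaviour can be reproduced without plyr. The sketch below uses a hypothetical stand-in, apply_fun, whose signature only imitates the relevant part of ddply's (.fun defaulting to NULL, followed by ...); it is an illustration, not plyr's actual implementation.

```r
# Hypothetical stand-in imitating the relevant part of ddply's signature.
apply_fun <- function(.data, .fun = NULL, ...) {
  # With no function supplied, hand the data back untouched.
  if (is.null(.fun)) return(.data)
  .fun(.data, ...)
}

x <- c(3, 1, 2)

# "fun" is not a prefix of ".fun", so it lands in ... and .fun stays NULL.
misspelled <- apply_fun(x, fun = sort)

# Exact name: the function is actually applied.
correct <- apply_fun(x, .fun = sort)

print(misspelled)  # 3 1 2
print(correct)     # 1 2 3
```

R only matches a supplied name against a formal argument exactly or as a prefix, and fun is not a prefix of .fun, which is why the misspelled call fails silently instead of raising an error.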

Furthermore, your function can be drastically simplified, since your loop body is merely a convoluted way of assigning consecutive numbers:

create_guide <- function(dt) {
    # Number the rows of the group 1, 2, ..., n and prepend them as column `guide`
    cbind(guide = seq_len(nrow(dt)), dt)
}
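
Putting the two pieces together, a minimal end-to-end run might look like the following, assuming the plyr package is installed. The df and create_guide definitions are the ones from the question and from the simplification above.

```r
library(plyr)  # assumption: plyr is installed

df <- data.frame(id = c(1, 1, 2, 2, 3, 4))

# Number the rows within each group and prepend them as `guide`.
create_guide <- function(dt) {
  cbind(guide = seq_len(nrow(dt)), dt)
}

# .fun is matched by position here, so the name cannot be misspelled.
bi <- ddply(df, .(id), create_guide)

print(bi$guide)  # 1 2 1 2 1 1
```

Each id group gets its own counter starting at 1, which is the "first, second, third instance" variable the question asks for.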

(Incidentally, it took me a substantial amount of time to simplify the function down to this single line because I couldn't understand what it was doing; that's how convoluted the code was.)
