
Using lapply() with multiple variables

I've got a cross-tab frequency table where the measure is CAG and the columns A01, A02, etc. are frequency counts, i.e. 6485 counts of CAG 13, 35 counts of CAG 14. I want to sum the values in each column over the rows where CAG is greater than or equal to the modal CAG value. Then I divide that by the sum of A01. This gives me the proportion of values that are greater than or equal to the mode. I've managed to get it working for one column, but I want to run it over every column, using the relevant mode for each column. I'd appreciate any help!

data <- data.frame(CAG = c(13, 14, 15, 17), 
                   A01 = c(6485,35,132, 12), 
                   A02 = c(0,42,56, 4))

mode <- data$CAG[data$A01 == max(data$A01)]

B <- lapply(data[, 2:ncol(data)], function(x) {
    sum(x[data$CAG >= mode])
})

prop <- B / sum(data$A01)

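You need to put the mode calculation inside the function too, so that it is recomputed for each column. Using sapply() instead of lapply() also returns a named numeric vector rather than a list.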

sapply(data[, 2:ncol(data)], function(x) {
  mode <- data$CAG[which.max(x)]
  B <- sum(x[data$CAG >= mode])
  B/sum(x)
})
##       A01       A02 
## 1.0000000 0.5882353 
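
The same idea can be sketched with a named helper and vapply(), which checks that each column yields exactly one number. This is only an illustrative variant, not part of the original answer: the helper name prop_at_or_above_mode and the variable count_cols are hypothetical, and it assumes every column other than CAG holds counts.

```r
data <- data.frame(CAG = c(13, 14, 15, 17),
                   A01 = c(6485, 35, 132, 12),
                   A02 = c(0, 42, 56, 4))

# Hypothetical helper: share of one column's counts that sit at or
# above that column's modal CAG value.
prop_at_or_above_mode <- function(counts, cag) {
  modal_cag <- cag[which.max(counts)]   # first maximum if there is a tie
  sum(counts[cag >= modal_cag]) / sum(counts)
}

# Assumption: every column except CAG is a count column.
count_cols <- setdiff(names(data), "CAG")

# vapply() enforces one numeric value per column.
props <- vapply(data[count_cols], prop_at_or_above_mode,
                FUN.VALUE = numeric(1), cag = data$CAG)
props
```

Selecting the count columns by name rather than by position (2:ncol(data)) keeps the call working if CAG is not the first column.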

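which.max(x) returns the index of the first maximum, so in this use data$CAG[which.max(x)] picks the same row as data$CAG[x == max(x)]. The two differ only when the maximum is tied, in which case x == max(x) selects every tied row.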
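
A small sketch of where the two forms diverge, using made-up counts with a tied maximum (the vectors x and cag below are illustrative, not from the question's data):

```r
cag <- c(13, 14, 15, 17)
x   <- c(3, 7, 7, 1)              # made-up counts: the maximum 7 occurs twice

first_mode <- cag[which.max(x)]   # index of the first maximum only
all_modes  <- cag[x == max(x)]    # every row that ties for the maximum

first_mode
all_modes
```

With a tie, all_modes has length two, so a comparison such as data$CAG >= mode would recycle it against the CAG column and give misleading results; which.max() avoids that by always returning a single index.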
