In R: How do I loop through multiple columns and use a custom made function that takes in an argument from each of those columns and modifies those columns accordingly?
For example I have the following dataframe:
> head(runTimeSep)
  hours   h minutes  min
1    70 min      NA <NA>
2    21 min      NA <NA>
3   106 min      NA <NA>
4    75 min      NA <NA>
5    14 min      NA <NA>
6    82 min      NA <NA>
7     1   h      11  min
My goal is to obtain a list of total minutes in the hours column. If a row has "h" in the h column (e.g. "1 h"), then convert the hours to minutes and add on the minutes from the minutes column (or add nothing if it's a whole hour with NA in the minutes column).
Therefore I have created the following function to apply to the dataframe:
# convert hours to minutes function
hoursToMins = function(hours, h, minutes, min) {
  if (h == 'h' && min == "min") {
    (hours = as.numeric(hours)*60+as.numeric(minutes))
  }
  if (h=="h" && min != "min") {
    (hours = as.numeric(hours)*60)
  }
}
How do I apply this function across all columns in the data frame? E.g. with lapply, ddply, etc.
Edit: I also attempted the following:
finalRunTime = ifelse(runTimeSep$h == "h", runTimeSep$hours*60, runTimeSep$hours)
head(finalRunTime)
runTimeSep$hours = finalRunTime
which worked fine. But when I tried to apply the second round of ifelse:
finalRunTime = ifelse(runTimeSep$min == "min", runTimeSep$hours + runTimeSep$minutes, runTimeSep$hours)
head(finalRunTime)
runTimeSep$hours = finalRunTime
the 2nd round causes the else case (if there's no minute column) to become NA. Please help. Thanks.
In response to @Sandipan's answer: How do I use which to discriminate whether the min column is 'min' or NA?
I tried:
indices <- which(runTimeSep$h == 'h' && runTimeSep$min != 'min')
runTimeSep[indices,]$hours <- 60*runTimeSep[indices, ]$hours
indices <- which(runTimeSep$h == 'h' && runTimeSep$min == 'min')
runTimeSep[indices,]$hours <- 60*runTimeSep[indices, ]$hours +
runTimeSep[indices,]$minutes
However both sets of indices returned empty sets.
This gives you a vector of minutes by row; if you want the total, just wrap sum() around it:
with( dat, (h=="h")*60*hours + (h=="min")*hours +
ifelse( is.na(minutes), 0, minutes) )
[1] 70 21 106 75 14 82 71
It substitutes 0 for NA when minutes is NA. When a new column with those values is desired you can do this:
dat$newmins <- with( dat, (h=="h")*60*hours + (h=="min")*hours +
ifelse( is.na(minutes), 0, minutes) )
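For a reproducible check, the sketch below rebuilds the seven rows printed in the question as dat (the construction is an assumption, since the original data source is not shown) and runs the same expression. It also illustrates why the second ifelse() in the question produced NA: comparing NA with == yields NA, and ifelse() returns NA wherever its test is NA.

```r
# Rebuild the seven rows printed in the question (assumed to match the real data)
dat <- data.frame(
  hours   = c(70, 21, 106, 75, 14, 82, 1),
  h       = c("min", "min", "min", "min", "min", "min", "h"),
  minutes = c(NA, NA, NA, NA, NA, NA, 11),
  min     = c(NA, NA, NA, NA, NA, NA, "min"),
  stringsAsFactors = FALSE
)

# The test is NA wherever min is NA, so ifelse() returns NA for those rows
naive <- ifelse(dat$min == "min", dat$hours + dat$minutes, dat$hours)

# Same expression as above: is.na() is checked first, so no NA leaks into the sum
total <- with(dat, (h == "h") * 60 * hours + (h == "min") * hours +
                   ifelse(is.na(minutes), 0, minutes))

total       # per-row minutes
sum(total)  # overall total
```

With these seven rows, total matches the vector printed above and sum(total) is 439.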
You want something like this:
indices <- which(runTimeSep$h == 'h')
runTimeSep[indices,]$hours <- 60*runTimeSep[indices, ]$hours +
runTimeSep[indices,]$minutes
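Regarding the follow-up in the question: the attempt with && returned empty index sets because && is the scalar operator. Older R versions looked only at the first element (FALSE for row 1), and recent versions raise an error for longer vectors. Comparisons against NA also yield NA, so min != 'min' never becomes TRUE for the rows without minutes. A minimal sketch using the vectorized & together with is.na() follows; the data frame is reconstructed from the printout, and the eighth row ("2 h" with no minutes) is a hypothetical addition to exercise the hour-only case.

```r
# Rows 1-7 reconstructed from the question; row 8 is hypothetical (hour only)
runTimeSep <- data.frame(
  hours   = c(70, 21, 106, 75, 14, 82, 1, 2),
  h       = c("min", "min", "min", "min", "min", "min", "h", "h"),
  minutes = c(NA, NA, NA, NA, NA, NA, 11, NA),
  min     = c(NA, NA, NA, NA, NA, NA, "min", NA),
  stringsAsFactors = FALSE
)

# Vectorized `&` works element by element; is.na() handles the NA case explicitly
isHour  <- runTimeSep$h == "h"
hasMins <- !is.na(runTimeSep$min) & runTimeSep$min == "min"

# Compute both index sets before modifying the hours column
hourOnly <- which(isHour & !hasMins)  # hours with no minutes part
hourMins <- which(isHour & hasMins)   # hours followed by minutes

runTimeSep$hours[hourOnly] <- 60 * runTimeSep$hours[hourOnly]
runTimeSep$hours[hourMins] <- 60 * runTimeSep$hours[hourMins] +
  runTimeSep$minutes[hourMins]

runTimeSep$hours
```

Rows where h is "min" are left untouched, so the hours column ends up holding total minutes for every row.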