Cyclists record pedalling power over time and analyse a curve derived from it that plots, for every time interval, the highest power maintained continuously for at least that long: for example, "for 20 continuous minutes, you maintained 248 W or more." I'd like to compute this in R, starting from small time intervals `dt` and corresponding power values `pwr`:
df <- data.frame(dt = rnorm(15,2,1), pwr = rnorm(15,250,50))
A simple but inaccurate way is this:
library(ggplot2)
df <- data.frame(dt = rnorm(15, 2, 1), pwr = rnorm(15, 250, 50))
df <- df[with(df, order(-pwr)), ]  # sort by descending power
df$s <- cumsum(df$dt)              # total time spent at or above each power
p <- ggplot(df, aes(s, pwr)) + geom_line()
ggsave("pwr.png", p)
dt pwr s
10 0.9955972 323.3430 0.9955972
7 2.5057756 295.2261 3.5013728
2 0.5074293 293.4071 4.0088021
15 1.1912498 285.6561 5.2000519
8 3.3259203 281.7460 8.5259722
13 1.4008969 266.2108 9.9268691
1 4.2681574 265.4673 14.1950265
12 0.1884451 258.5368 14.3834716
6 2.0126561 247.0550 16.3961277
11 4.3295127 242.8312 20.7256404
5 1.9477712 237.8359 22.6734115
4 1.1545416 213.1518 23.8279531
3 0.9062592 191.6465 24.7342123
9 0.8966972 184.8294 25.6309095
14 0.3863399 183.8604 26.0172494
The graph shows that I was able to maintain about 270 W or more for 10 s, or about 240 W or more for 20 s.
The problem with that approach: suppose I maintain 250 W or more for 10 min, drop below 250 W, and then maintain 250 W or more for another 5 min. The sums above would report 15 min at 250 W or more, but I only held 250 W or more *continuously* for 10 min. So, for any given power, I need the maximum length of time I was able to maintain it continuously, not the total time spent at or above it.
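To make the distinction concrete, here is a small base-R sketch (my own illustration, not part of the question's code) assuming constant one-second samples and a made-up ride; `rle` finds the contiguous runs at or above a threshold:

```r
# Made-up ride: 10 min at 260 W, 1 min at 200 W, 5 min at 260 W,
# sampled once per second.
pwr <- c(rep(260, 600), rep(200, 60), rep(260, 300))

threshold <- 250
above <- pwr >= threshold

# Total time at or above the threshold: 15 min.
sum(above)                      # 900 seconds

# Longest *continuous* time at or above the threshold: 10 min.
runs <- rle(above)
max(runs$lengths[runs$values])  # 600 seconds
```

The naive cumulative sum reports 900 s, but the longest continuous stretch is only 600 s.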
There are more efficient calculations that could be done with base `rle` or data.table, but this tidyverse approach is legible to me and should be adequate for data that isn't extremely large. In my testing, the code below took about 5 seconds to run for 100,000 observations.
My approach is to make a function that finds the cumulative min for any given window size (using `slider::slide_index` so that the window is defined by time, not a fixed number of observations), and then output the max of these. Then I iteratively feed a range of window sizes into `purrr::map_dbl` to get the max_min for each size.
set.seed(0)
# This gets too sluggish for my taste for 1M+ observations, but
# seems fine for me for 100k.
df <- data.frame(dt = runif(1E5,min = 0.01, 5), pwr = rnorm(1E5,250,50))
library(tidyverse); library(slider)
select_max_min <- function(df, window_size) {
  df %>%
    mutate(time = cumsum(dt)) %>%
    # rolling min over a time-based window of width window_size
    mutate(cuml_min = slide_index_dbl(pwr, time, min,
                                      .before = window_size,
                                      .complete = TRUE)) %>%
    # na.rm drops the NAs from incomplete windows at the start, which
    # would otherwise inflate the max
    summarize(max_min = max(cuml_min, na.rm = TRUE)) %>%
    pull(max_min)
}
data.frame(window = seq(1, 30, length.out = 100)) %>%
  mutate(max_min = map_dbl(window, ~ select_max_min(df, .x))) %>%
  ggplot(aes(window, max_min)) +
  geom_line()
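For completeness, here is a hedged sketch of the `rle`-based alternative mentioned above (my own code, not the answer's; `longest_run_time` is a hypothetical helper name). Instead of sweeping window sizes, it sweeps power thresholds and reports, for each, the longest continuous time at or above that power, summing `dt` rather than counting observations:

```r
# For a given power threshold, find the longest continuous time
# (in the units of dt) spent at or above it.
longest_run_time <- function(dt, pwr, threshold) {
  runs <- rle(pwr >= threshold)
  if (!any(runs$values)) return(0)
  # Label each observation with the id of its contiguous run,
  # then total dt within each run.
  run_id <- rep(seq_along(runs$lengths), runs$lengths)
  run_time <- tapply(dt, run_id, sum)
  max(run_time[runs$values])
}

set.seed(0)
df <- data.frame(dt = runif(1e5, min = 0.01, 5), pwr = rnorm(1e5, 250, 50))
thresholds <- seq(150, 350, by = 5)
curve <- sapply(thresholds, function(th) longest_run_time(df$dt, df$pwr, th))
plot(thresholds, curve, type = "l",
     xlab = "power threshold (W)", ylab = "max continuous time (s)")
```

This produces the same curve with the axes swapped: power on the x-axis, maximum continuous duration on the y-axis.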