Cyclists record pedalling power over time and analyse a curve derived from it that plots, for every time interval, the highest power maintained continuously for at least that long: for example, "for 20 continuous minutes, you maintained 248 W or more." I'd like to compute this in R, starting from small time intervals `dt` and corresponding power values `pwr`:
df <- data.frame(dt = rnorm(15,2,1), pwr = rnorm(15,250,50))
A simple but inaccurate way is this:
library(ggplot2)
df <- data.frame(dt = rnorm(15, 2, 1), pwr = rnorm(15, 250, 50))
df <- df[with(df, order(-pwr)), ]  # sort by descending power
df$s <- cumsum(df$dt)              # total time spent at or above each power
p <- ggplot(df, aes(s, pwr)) + geom_line()
ggsave("pwr.png", p)
dt pwr s
10 0.9955972 323.3430 0.9955972
7 2.5057756 295.2261 3.5013728
2 0.5074293 293.4071 4.0088021
15 1.1912498 285.6561 5.2000519
8 3.3259203 281.7460 8.5259722
13 1.4008969 266.2108 9.9268691
1 4.2681574 265.4673 14.1950265
12 0.1884451 258.5368 14.3834716
6 2.0126561 247.0550 16.3961277
11 4.3295127 242.8312 20.7256404
5 1.9477712 237.8359 22.6734115
4 1.1545416 213.1518 23.8279531
3 0.9062592 191.6465 24.7342123
9 0.8966972 184.8294 25.6309095
14 0.3863399 183.8604 26.0172494
The graph shows that I was able to maintain about 270 W or more for 10 s, or about 240 W or more for 20 s.
The problem with that approach: suppose I maintain 250 W or more for 10 min, drop below 250 W, and then maintain 250 W or more for another 5 min. The sums above would report 15 min at 250 W or more, but I only held 250 W or more *continuously* for 10 min. So, for any given power, I need the maximum length of time I was able to maintain it continuously, not the total time spent at or above it.
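To make the distinction concrete, here is a small base-R sketch (my own illustration, not part of the question's code) assuming constant one-second samples and a made-up ride; `rle` finds the contiguous runs at or above a threshold:

```r
# Made-up ride: 10 min at 260 W, 1 min at 200 W, 5 min at 260 W,
# sampled once per second.
pwr <- c(rep(260, 600), rep(200, 60), rep(260, 300))

threshold <- 250
above <- pwr >= threshold

# Total time at or above the threshold: 15 min.
sum(above)                      # 900 seconds

# Longest *continuous* time at or above the threshold: 10 min.
runs <- rle(above)
max(runs$lengths[runs$values])  # 600 seconds
```

The naive cumulative sum reports 900 s, but the longest continuous stretch is only 600 s.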
There are more efficient calculations that could be done with base `rle` or data.table, but this tidyverse approach is legible to me and should be adequate for data that isn't extremely large. In my testing, the code below took about 5 seconds to run for 100,000 observations.
My approach is to make a function that finds the cumulative min for any given window size (using `slider::slide_index` so that the window is defined by time, not a fixed number of observations), and then output the max of these. Then I iteratively feed a range of window sizes into `purrr::map_dbl` to get the max_min for each size.
set.seed(0)
# This gets too sluggish for my taste for 1M+ observations, but
# seems fine for me for 100k.
df <- data.frame(dt = runif(1E5,min = 0.01, 5), pwr = rnorm(1E5,250,50))
library(tidyverse); library(slider)
select_max_min <- function(df, window_size) {
  df %>%
    mutate(time = cumsum(dt)) %>%
    # rolling min over a time-based window of width window_size
    mutate(cuml_min = slide_index_dbl(pwr, time, min,
                                      .before = window_size,
                                      .complete = TRUE)) %>%
    # na.rm drops the NAs from incomplete windows at the start, which
    # would otherwise inflate the max
    summarize(max_min = max(cuml_min, na.rm = TRUE)) %>%
    pull(max_min)
}
data.frame(window = seq(1, 30, length.out = 100)) %>%
  mutate(max_min = map_dbl(window, ~ select_max_min(df, .x))) %>%
  ggplot(aes(window, max_min)) +
  geom_line()
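For completeness, here is a hedged sketch of the `rle`-based alternative mentioned above (my own code, not the answer's; `longest_run_time` is a hypothetical helper name). Instead of sweeping window sizes, it sweeps power thresholds and reports, for each, the longest continuous time at or above that power, summing `dt` rather than counting observations:

```r
# For a given power threshold, find the longest continuous time
# (in the units of dt) spent at or above it.
longest_run_time <- function(dt, pwr, threshold) {
  runs <- rle(pwr >= threshold)
  if (!any(runs$values)) return(0)
  # Label each observation with the id of its contiguous run,
  # then total dt within each run.
  run_id <- rep(seq_along(runs$lengths), runs$lengths)
  run_time <- tapply(dt, run_id, sum)
  max(run_time[runs$values])
}

set.seed(0)
df <- data.frame(dt = runif(1e5, min = 0.01, 5), pwr = rnorm(1e5, 250, 50))
thresholds <- seq(150, 350, by = 5)
curve <- sapply(thresholds, function(th) longest_run_time(df$dt, df$pwr, th))
plot(thresholds, curve, type = "l",
     xlab = "power threshold (W)", ylab = "max continuous time (s)")
```

This produces the same curve with the axes swapped: power on the x-axis, maximum continuous duration on the y-axis.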