R sample a row of a dataframe based on a percentile value

Question

I have a dataframe df with 500 rows. I would like to sample the row of df based on a given percentile value.

For example, if the percentile value is 0.73, I would like to sample row 0.73 * 500 = 365. If it is 0.7212, I would like to sample row 0.7212 * 500 = 360.6, which rounds to 361 (the nearest integer).

As an extension of this, I would like to do the same with a vector of percentile values. For example, if the vector is c(0.73, 0.7212), I would like to return a dataframe that consists of rows 365 and 361 of the original dataframe df.

What would be the best approach to this?

Answer 1

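percentile_index <- function(data, p){
  # Convert each percentile to a row index; ceiling() always rounds up,
  # so 360.6 becomes 361 (and p = 0 gives index 0, which selects no row)
  rows <- ceiling(p * nrow(data))
  data[rows, ]
}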

Example:

dat <- data.frame(col1 = 1:500, col2 = rnorm(500))
percentile_index(dat, c(0.73,0.7212))
#    col1      col2
#365  365 0.1910813
#361  361 0.3870956

or simply:

dat[ceiling(c(0.73,0.7212) * nrow(dat)), ]
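
Note that ceiling() always rounds up, while the question asks for the nearest integer; the two agree for 360.6 but differ for a value such as 360.1. Below is a minimal sketch of a nearest-integer variant. The helper name percentile_index_round, the half-up rounding rule, and the clamping to the range 1..nrow(data) are assumptions for illustration and are not part of the answer above.

```r
# Hypothetical helper (not from the original answer): picks rows by rounding
# p * nrow(data) to the nearest integer instead of always rounding up.
percentile_index_round <- function(data, p) {
  stopifnot(is.numeric(p), all(p >= 0 & p <= 1))
  n <- nrow(data)
  # floor(x + 0.5) rounds halves up; base round() rounds halves to even
  rows <- floor(p * n + 0.5)
  # Assumption: clamp to 1..n so that p = 0 does not produce index 0
  rows <- pmin(pmax(rows, 1), n)
  data[rows, , drop = FALSE]
}

dat <- data.frame(col1 = 1:500, col2 = rnorm(500))
res <- percentile_index_round(dat, c(0.73, 0.7212, 0.7202, 0))
res$col1
# 365 361 360 1
```

Here 0.7202 * 500 = 360.1 maps to row 360, whereas the ceiling() version would return row 361. The drop = FALSE argument keeps the result a dataframe even if data has a single column.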

solution1
2 ACCEPTED 2022-11-29 16:29:38
