
R: sample a row of a dataframe based on a percentile value

I have a dataframe df with 500 rows. I would like to sample the row of df based on a given percentile value.

For example, if the percentile value is 0.73, I would like to sample row 0.73 * 500 = 365. If it is 0.7212, I would like to sample row 0.7212 * 500 = 360.6, which becomes row 361 once rounded to the nearest integer.

As an extension of this, I would like to do the same with a vector of percentile values. For example, if the vector is c(0.73, 0.7212), I would like to return a dataframe that consists of rows 365 and 361 of the original dataframe df.

What would be the best approach to this?

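# Return the rows of `data` located at the given percentile positions.
# `p` may be a single value or a vector of values in (0, 1].
percentile_index <- function(data, p){
  # Convert each percentile to a row number, rounding up to a whole row
  rows <- ceiling(p * nrow(data))
  data[rows, ]
}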

Example:

dat <- data.frame(col1 = 1:500, col2 = rnorm(500))
percentile_index(dat, c(0.73,0.7212))
#    col1      col2
#365  365 0.1910813
#361  361 0.3870956

or simply:

dat[ceiling(c(0.73,0.7212) * nrow(dat)), ]
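
The question asks for rounding to the nearest integer, while the code above always rounds up with ceiling(). The two agree for 0.73 and 0.7212, but they can differ for other values (for example, a position of 360.2 would give row 361 with ceiling() and row 360 with round()). Below is a minimal sketch of a round-to-nearest variant. The name percentile_index_round is hypothetical and not part of the original answer, and clamping the index to 1..nrow(data) is an added assumption so that p = 0 does not select row 0.

```r
# Hypothetical helper (not from the original answer): round to the nearest
# row instead of always rounding up, and keep indices inside 1..nrow(data).
percentile_index_round <- function(data, p) {
  stopifnot(is.numeric(p), all(p >= 0 & p <= 1))
  # Nearest integer; note that R's round() sends exact .5 ties to the even integer
  rows <- round(p * nrow(data))
  # Assumption: clamp so that p = 0 maps to the first row rather than row 0
  rows <- pmin(pmax(rows, 1), nrow(data))
  data[rows, , drop = FALSE]
}

dat <- data.frame(col1 = 1:500, col2 = rnorm(500))
res <- percentile_index_round(dat, c(0.73, 0.7212, 0))
res$col1
# [1] 365 361   1
```

Using drop = FALSE keeps the result a dataframe even if data has only one column.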
