
R: sample a row of a dataframe based on a percentile value

I have a dataframe df with 500 rows. I would like to sample a row of df based on a given percentile value.

For example, if the percentile value is 0.73, I would like to sample the 0.73 * 500 = 365th row. If it is 0.7212, I would like to sample the 0.7212 * 500 = 360.6, i.e. the 361st row (rounded to the nearest integer value).

As an extension of this, I would like to do the same with a vector of percentile values. For example, if the vector is c(0.73,0.7212), I would like to return a dataframe that consists of rows 365 and 361 of the original dataframe df.

What would be the best approach to this?

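# Return the rows of `data` at the given percentile(s) p (vectorised over p)
percentile_index <- function(data, p){
  # ceiling() rounds up, so any p in (0, 1] maps to a row in 1..nrow(data)
  rows <- ceiling(p * nrow(data))
  data[rows, ]
}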

Example:

dat <- data.frame(col1 = 1:500, col2 = rnorm(500))
percentile_index(dat, c(0.73,0.7212))
#    col1      col2
#365  365 0.1910813
#361  361 0.3870956

or simply:

dat[ceiling(c(0.73,0.7212) * nrow(dat)), ]
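
Note that ceiling() always rounds up, while the question asks for rounding to the nearest integer. The two agree for 0.73 and 0.7212, but they can differ elsewhere (0.7201 * 500 = 360.05 gives 361 with ceiling() and 360 with round()). Below is a minimal sketch of a nearest-row variant. The name percentile_index_nearest and the clamping of the index to 1..nrow(data) are my own assumptions, not part of the original answer. Also be aware that R's round() sends exact halves to the even integer, so round(360.5) is 360.

```r
# Hypothetical variant of percentile_index(): rounds to the nearest row
# instead of always rounding up, and keeps the index within 1..nrow(data).
percentile_index_nearest <- function(data, p) {
  stopifnot(is.numeric(p), all(p >= 0 & p <= 1))
  n <- nrow(data)
  rows <- round(p * n)            # nearest integer (exact halves go to even)
  rows <- pmin(pmax(rows, 1), n)  # assumption: p = 0 maps to row 1, not row 0
  data[rows, , drop = FALSE]      # drop = FALSE keeps a data frame even with one column
}

dat <- data.frame(col1 = 1:500, col2 = rnorm(500))
res <- percentile_index_nearest(dat, c(0.73, 0.7212, 0))
res$col1  # selected row numbers: 365 361 1
```

Without the clamping step, p = 0 would produce index 0, which R silently drops, so the result would have fewer rows than the input vector has elements.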
