R: sample a row of a dataframe based on a percentile value
I have a dataframe df with 500 rows. I would like to sample a row of df based on a given percentile value.
For example, if the percentile value is 0.73, I would like to sample row 0.73 * 500 = 365. If it is 0.7212, I would like to sample row 0.7212 * 500 = 360.6, which becomes row 361 once rounded to the nearest integer.
As an extension of this, I would like to do the same with a vector of percentile values. For example, if the vector is c(0.73, 0.7212), then I would like to return a dataframe that consists of rows 365 and 361 of the original dataframe df.
What would be the best approach to this?
percentile_index <- function(data, p) {
  # ceiling() rounds up, so 360.6 becomes 361
  rows <- ceiling(p * nrow(data))
  data[rows, ]
}
Example:
dat <- data.frame(col1 = 1:500, col2 = rnorm(500))
percentile_index(dat, c(0.73,0.7212))
#     col1      col2
# 365  365 0.1910813
# 361  361 0.3870956
Or simply:
dat[ceiling(c(0.73,0.7212) * nrow(dat)), ]
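Note that ceiling() always rounds up, while the question asks for the nearest integer. The two agree for 0.73 and 0.7212 but can differ elsewhere: 0.7201 * 500 = 360.05 gives row 361 with ceiling() and row 360 with round(). A minimal sketch of a nearest-integer variant follows. The name percentile_index_nearest and the clamping to the range 1..nrow(data) are assumptions added for illustration and are not part of the original answer.

```r
# Hypothetical variant of percentile_index(): rounds to the nearest row
# and clamps the index to the valid range 1..nrow(data).
percentile_index_nearest <- function(data, p) {
  stopifnot(is.numeric(p), all(p >= 0 & p <= 1))
  rows <- round(p * nrow(data))
  # Without clamping, p = 0 would give row 0, which selects nothing.
  rows <- pmin(pmax(rows, 1), nrow(data))
  data[rows, , drop = FALSE]
}

dat <- data.frame(col1 = 1:500, col2 = rnorm(500))
res <- percentile_index_nearest(dat, c(0.73, 0.7212, 0.7201, 0))
res$col1
# 365 361 360 1
```

Be aware that R's round() follows the IEC 60559 convention of rounding exact halves to the even neighbour, so a product such as 360.5 would map to row 360 rather than 361.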