
Calculating percentiles for multiple columns in R

I need to calculate the quantiles at the following probability values: 0.05, 0.25, 0.50, 0.75, 0.90, 0.95, 0.99 and 1, for 100 variables (excluding time).

The data structure is shown below.

Dataset name: df

time  Var1  var2  var3  ...  var100
1     100   230   378   ...  300
2     200   145   129   ...  240
3     150   235   200   ...  690

I am using the logic below.

percentiles <- do.call("rbind",tapply(df[2:100],quantile,probs=c(0,0.05,0.25,0.50,0.75,0.90,0.95,0.99,1),na.rm=TRUE))

Since tapply runs only on vectors, it would be difficult to call it for all 100 variables.

Why use tapply? Plain apply over the columns seems fine here, e.g.:

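quants <- c(0, 0.05, 0.25, 0.50, 0.75, 0.90, 0.95, 0.99, 1)
# df[-1] drops the time column and keeps all 100 variables;
# df[2:100] would stop at column 100 and miss var100 (column 101)
apply(df[-1], 2, quantile, probs = quants, na.rm = TRUE)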
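
To check the shape of the result, the column-wise approach can be tried on a small made-up data frame. The sketch below is only an illustration: the values and the three-variable layout are invented, and it uses sapply rather than apply, which is a substitution. apply first converts the data frame to a matrix, while sapply visits each column in turn. For all-numeric columns the two should give the same matrix.

```r
# Toy data (hypothetical values): a time column plus three variables
df <- data.frame(
  time = 1:5,
  var1 = c(100, 200, 150, NA, 120),
  var2 = c(230, 145, 235, 180, 210),
  var3 = c(378, 129, 200, 310, 250)
)

quants <- c(0, 0.05, 0.25, 0.50, 0.75, 0.90, 0.95, 0.99, 1)

# Drop the first column (time) by position, then compute the
# quantiles of every remaining column, ignoring missing values
percentiles <- sapply(df[-1], quantile, probs = quants, na.rm = TRUE)

# One row per probability, one column per variable
print(dim(percentiles))
print(percentiles)
```

The result is a matrix whose row names are the percentile labels produced by quantile and whose column names are the variable names, so a single value can be looked up with, for example, percentiles["50%", "var2"].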
