I am trying to do some simple benchmarking in R. I have a data frame with several numeric columns and a number of factors.
What I am trying to do is find the top decile and top quartile of a variable called ALoS within each level of an associated factor, and then attach these values back to the original data frame.
In Excel this would be the equivalent of an array formula similar to: {=percentile(if(Factor_range = Factor, ALoS_range),k)}
You seem to have two questions. As for the first, to compute the quantiles, since you haven't provided us with a dataset, I'll make up one. See if the following answers the question.
set.seed(954)
dat <- data.frame(A = sample(letters[1:3], 20, TRUE), X = rnorm(20))
dat
quantile(dat$X[dat$A == "a"], probs = c(0.75, 0.90))
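To get both quantiles for every level of A at once, rather than for "a" only, one option is to split X by A and apply quantile to each piece. This is only a sketch on the same made-up data; the name q_by_group is mine.

```r
# Same made-up data as above
set.seed(954)
dat <- data.frame(A = sample(letters[1:3], 20, TRUE), X = rnorm(20))

# Split X by the levels of A, then compute both quantiles within each group.
# sapply() simplifies the result to a matrix: one row per probability,
# one column per level of A.
q_by_group <- sapply(split(dat$X, dat$A), quantile, probs = c(0.75, 0.90))
q_by_group
```

Each column of q_by_group holds the top quartile and top decile for one level of A.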
As for the second question, attaching the values back to the data frame, I don't understand what you mean. Please give us an example of the expected output.
This is a great time to use the ave function:
dat$top_q <- ave(dat$X, dat$A, FUN = function(x) quantile(x, .75))
dat$top_d <- ave(dat$X, dat$A, FUN = function(x) quantile(x, .9))
A X top_q top_d
1 a 1.7150650 1.346828 1.5677700
2 b 0.4609162 0.390532 0.4308438
3 a -1.2650612 1.346828 1.5677700
4 b -0.6868529 0.390532 0.4308438
5 b -0.4456620 0.390532 0.4308438
6 a 1.2240818 1.346828 1.5677700
7 b 0.3598138 0.390532 0.4308438
8 b 0.4007715 0.390532 0.4308438
9 b 0.1106827 0.390532 0.4308438
10 a -0.5558411 1.346828 1.5677700
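Applied to the naming in the question, the same two calls might look like the sketch below. The data frame los, the factor Ward, and the values are hypothetical stand-ins, since the question does not show its data.

```r
# Hypothetical stand-in for the asker's data: ALoS plus one factor column.
# The names `los` and `Ward` and all values are illustrative only.
los <- data.frame(
  Ward = c("X", "X", "X", "Y", "Y", "Y", "Y"),
  ALoS = c(2, 4, 6, 1, 3, 5, 7)
)

# ave() returns a vector as long as ALoS, so every row receives
# the quantile computed within its own Ward.
los$top_quartile <- ave(los$ALoS, los$Ward, FUN = function(x) quantile(x, 0.75))
los$top_decile   <- ave(los$ALoS, los$Ward, FUN = function(x) quantile(x, 0.90))
los
```

Because ave accepts more than one grouping vector, passing a second factor after los$Ward would compute the quantiles within each combination of levels.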
The example data used in this answer were generated with:
set.seed(123)
dat <- data.frame(A = sample(letters[1:2], 10, TRUE), X = rnorm(10))
A X
1 a 1.7150650
2 b 0.4609162
3 a -1.2650612
4 b -0.6868529
5 b -0.4456620
6 a 1.2240818
7 b 0.3598138
8 b 0.4007715
9 b 0.1106827
10 a -0.5558411
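If a separate lookup table of benchmarks is also wanted, an alternative to ave is to compute one row per group and merge it back. The sketch below hard-codes the printed data so it does not depend on the random seed; the names bench and dat_merged are mine.

```r
# Data copied from the printed example above
dat <- data.frame(
  A = c("a", "b", "a", "b", "b", "a", "b", "b", "b", "a"),
  X = c(1.7150650, 0.4609162, -1.2650612, -0.6868529, -0.4456620,
        1.2240818, 0.3598138, 0.4007715, 0.1106827, -0.5558411)
)

# One row per level of A, holding that group's 75th and 90th percentiles
groups <- split(dat$X, dat$A)
bench <- data.frame(
  A     = names(groups),
  top_q = sapply(groups, quantile, probs = 0.75, names = FALSE),
  top_d = sapply(groups, quantile, probs = 0.90, names = FALSE)
)

# Join the benchmarks back onto the original rows by the factor column.
# merge() sorts by the key, so the row order differs from dat.
dat_merged <- merge(dat, bench, by = "A")
dat_merged
```

The ave version keeps the original row order and needs no join; the merge version leaves a small bench table that can be reused or reported on its own.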