
R method of summarising vector up to certain quantile(s)

I have data relating to 36 regions of interest (ROI), approx. 380 pixels per ROI. My data is like:

      ROI_name    T_K
1   bt_full_05 303.88
1.1 bt_full_05 303.93
1.2 bt_full_05 303.72
1.3 bt_full_05 303.43
1.4 bt_full_05 302.93
1.5 bt_full_05 302.93
...
36.362 bt_full_40 301.65
36.363 bt_full_40 301.47
36.364 bt_full_40 301.52
36.365 bt_full_40 302.02
36.366 bt_full_40 303.28
36.367 bt_full_40 303.78

I want to compute the mean T_K for each ROI, but keep only the values at or below a given quantile, e.g. 0.25, and output the mean of those values. Ideally I could report the mean T_K for several quantiles: 0.1, 0.25, 0.5... I have:

groupquant <- cleared_data %>% group_by(ROI_name) %>% 
  summarise(quants = quantile(T_K, 0.1))

which gives me the quantiles. But this

groupquant <- cleared_data %>% group_by(ROI_name) %>% 
  filter(cleared_data$T_K <= quantile(T_K, 0.1)) #%>% 

throws

Error: Result must have length 392, not 14082

I'm getting nowhere! Cheers, Andrew.

I think the sample data is a bit too small to demonstrate what you want to do, so I created my own data, called foo. For each ROI_name, I removed some data with filter(): all values that are not greater than quantile(T_K, 0.25) are removed. Then I took the values at two quantile points (i.e., 0.5 and 0.75). In the summarize() part, I get a vector with two numeric values and turn it into a data frame for each group. Finally, I used unnest() to create the final output.

library(tidyverse)

set.seed(111)

foo <- tibble(ROI_name = rep(c("bt_full_05", "bt_full_40", "bt_full_2"), each = 30),
              T_K = runif(n = 90, min = 0, max = 300))

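# Per ROI: drop values at or below the 25th percentile, then take the
# 50th and 75th percentiles of what remains
group_by(foo, ROI_name) %>%
  filter(T_K > quantile(T_K, 0.25)) %>%
  summarize(temp = list(enframe(quantile(x = T_K, probs = c(0.5, 0.75)),
                                name = "percentile"))) %>%
  unnest(temp)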

  ROI_name   percentile value
  <chr>      <chr>      <dbl>
1 bt_full_05 50%         157.
2 bt_full_05 75%         183.
3 bt_full_2  50%         157.
4 bt_full_2  75%         229.
5 bt_full_40 50%         192.
6 bt_full_40 75%         237.
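
As for the error in the question: inside a grouped filter(), cleared_data$T_K bypasses the grouping and returns the whole column (14082 values), while the group being processed has only 392 rows. Using the bare column name T_K keeps the comparison within each group. The code above reports percentiles of the values above the cut-off; if what you need is the mean of the values up to each quantile, a minimal sketch along the same lines could look like this. It reuses the simulated foo as a stand-in for cleared_data, and the names qs, low_10 and mean_upto are only illustrative.

```r
library(tidyverse)

set.seed(111)

# Simulated stand-in for cleared_data (same construction as foo above)
foo <- tibble(ROI_name = rep(c("bt_full_05", "bt_full_40", "bt_full_2"), each = 30),
              T_K = runif(n = 90, min = 0, max = 300))

# 1. The filter from the question, written with the bare column name so the
#    quantile and the comparison are both evaluated per ROI
low_10 <- foo %>%
  group_by(ROI_name) %>%
  filter(T_K <= quantile(T_K, 0.1)) %>%
  summarise(mean_T_K = mean(T_K), n = n())

# 2. Mean of the values at or below each of several quantiles.
#    qs is an illustrative name for the cut-offs of interest.
qs <- c(0.1, 0.25, 0.5)

mean_upto <- foo %>%
  group_by(ROI_name) %>%
  summarise(temp = list(tibble(
    prob = qs,
    # for each cut-off p, average only the values up to that quantile
    mean_T_K = sapply(qs, function(p) mean(T_K[T_K <= quantile(T_K, p)]))
  ))) %>%
  unnest(temp)

mean_upto
```

This gives one row per ROI and cut-off. The list column plus unnest() is the same pattern as above; it is used here because returning several rows per group directly from summarise() is deprecated in recent dplyr versions.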
