
Summarize to quantiles using dplyr?

Suppose one is manipulating a dataframe in dplyr, and one would like to summarize one's data into a table with a column for each decile. Setting aside the question of why one would do this, there remains the question of how.

It has been noted before that summarize does not like vector-valued functions. As mentioned in that post, the most literal-minded way of doing it is simply to create an explicit column for each decile:

df <- data.frame(value=rnorm(1000)) %>%
    summarize(`0.1` = quantile(value, 0.1),
              `0.2` = quantile(value, 0.2), 
              `0.3` = quantile(value, 0.3),
              ...)

This, obviously, is vile. Yet it is not immediately obvious to me how to use ddply or do, as mentioned in the linked question, to accomplish this goal. And it just feels like there ought to be a "tidy" way to do this, along the lines of:

df <- data.frame(value=rnorm(1000)) %>%
    summarize(quantiles = quantile(value, seq(0.1, 0.9, 0.1))) %>%
    expand_vector_to_columns()

Is there?

This might do it:

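library(dplyr)  # provides %>%

df <- data.frame(value = rnorm(1000)) %>%
    unlist() %>%                        # flatten the single column to a numeric vector
    quantile(seq(0.1, 0.9, 0.1)) %>%    # named vector of the nine deciles
    matrix(., 1, 9, dimnames = list(NULL, names(.))) %>%  # one row, decile names as column names
    as.data.frame(., col.names = colnames(.))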
#df
#     10%     20%     30%     40%    50%    60%    70%    80%   90%
#1 -1.275 -0.8528 -0.5258 -0.2353 0.0303 0.3051 0.5732 0.8918 1.278
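
A shorter variant of the same idea, as a rough sketch rather than a definitive answer: t() already turns the named quantile vector into a one-row matrix, so the explicit matrix() call may not be needed. The second form assumes dplyr >= 1.0.0, where an unnamed data-frame result inside summarize() is unpacked into separate columns; it would also work per group after group_by(). The names probs, wide_base and wide_dplyr are made up for illustration.

```r
library(dplyr)

set.seed(42)  # only so the sketch is reproducible
df <- data.frame(value = rnorm(1000))
probs <- seq(0.1, 0.9, 0.1)

# Base R: t() turns the named quantile vector into a 1 x 9 matrix,
# which as.data.frame() converts to one column per decile.
wide_base <- as.data.frame(t(quantile(df$value, probs)))

# dplyr (assumption: version >= 1.0.0): the unnamed one-row data frame
# returned inside summarize() is spread into one column per decile.
wide_dplyr <- df %>%
    summarize(as.data.frame(t(quantile(value, probs))))

names(wide_dplyr)
```

Both results should have the columns `10%` through `90%`; because the names are not syntactic, refer to them with backticks, e.g. wide_dplyr$`10%`.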


 