How do I create boxplots in ggplot2 by subsetting a column into ranges to get the X values?

I'm trying to create boxplots by subsetting data in one column to act as x and then using a second column for the y values.

  Spot Vol Spot_Tot_Int Spot_Max_Int Spot_Background Spot_Int/Bkg Spot_IntMax/Bkg Spot_Int-Bkg Spot_Z_Pos Spot_X_Pos Spot_Y_Pos
1       47        14757          488        47.58763     310.1016       10.254766     12520.38          4         27         79
2       46        13197          409        46.24423     285.3761        8.844346     11069.77          4         49        936
3       47        17838          573        66.40580     268.6211        8.628765     14716.93          4         63        844
4       38        12484          527        57.01034     218.9778        9.243938     10317.61          4        125        942
5       45        15113          604        43.97189     343.6969       13.736049     13134.27          4        134        891
6       40        13684          578        52.34335     261.4277       11.042473     11590.27          4        204        434

I'm trying to use Spot_Z_Pos as X, but broken down into 3 ranges (1-10, 11-20, 21-30) instead of having a plot for each individual value 1-30. I would like the y value to be Spot_IntMax/Bkg. I can figure out how to do it in base R by creating three separate data frames of the subsets, but a similar approach isn't helping me in ggplot.

Thanks for your help!!

Hi, you could create a new variable for the group and then facet the plot by it. For a bar plot it would be something along these lines (to get a box plot instead, change the geom layer, for example to geom_boxplot()):

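library(dplyr)
library(ggplot2)

# df is the data frame shown in the question
df %>%
  # Assign each row to a Z-position group: 1-10, 11-20, 21-30, anything else
  dplyr::mutate(GROUP = case_when(Spot_Z_Pos < 11 ~ 1,
                                  Spot_Z_Pos < 21 ~ 2,
                                  Spot_Z_Pos < 31 ~ 3,
                                  TRUE ~ 4)) %>%
  # Backticks are needed because the column name contains a slash
  ggplot2::ggplot(aes(Spot_Z_Pos, `Spot_IntMax/Bkg`)) +
  ggplot2::geom_col() +
  # One panel per group
  ggplot2::facet_wrap( ~ GROUP)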

Note that I created group 4 for everything that is not smaller than 31, just in case the column contains something unexpected. Also note that there are more compact functions for cutting values into groups, such as base R's cut(). I just personally prefer case_when if the number of bins is small.
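
As one illustration of the more compact route, base R's cut() can bin Spot_Z_Pos in a single call, and the resulting factor can go straight onto the x axis so that all three boxplots share one panel. This is only a sketch: the data frame below is made up to mimic the question's columns, and the names Z_Range and p are purely illustrative.

```r
library(ggplot2)

# Made-up stand-in for the question's data (assumption: the real data
# has columns named Spot_Z_Pos and Spot_IntMax/Bkg)
set.seed(1)
df <- data.frame(
  Spot_Z_Pos = sample(1:30, 90, replace = TRUE),
  `Spot_IntMax/Bkg` = runif(90, min = 8, max = 14),
  check.names = FALSE  # keep the slash in the column name
)

# cut() uses right-closed intervals by default: (0,10], (10,20], (20,30].
# Values outside 1-30 would become NA.
df$Z_Range <- cut(df$Spot_Z_Pos,
                  breaks = c(0, 10, 20, 30),
                  labels = c("1-10", "11-20", "21-30"))

# One boxplot per range on a shared x axis
p <- ggplot(df, aes(x = Z_Range, y = `Spot_IntMax/Bkg`)) +
  geom_boxplot() +
  labs(x = "Spot_Z_Pos range", y = "Spot_IntMax/Bkg")

print(p)
```

Because Z_Range is a factor, ggplot2 treats it as a discrete axis and draws one box per level, which may be closer to what the question asks for than separate facets.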

You can also filter for a specific group before building the plot and omit the facet_wrap line. This will result in a single plot for that one group.
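
A rough sketch of that filtering variant, using a box plot and the same grouping logic as above. The data frame here is invented for demonstration, and one_group and p_one are hypothetical names, not anything from the question.

```r
library(dplyr)
library(ggplot2)

# Invented example data (assumption: mirrors the question's column names)
set.seed(1)
df <- data.frame(
  Spot_Z_Pos = sample(1:30, 90, replace = TRUE),
  `Spot_IntMax/Bkg` = runif(90, min = 8, max = 14),
  check.names = FALSE
)

# Build the group column, then keep only group 1 (Spot_Z_Pos below 11)
one_group <- df %>%
  mutate(GROUP = case_when(Spot_Z_Pos < 11 ~ 1,
                           Spot_Z_Pos < 21 ~ 2,
                           Spot_Z_Pos < 31 ~ 3,
                           TRUE ~ 4)) %>%
  filter(GROUP == 1)

# A single boxplot for the filtered group; no facet_wrap needed
p_one <- ggplot(one_group, aes(x = factor(GROUP), y = `Spot_IntMax/Bkg`)) +
  geom_boxplot() +
  labs(x = "GROUP")

print(p_one)
```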
