How do I create boxplots in ggplot2 by subsetting a column into ranges to get the X values?

I'm trying to create boxplots by splitting the data in one column into subsets that act as x, and using a second column for the y values.

Spot Vol Spot_Tot_Int Spot_Max_Int Spot_Background Spot_Int/Bkg Spot_IntMax/Bkg Spot_Int-Bkg Spot_Z_Pos Spot_X_Pos Spot_Y_Pos
1       47        14757          488        47.58763     310.1016       10.254766     12520.38          4         27         79
2       46        13197          409        46.24423     285.3761        8.844346     11069.77          4         49        936
3       47        17838          573        66.40580     268.6211        8.628765     14716.93          4         63        844
4       38        12484          527        57.01034     218.9778        9.243938     10317.61          4        125        942
5       45        15113          604        43.97189     343.6969       13.736049     13134.27          4        134        891
6       40        13684          578        52.34335     261.4277       11.042473     11590.27          4        204        434

I'm trying to use Spot_Z_Pos as X, but broken down into 3 ranges (1-10, 11-20, 21-30) instead of having a plot for each individual value from 1 to 30. I would like the y value to be Spot_IntMax/Bkg. I can figure out how to do it in base R by creating three separate data frames of the subsets, but a similar approach isn't helping me in ggplot.

Thanks for your help!!

Hi, you could create a new variable for the group and then facet the plot by it. For a bar plot it would look something like this (you can switch to a box plot by changing the geom layer):

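library(dplyr)
library(ggplot2)

# df is the data frame shown in the question
df %>%
  # Assign each row to a bin based on Spot_Z_Pos
  dplyr::mutate(GROUP = case_when(Spot_Z_Pos < 11 ~ 1,   # below 11 (1-10)
                                  Spot_Z_Pos < 21 ~ 2,   # 11-20
                                  Spot_Z_Pos < 31 ~ 3,   # 21-30
                                  TRUE ~ 4)) %>%         # everything else
  ggplot2::ggplot(aes(Spot_Z_Pos, `Spot_IntMax/Bkg`)) +
  ggplot2::geom_col() +          # swap for geom_boxplot() to get box plots
  ggplot2::facet_wrap( ~ GROUP)  # one panel per bin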
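
Since the question asks for box plots with the three ranges on the x axis, here is a minimal sketch of one way to adapt the snippet above: label the bins with text and map that label to x instead of faceting. The data frame is made up purely so the example runs on its own (only the two relevant columns, random values); the names `binned` and `p` are illustrative.

```r
library(dplyr)
library(ggplot2)

# Made-up stand-in for the asker's data (assumption: random values,
# only the two columns that the plot needs).
set.seed(1)
df <- data.frame(Spot_Z_Pos = sample(1:30, 90, replace = TRUE))
df[["Spot_IntMax/Bkg"]] <- runif(90, min = 8, max = 14)

# Same binning idea as above, but with readable labels for the x axis.
binned <- df %>%
  mutate(GROUP = case_when(
    Spot_Z_Pos < 11 ~ "1-10",
    Spot_Z_Pos < 21 ~ "11-20",
    Spot_Z_Pos < 31 ~ "21-30",
    TRUE ~ "other"
  ))

# One box per bin: the bin label is a discrete x, the ratio column is y.
p <- ggplot(binned, aes(x = GROUP, y = `Spot_IntMax/Bkg`)) +
  geom_boxplot()
print(p)
```

Because the column name contains a slash, it has to be wrapped in backticks inside `aes()`.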

Note that I created group 4 for everything that is not smaller than 31, just in case there is something unexpected in the column. Also note that there are more compact functions for cutting values into groups; I just personally prefer case_when when the number of bins is small.
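
One of the more compact options is probably base R's `cut()`. The sketch below is not part of the original answer; it uses made-up data and a hypothetical column name `Z_range`, and assumes Spot_Z_Pos holds whole numbers from 1 to 30.

```r
library(ggplot2)

# Made-up stand-in for the asker's data (assumption: random values).
set.seed(1)
df <- data.frame(Spot_Z_Pos = sample(1:30, 90, replace = TRUE))
df[["Spot_IntMax/Bkg"]] <- runif(90, min = 8, max = 14)

# cut() uses right-closed intervals by default: (0,10], (10,20], (20,30].
# Values outside the breaks become NA instead of landing in a fourth group.
df$Z_range <- cut(df$Spot_Z_Pos,
                  breaks = c(0, 10, 20, 30),
                  labels = c("1-10", "11-20", "21-30"))

ggplot(df, aes(x = Z_range, y = `Spot_IntMax/Bkg`)) +
  geom_boxplot()
```

The result is a factor whose levels follow the order of `breaks`, so the boxes appear in range order without extra sorting.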

You can also filter for a specific group before building the plot and omit the facet_wrap line. This produces a single plot for that one group.
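
As a rough illustration of the filtering approach, the sketch below keeps only the 11-20 range and draws a single box. It filters on Spot_Z_Pos directly rather than on a GROUP column, uses made-up data, and the name `one_group` is illustrative.

```r
library(dplyr)
library(ggplot2)

# Made-up stand-in for the asker's data (assumption: random values).
set.seed(1)
df <- data.frame(Spot_Z_Pos = sample(1:30, 90, replace = TRUE))
df[["Spot_IntMax/Bkg"]] <- runif(90, min = 8, max = 14)

# Keep only the rows that fall in the 11-20 range.
one_group <- df %>%
  filter(Spot_Z_Pos >= 11, Spot_Z_Pos <= 20)

# A constant x value gives a single box for the filtered rows.
ggplot(one_group, aes(x = "11-20", y = `Spot_IntMax/Bkg`)) +
  geom_boxplot() +
  labs(x = "Spot_Z_Pos range")
```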
