
How to plot stacked bars within grouped bars within further grouped bars in a bar chart using Python (or R)

I have the following Pandas DataFrame that I would like to plot:

    Segment length     Parameter  Parameter value  Train score  Test score
0               16  n_estimators              5.0     0.975414    0.807823
1               16  n_estimators             10.0     0.982342    0.756803
2               16  n_estimators             15.0     1.000000    0.801020
3               16     max_depth              2.0     0.580884    0.284014
4               16     max_depth              6.0     1.000000    0.824830
5               16     max_depth             10.0     1.000000    0.824830
6               16  max_features              0.1     1.000000    0.845238
7               16  max_features              0.3     1.000000    0.845238
8               16  max_features              0.5     1.000000    0.845238
9               32  n_estimators              5.0     0.961905    0.714286
10              32  n_estimators             10.0     0.988095    0.857143
11              32  n_estimators             15.0     1.000000    0.857143
12              32     max_depth              2.0     0.785714    0.571429
13              32     max_depth              6.0     1.000000    0.857143
14              32     max_depth             10.0     1.000000    0.857143
15              32  max_features              0.1     1.000000    0.904762
16              32  max_features              0.3     1.000000    0.904762
17              32  max_features              0.5     1.000000    0.857143

The plot I imagine is a grouped bar chart with groups by 'Segment length', each containing further groups by 'Parameter', each containing further groups by 'Parameter value', each containing two bars for 'Train score' and 'Test score' (either side by side or stacked). Now that's a handful, but it works on paper. I've been trying to get this to work in Matplotlib (or R) all day without success. Does anybody have a suggestion on how to get this to work?

(NB: in the above dataframe I have two 'Segment length' groups and only three 'Parameter value' groups per parameter; eventually this will be 6 groups and 10 or so groups, respectively.)
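
One way to get this nesting in Matplotlib is to give each 'Segment length' its own subplot, lay the parameter values out along the x-axis in one block per parameter, and draw the train and test bars side by side at each value. The following is only a sketch under those assumptions; the function name `plot_nested_bars`, the colours, and the spacing values are illustrative choices, not something from the original post.

```python
import matplotlib.pyplot as plt
import pandas as pd
from matplotlib.patches import Patch

# The data from the question.
df = pd.DataFrame({
    "Segment length": [16] * 9 + [32] * 9,
    "Parameter": (["n_estimators"] * 3 + ["max_depth"] * 3
                  + ["max_features"] * 3) * 2,
    "Parameter value": [5, 10, 15, 2, 6, 10, 0.1, 0.3, 0.5] * 2,
    "Train score": [0.975414, 0.982342, 1.0, 0.580884, 1.0, 1.0, 1.0, 1.0, 1.0,
                    0.961905, 0.988095, 1.0, 0.785714, 1.0, 1.0, 1.0, 1.0, 1.0],
    "Test score": [0.807823, 0.756803, 0.801020, 0.284014, 0.824830, 0.824830,
                   0.845238, 0.845238, 0.845238, 0.714286, 0.857143, 0.857143,
                   0.571429, 0.857143, 0.857143, 0.904762, 0.904762, 0.857143],
})


def plot_nested_bars(df, bar_width=0.4, group_gap=1.0):
    """One subplot per segment length; x-axis blocks per parameter;
    a train/test bar pair per parameter value. (Hypothetical helper.)"""
    segments = sorted(df["Segment length"].unique())
    fig, axes = plt.subplots(1, len(segments), sharey=True, squeeze=False,
                             figsize=(6 * len(segments), 4))
    for ax, seg in zip(axes[0], segments):
        sub = df[df["Segment length"] == seg]
        x = 0.0
        ticks, labels = [], []
        # sort=False keeps the parameters in their order of appearance
        for param, grp in sub.groupby("Parameter", sort=False):
            start = x
            for _, row in grp.iterrows():
                ax.bar(x - bar_width / 2, row["Train score"], bar_width,
                       color="tab:blue")
                ax.bar(x + bar_width / 2, row["Test score"], bar_width,
                       color="tab:orange")
                ticks.append(x)
                labels.append(f"{row['Parameter value']:g}")
                x += 1.0
            # Parameter name centred below its block of values
            # (x in data coordinates, y in axes coordinates).
            ax.text((start + x - 1.0) / 2, -0.12, param, ha="center", va="top",
                    transform=ax.get_xaxis_transform())
            x += group_gap  # empty space between parameter blocks
        ax.set_xticks(ticks)
        ax.set_xticklabels(labels)
        ax.set_title(f"Segment length = {seg}")
    axes[0][0].set_ylabel("Score")
    handles = [Patch(color="tab:blue", label="Train score"),
               Patch(color="tab:orange", label="Test score")]
    fig.legend(handles=handles, loc="upper right")
    fig.tight_layout()
    return fig, axes


fig, axes = plot_nested_bars(df)
# fig.savefig("nested_bars.png")  # or plt.show() in an interactive session
```

With 6 segment lengths and around 10 values per parameter, a single row of subplots may get cramped; arranging the subplots in a grid would be one option to consider.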

Here is a suggestion using R. We can change how the grouping levels are shown, e.g. by using fill for one level and faceting for another.

What we do here:

  1. Bring the scores into long format
  2. Group by score type, segment length and parameter, and calculate the mean and sd
  3. Plot with ggplot
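# Assumes df has columns named Segment_length, Parameter, Train_score and
# Test_score (i.e. the spaces in the original column names replaced by "_").
library(tidyverse)
library(ggsci)
df %>%
  # 1. long format: one row per (observation, score type)
  pivot_longer(ends_with("score")) %>%
  # 2. mean and sd over the parameter values in each group
  group_by(name, Segment_length, Parameter) %>%
  summarise(mean_value = mean(value), sd_value = sd(value), .groups = "drop") %>%
  # 3. dodged bars filled by segment length, one facet per parameter
  ggplot(aes(x = name, y = mean_value, fill = factor(Segment_length))) +
  geom_bar(stat = "identity", position = "dodge") +
  facet_wrap(. ~ Parameter) +
  geom_errorbar(mapping = aes(ymin = mean_value - sd_value, ymax = mean_value + sd_value),
                width = 0.2, position = position_dodge(width = 0.9)) +
  theme_classic() +
  scale_fill_nejm() +
  labs(x = "Test/Train", y = "Score", fill = "Segment Length")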
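
For readers working in Python, the reshaping and summary steps can be approximated in pandas. This is a rough equivalent rather than an exact translation of the R pipeline; it assumes the columns have been renamed with underscores (`Segment_length`, `Train_score`, and so on) as the R code does, and the names `long_df` and `summary` are made up for illustration.

```python
import pandas as pd

# The data from the question, with underscores in the column names (assumed).
df = pd.DataFrame({
    "Segment_length": [16] * 9 + [32] * 9,
    "Parameter": (["n_estimators"] * 3 + ["max_depth"] * 3
                  + ["max_features"] * 3) * 2,
    "Parameter_value": [5, 10, 15, 2, 6, 10, 0.1, 0.3, 0.5] * 2,
    "Train_score": [0.975414, 0.982342, 1.0, 0.580884, 1.0, 1.0, 1.0, 1.0, 1.0,
                    0.961905, 0.988095, 1.0, 0.785714, 1.0, 1.0, 1.0, 1.0, 1.0],
    "Test_score": [0.807823, 0.756803, 0.801020, 0.284014, 0.824830, 0.824830,
                   0.845238, 0.845238, 0.845238, 0.714286, 0.857143, 0.857143,
                   0.571429, 0.857143, 0.857143, 0.904762, 0.904762, 0.857143],
})

# Step 1: long format, one row per (observation, score type).
long_df = df.melt(
    id_vars=["Segment_length", "Parameter", "Parameter_value"],
    value_vars=["Train_score", "Test_score"],
    var_name="name",
    value_name="value",
)

# Step 2: mean and sample standard deviation over the parameter values.
# pandas' "std" uses ddof=1 by default, which matches R's sd().
summary = (
    long_df.groupby(["name", "Segment_length", "Parameter"])["value"]
    .agg(mean_value="mean", sd_value="std")
    .reset_index()
)

print(summary)
```

Note that this summary, like the R version, averages over the individual parameter values, so the resulting plot shows one bar per parameter rather than one per value.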

