简体   繁体   中英

How can I create a Stacked Bar plot in Python where the y axis is NOT based on counts

I have the following Pandas DataFrame (abbreviated here):

df = pd.DataFrame([
("Distal Lung AT2", 0.4269588779192778, 20),
("Lung Ciliated epithelial cells", 0.28642167657082035, 20),
("Distal Lung AT2",0.4488207834077291,15), 
("Lung Ciliated epithelial cells", 0.27546336897259094, 15),
("Distal Lung AT2", 0.45502553604960105, 10),
("Lung Ciliated epithelial cells", 0.29080413886147555, 10),
("Distal Lung AT2", 0.48481604554028446, 5),
("Lung Ciliated epithelial cells", 0.3178232409599174, 5)],
 columns = ["features", "importance", "num_features"])

I'd like to create a stacked bar plot where the x-axis represents the num_features (so rows with the same num_features should be grouped together), the y axis represents importance , and each bar in the bar plot has blocks colored by features

I tried using plotnine for this, as follows:

plot = (
        ggplot(df, aes(x="num_features", y="importance", fill="features"))
              + geom_bar(stat="identity")
              + xlab("Number of Features")
              + ylab("")
        )

However, when I try to save the plot so I can view it ggsave(plot, os.path.join(figure_path, "stacked_feature_importances.png")) , I get:

Traceback (most recent call last):
  File "/home/mdanb/plot_top_features_iteratively.py", line 94, in <module>
    plot_stacked_bar_plots(backwards_elim_dirs)
  File "/home/mdanb/plot_top_features_iteratively.py", line 87, in plot_stacked_bar_plots
    ggsave(plot, os.path.join(figure_path, "stacked_feature_importances.png"))
  File "/home/mdanb/.local/lib/python3.8/site-packages/plotnine/ggplot.py", line 736, in ggsave
    return plot.save(*arg, **kwargs)
  File "/home/mdanb/.local/lib/python3.8/site-packages/plotnine/ggplot.py", line 724, in save
    fig, p = self.draw(return_ggplot=True)
  File "/home/mdanb/.local/lib/python3.8/site-packages/plotnine/ggplot.py", line 203, in draw
    self._build()
  File "/home/mdanb/.local/lib/python3.8/site-packages/plotnine/ggplot.py", line 311, in _build
    layers.compute_position(layout)
  File "/home/mdanb/.local/lib/python3.8/site-packages/plotnine/layer.py", line 79, in compute_position
    l.compute_position(layout)
  File "/home/mdanb/.local/lib/python3.8/site-packages/plotnine/layer.py", line 393, in compute_position
    data = self.position.compute_layer(data, params, layout)
  File "/home/mdanb/.local/lib/python3.8/site-packages/plotnine/positions/position.py", line 56, in compute_layer
    return groupby_apply(data, 'PANEL', fn)
  File "/home/mdanb/.local/lib/python3.8/site-packages/plotnine/utils.py", line 638, in groupby_apply
    lst.append(func(d, *args, **kwargs))
  File "/home/mdanb/.local/lib/python3.8/site-packages/plotnine/positions/position.py", line 54, in fn
    return cls.compute_panel(pdata, scales, params)
  File "/home/mdanb/.local/lib/python3.8/site-packages/plotnine/positions/position_stack.py", line 85, in compute_panel
    trans = scales.y.trans
AttributeError: 'scale_y_discrete' object has no attribute 'trans'

I also looked into trying directly to use Pandas without plotnine , based on this post. However, it doesn't quite address my issue because the bar plot is stacked based on counts, whereas I specifically want to stack it based on values of a column ( importance )

The problem is you are using geom_bar , which doesn't expect a y aesthetic, it automatically computes the counts for you based on the x aesthetic you specify.

If you want to specify manually the y , you should use geom_col , which expects both an x and y aesthetic. The default behaviour if you include a fill aesthetic will be to stack the columns, which you could change by specifying position='dodge' .

Using your example:

import plotnine as p9

(p9.ggplot(df)
 + p9.aes(x='num_features', y='importance', fill='features')
 + p9.geom_col())

Output

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM