How can I create a stacked bar plot in Python where the y axis is NOT based on counts?

I have the following Pandas DataFrame (abbreviated here):

import pandas as pd

df = pd.DataFrame(
    [
        ("Distal Lung AT2", 0.4269588779192778, 20),
        ("Lung Ciliated epithelial cells", 0.28642167657082035, 20),
        ("Distal Lung AT2", 0.4488207834077291, 15),
        ("Lung Ciliated epithelial cells", 0.27546336897259094, 15),
        ("Distal Lung AT2", 0.45502553604960105, 10),
        ("Lung Ciliated epithelial cells", 0.29080413886147555, 10),
        ("Distal Lung AT2", 0.48481604554028446, 5),
        ("Lung Ciliated epithelial cells", 0.3178232409599174, 5),
    ],
    columns=["features", "importance", "num_features"],
)

I'd like to create a stacked bar plot where the x axis represents num_features (so rows with the same num_features should be grouped together), the y axis represents importance, and each bar has blocks colored by features.

I tried using plotnine for this, as follows:

from plotnine import ggplot, aes, geom_bar, xlab, ylab

plot = (
    ggplot(df, aes(x="num_features", y="importance", fill="features"))
    + geom_bar(stat="identity")
    + xlab("Number of Features")
    + ylab("")
)

However, when I try to save the plot so I can view it, with ggsave(plot, os.path.join(figure_path, "stacked_feature_importances.png")), I get:

Traceback (most recent call last):
  File "/home/mdanb/plot_top_features_iteratively.py", line 94, in <module>
    plot_stacked_bar_plots(backwards_elim_dirs)
  File "/home/mdanb/plot_top_features_iteratively.py", line 87, in plot_stacked_bar_plots
    ggsave(plot, os.path.join(figure_path, "stacked_feature_importances.png"))
  File "/home/mdanb/.local/lib/python3.8/site-packages/plotnine/ggplot.py", line 736, in ggsave
    return plot.save(*arg, **kwargs)
  File "/home/mdanb/.local/lib/python3.8/site-packages/plotnine/ggplot.py", line 724, in save
    fig, p = self.draw(return_ggplot=True)
  File "/home/mdanb/.local/lib/python3.8/site-packages/plotnine/ggplot.py", line 203, in draw
    self._build()
  File "/home/mdanb/.local/lib/python3.8/site-packages/plotnine/ggplot.py", line 311, in _build
    layers.compute_position(layout)
  File "/home/mdanb/.local/lib/python3.8/site-packages/plotnine/layer.py", line 79, in compute_position
    l.compute_position(layout)
  File "/home/mdanb/.local/lib/python3.8/site-packages/plotnine/layer.py", line 393, in compute_position
    data = self.position.compute_layer(data, params, layout)
  File "/home/mdanb/.local/lib/python3.8/site-packages/plotnine/positions/position.py", line 56, in compute_layer
    return groupby_apply(data, 'PANEL', fn)
  File "/home/mdanb/.local/lib/python3.8/site-packages/plotnine/utils.py", line 638, in groupby_apply
    lst.append(func(d, *args, **kwargs))
  File "/home/mdanb/.local/lib/python3.8/site-packages/plotnine/positions/position.py", line 54, in fn
    return cls.compute_panel(pdata, scales, params)
  File "/home/mdanb/.local/lib/python3.8/site-packages/plotnine/positions/position_stack.py", line 85, in compute_panel
    trans = scales.y.trans
AttributeError: 'scale_y_discrete' object has no attribute 'trans'

I also looked into using Pandas directly, without plotnine, based on this post. However, it doesn't quite address my issue, because the bar plot there is stacked based on counts, whereas I specifically want to stack it based on the values of a column (importance).

The problem is that you are using geom_bar, which by default doesn't expect a y aesthetic: it computes the counts for you from the x aesthetic you specify.
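Since the question's code already passes stat="identity", it may also be worth checking the dtype of the column. The traceback names scale_y_discrete, which suggests plotnine treated importance as a discrete variable. One possible cause (an assumption, as the full data is not shown) is that the column holds strings rather than floats in the real DataFrame. A minimal sketch of how that could be checked and coerced, where df_raw and its string values are hypothetical stand-ins:

```python
import pandas as pd

# Hypothetical stand-in for the real data: importance stored as strings,
# e.g. after reading the values back from a text file.
df_raw = pd.DataFrame(
    [
        ("Distal Lung AT2", "0.4269588779192778", 20),
        ("Lung Ciliated epithelial cells", "0.28642167657082035", 20),
    ],
    columns=["features", "importance", "num_features"],
)

# A non-numeric dtype for importance would be consistent with a discrete y scale.
print(df_raw.dtypes)

# Coerce to float; values that cannot be parsed become NaN instead of raising.
df_fixed = df_raw.assign(
    importance=pd.to_numeric(df_raw["importance"], errors="coerce")
)
print(df_fixed.dtypes)
```

If importance is already float64 in your data, this step changes nothing and the cause lies elsewhere.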

If you want to specify y manually, you should use geom_col, which expects both an x and a y aesthetic. If you include a fill aesthetic, the default behaviour is to stack the columns, which you can change by specifying position='dodge'.

Using your example:

import plotnine as p9

(p9.ggplot(df)
 + p9.aes(x='num_features', y='importance', fill='features')
 + p9.geom_col())
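For the Pandas-only route mentioned in the question, one way to stack by values rather than counts is to pivot the data to wide form first. This is a sketch, not something from the original post: the name wide, the axis labels, and the Agg backend are assumptions made so the example runs without a display.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend, so no display is needed

import pandas as pd

df = pd.DataFrame(
    [
        ("Distal Lung AT2", 0.4269588779192778, 20),
        ("Lung Ciliated epithelial cells", 0.28642167657082035, 20),
        ("Distal Lung AT2", 0.4488207834077291, 15),
        ("Lung Ciliated epithelial cells", 0.27546336897259094, 15),
        ("Distal Lung AT2", 0.45502553604960105, 10),
        ("Lung Ciliated epithelial cells", 0.29080413886147555, 10),
        ("Distal Lung AT2", 0.48481604554028446, 5),
        ("Lung Ciliated epithelial cells", 0.3178232409599174, 5),
    ],
    columns=["features", "importance", "num_features"],
)

# One row per num_features, one column per feature; cells hold importance.
wide = df.pivot(index="num_features", columns="features", values="importance")

# Each row becomes one bar; each column becomes one colored block in it.
ax = wide.plot.bar(stacked=True)
ax.set_xlabel("Number of Features")
ax.set_ylabel("Importance")
```

Calling ax.figure.savefig with a path of your choice would then write the figure to disk. Note that pivot requires each (num_features, features) pair to appear only once, which holds for the data shown.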

