簡體   English   中英

如何在 y 軸不基於計數的 Python 中創建堆疊條形圖 plot

[英]How can I create a Stacked Bar plot in Python where the y axis is NOT based on counts

我有以下Pandas DataFrame(這里簡稱):

df = pd.DataFrame([
("Distal Lung AT2", 0.4269588779192778, 20),
("Lung Ciliated epithelial cells", 0.28642167657082035, 20),
("Distal Lung AT2",0.4488207834077291,15), 
("Lung Ciliated epithelial cells", 0.27546336897259094, 15),
("Distal Lung AT2", 0.45502553604960105, 10),
("Lung Ciliated epithelial cells", 0.29080413886147555, 10),
("Distal Lung AT2", 0.48481604554028446, 5),
("Lung Ciliated epithelial cells", 0.3178232409599174, 5)],
 columns = ["features", "importance", "num_features"])

我想創建一個堆疊條 plot ,其中 x 軸表示num_features (因此具有相同num_features的行應該組合在一起),y 軸表示importance ,並且條形 plot 中的每個條都有按features着色的塊

我為此嘗試使用plotnine ,如下所示:

plot = (
        ggplot(df, aes(x="num_features", y="importance", fill="features"))
              + geom_bar(stat="identity")
              + xlab("Number of Features")
              + ylab("")
        )

但是,當我嘗試保存 plot 以便查看它ggsave(plot, os.path.join(figure_path, "stacked_feature_importances.png"))時,我得到:

Traceback (most recent call last):
  File "/home/mdanb/plot_top_features_iteratively.py", line 94, in <module>
    plot_stacked_bar_plots(backwards_elim_dirs)
  File "/home/mdanb/plot_top_features_iteratively.py", line 87, in plot_stacked_bar_plots
    ggsave(plot, os.path.join(figure_path, "stacked_feature_importances.png"))
  File "/home/mdanb/.local/lib/python3.8/site-packages/plotnine/ggplot.py", line 736, in ggsave
    return plot.save(*arg, **kwargs)
  File "/home/mdanb/.local/lib/python3.8/site-packages/plotnine/ggplot.py", line 724, in save
    fig, p = self.draw(return_ggplot=True)
  File "/home/mdanb/.local/lib/python3.8/site-packages/plotnine/ggplot.py", line 203, in draw
    self._build()
  File "/home/mdanb/.local/lib/python3.8/site-packages/plotnine/ggplot.py", line 311, in _build
    layers.compute_position(layout)
  File "/home/mdanb/.local/lib/python3.8/site-packages/plotnine/layer.py", line 79, in compute_position
    l.compute_position(layout)
  File "/home/mdanb/.local/lib/python3.8/site-packages/plotnine/layer.py", line 393, in compute_position
    data = self.position.compute_layer(data, params, layout)
  File "/home/mdanb/.local/lib/python3.8/site-packages/plotnine/positions/position.py", line 56, in compute_layer
    return groupby_apply(data, 'PANEL', fn)
  File "/home/mdanb/.local/lib/python3.8/site-packages/plotnine/utils.py", line 638, in groupby_apply
    lst.append(func(d, *args, **kwargs))
  File "/home/mdanb/.local/lib/python3.8/site-packages/plotnine/positions/position.py", line 54, in fn
    return cls.compute_panel(pdata, scales, params)
  File "/home/mdanb/.local/lib/python3.8/site-packages/plotnine/positions/position_stack.py", line 85, in compute_panel
    trans = scales.y.trans
AttributeError: 'scale_y_discrete' object has no attribute 'trans'

根據這篇文章,我還研究了直接使用Pandas而不使用plotnine 但是,它並沒有完全解決我的問題,因為條形 plot 是根據計數堆疊的,而我特別想根據列的值堆疊它( importance

問題是您正在使用geom_bar ,它不期望y美學,它會根據您指定的x美學自動為您計算計數。

如果要手動指定y ,則應使用geom_col ,它需要xy美學。 如果您包含fill美學,則默認行為將是堆疊列,您可以通過指定position='dodge'進行更改。

使用您的示例:

import plotnine as p9

(p9.ggplot(df)
 + p9.aes(x='num_features', y='importance', fill='features')
 + p9.geom_col())

Output

暫無
暫無

聲明:本站的技術帖子網頁,遵循CC BY-SA 4.0協議,如果您需要轉載,請注明本站網址或者原文地址。任何問題請咨詢:yoyou2525@163.com.

 
粵ICP備18138465號  © 2020-2024 STACKOOM.COM