Adding counts to Plotly boxplots

I have a relatively simple issue, but cannot find any answer online that addresses it. Starting from a simple boxplot:

import plotly.express as px
 
df = px.data.iris()

fig = px.box(
    df, x='species', y='sepal_length'
)

val_counts = df['species'].value_counts()

I would now like to add val_counts (in this dataset, 50 for each species) to the plot, preferably in one of the following places:

  • On top of the median line
  • On top of the max/min line
  • Inside the hover box

How can I achieve this?

Using the same approach that I presented in this answer, Change Plotly Boxplot Hover Data:

  • calculate all the measures a box plot calculates, plus the additional measure you want: count
  • overlay bar traces over the box plot traces so that the hover shows all the required measures
import plotly.express as px

df = px.data.iris()

# summarize data as per same dimensions as boxplot
df2 = df.groupby("species").agg(
    **{
        m
        if isinstance(m, str)
        else m[0]: ("sepal_length", m if isinstance(m, str) else m[1])
        for m in [
            "max",
            ("q75", lambda s: s.quantile(0.75)),
            "median",
            ("q25", lambda s: s.quantile(0.25)),
            "min",
            "count",
        ]
    }
).reset_index().assign(y=lambda d: d["max"] - d["min"])

# overlay bar over boxplot
px.bar(
    df2,
    x="species",
    y="y",
    base="min",
    hover_data={c: c not in ["y", "species"] for c in df2.columns},
    hover_name="species",
).update_traces(opacity=0.1).add_traces(px.box(df, x="species", y="sepal_length").data)

The snippet below adds the count (50 for every unique value of df['species']) on top of the max line using fig.add_annotation:

for s in df.species.unique():
    fig.add_annotation(x=s,
                       y = df[df['species']==s]['sepal_length'].max(),
                       text = str(len(df[df['species']==s]['species'])),
                       yshift = 10,
                       showarrow = False
                      )

Complete code:

import plotly.express as px
 
df = px.data.iris()

fig = px.box(
    df, x='species', y='sepal_length'
)

for s in df.species.unique():
    fig.add_annotation(x=s,
                       y = df[df['species']==s]['sepal_length'].max(),
                       text = str(len(df[df['species']==s]['species'])),
                       yshift = 10,
                       showarrow = False
                      )
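# Optional: materialize Plotly's computed defaults for inspection (requires the kaleido package); not needed to render the plot
f = fig.full_figure_for_development(warn=False)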
fig.show()
