
Appending sample number to X-Labels in altair

I would like to automatically append the sample size (in parentheses) to each x-label of an Altair figure. I am open to doing this outside of Altair, but I thought there might be a way to do it at the figure level using Altair/Vega-Lite. The code below is based on an example from the Altair/Vega website (the data comes from vega_datasets), but it uses a hacky, manual method that renames one of the labels explicitly. In this case, I have added the sample size of 73 to Europe.

Link to data

import altair as alt
from vega_datasets import data

df = data.cars()
df['Origin'] = df['Origin'].replace({'Europe':'Europe (n=73)'})

alt.Chart(df).transform_density(
    'Miles_per_Gallon',
    as_=['Miles_per_Gallon', 'density'],
    extent=[5, 50],
    groupby=['Origin']
).mark_area(orient='horizontal').encode(
    y='Miles_per_Gallon:Q',
    color='Origin:N',
    x=alt.X(
        'density:Q',
        stack='center',
        impute=None,
        title=None,
        axis=alt.Axis(labels=False, values=[0],grid=False, ticks=True),
    ),
    column=alt.Column(
        'Origin:N',
        header=alt.Header(
            titleOrient='bottom',
            labelOrient='bottom',
            labelPadding=0,
        ),
    )
).properties(
    width=100
).configure_facet(
    spacing=0
).configure_view(
    stroke=None
)


You could use pandas to generate the replacement dictionary and assign it to a new dataframe column:

import altair as alt
from vega_datasets import data

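df = data.cars()
# Number of rows per origin, indexed by origin name.
group_sizes = df.groupby('Origin').size()
# Despite the name, this is a Series mapping e.g. 'Europe' -> 'Europe (n=73)'.
replace_dict = group_sizes.index + ' (n=' + group_sizes.astype(str) + ')'
df['Origin_with_count'] = df['Origin'].replace(replace_dict)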

alt.Chart(df).transform_density(
    'Miles_per_Gallon',
    as_=['Miles_per_Gallon', 'density'],
    extent=[5, 50],
    groupby=['Origin_with_count', 'Origin']
).mark_area(orient='horizontal').encode(
    y='Miles_per_Gallon:Q',
    color='Origin:N',
    x=alt.X(
        'density:Q',
        stack='center',
        impute=None,
        title=None,
        axis=alt.Axis(labels=False, values=[0],grid=False, ticks=True),
    ),
    column=alt.Column(
        'Origin_with_count:N',
        header=alt.Header(
            title=None,
            labelOrient='bottom',
            labelPadding=0,
        ),
    )
).properties(
    width=100
).configure_facet(
    spacing=0
).configure_view(
    stroke=None
)
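
The label-building step at the top of this answer can also be sketched without pandas. The snippet below is only an illustration: `labels_with_counts` is a hypothetical helper name, and the `origins` list is made-up sample data standing in for `df['Origin']`.

```python
from collections import Counter


def labels_with_counts(values):
    """Map each distinct value to 'value (n=count)'."""
    counts = Counter(values)
    return {value: f"{value} (n={count})" for value, count in counts.items()}


# Hypothetical sample data standing in for df['Origin'].
origins = ["USA", "Europe", "USA", "Japan", "USA", "Europe"]
mapping = labels_with_counts(origins)
print(mapping)
```

With the cars data, something like `df['Origin'].map(labels_with_counts(df['Origin']))` should give the same column as the `replace` call, since the result here is a plain dict.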

You might be able to do something more elegant with labelExpr, but I am not sure.


You could overlay a text mark with the count instead.
I was able to do this with the following code. However, I was not able to control the y position of the text (see the commented-out line), nor could I use the n datum in the header labelExpr for some reason.

import altair as alt
from vega_datasets import data

df = data.cars()

violin = alt.Chart(df).transform_density(
    'Miles_per_Gallon',
    as_=['Miles_per_Gallon', 'density'],
    extent=[5, 50],
    groupby=['Origin']
).mark_area(orient='horizontal').encode(
    y='Miles_per_Gallon:Q',
    color='Origin:N',
    x=alt.X(
        'density:Q',
        stack='center',
        impute=None,
        title=None,
        axis=alt.Axis(labels=False, values=[0],grid=False, ticks=True),
    ),
).properties(width=100)

text = alt.Chart(df).mark_text().transform_aggregate(
    cnt='count()',
    groupby=["Origin"]
).transform_calculate(
    n="'n=' + datum.cnt",
).encode(
    # y=alt.Y('mean(Miles_per_Gallon):Q'),
    text=alt.Text('n:N'),
)

(violin + text).facet(
    column=alt.Column('Origin:N'),
).configure_header(
    labelExpr="[datum.value, datum.n]",
)


