How to show a histogram of percentages instead of counts using Altair

How do I get a histogram of percentages of total instead of a histogram of count using Altair and Pandas?

I have this at the moment:

[Image: histogram of values]

Which I got by doing this:

import pandas as pd
import altair as alt

d = {'age': ['12', '32', '43', '54', '32', '32', '12']}
dfTest = pd.DataFrame(data=d)

alt.Chart(dfTest).mark_bar().encode(
    alt.X("age:Q", bin=True),
    y='count()',
)

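You can do this with a join aggregate transform followed by a calculate transform. The join aggregate attaches the total row count to every row, the calculate gives each row a weight of 1 / total, and summing those weights within each bin yields that bin's share of the total: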

import pandas as pd
import altair as alt

source = pd.DataFrame({'age': ['12', '32', '43', '54', '32', '32', '12']})

alt.Chart(source).transform_joinaggregate(
    total='count(*)'
).transform_calculate(
    pct='1 / datum.total'
).mark_bar().encode(
    alt.X('age:Q', bin=True),
    alt.Y('sum(pct):Q', axis=alt.Axis(format='%'))
)
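To see the arithmetic behind that chart without rendering it, here is a minimal plain-Python sketch of the same idea. The bin width of 10 is an assumption made for illustration only; Altair's automatic binning may choose different bin edges, so the per-bin numbers in the real chart can differ.

```python
from collections import defaultdict

ages = [12, 32, 43, 54, 32, 32, 12]
bin_width = 10  # assumed for illustration; not necessarily what Altair picks

# Mirrors transform_joinaggregate(total='count(*)'): one total for all rows
total = len(ages)

# Mirrors pct = 1 / datum.total followed by sum(pct) within each bin
pct_by_bin = defaultdict(float)
for age in ages:
    bin_start = (age // bin_width) * bin_width
    pct_by_bin[bin_start] += 1 / total

for start in sorted(pct_by_bin):
    print(f"{start}-{start + bin_width}: {pct_by_bin[start]:.1%}")
```

Because every row contributes exactly 1 / total, the bar heights always sum to 100% regardless of how the bins are chosen.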


Edit: this was my initial answer, which is much more complicated:

It's not entirely straightforward, because it requires manually specifying the bin and aggregate transforms currently implied by your encoding, followed by a calculate transform to compute the percentages. Here is an example:

import pandas as pd
import altair as alt

source = pd.DataFrame({'age': ['12', '32', '43', '54', '32', '32', '12']})

alt.Chart(source).transform_bin(
    ['age_min', 'age_max'],  # output fields for each bin's start and end
    field='age',
).transform_aggregate(
    count='count()',  # rows per bin
    groupby=['age_min', 'age_max']
).transform_joinaggregate(
    total='sum(count)'  # total rows, attached to every bin
).transform_calculate(
    pct='datum.count / datum.total'  # each bin's share of the total
).mark_bar().encode(
    alt.X("age_min:Q", bin='binned'),
    x2='age_max',
    y=alt.Y('pct:Q', axis=alt.Axis(format='%'))
)
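For comparison, the same bin, aggregate, join aggregate, calculate sequence can be mirrored step by step in pandas, which is handy for checking the numbers the chart should display. This is only a sketch: the fixed bin width of 10 is an assumption, and the column names simply reuse those from the chart above.

```python
import pandas as pd

source = pd.DataFrame({'age': ['12', '32', '43', '54', '32', '32', '12']})
ages = pd.to_numeric(source['age'])  # the ages are stored as strings

# bin: assumed fixed-width bins; Altair may choose different edges
bin_width = 10
binned = pd.DataFrame({'age_min': (ages // bin_width) * bin_width})
binned['age_max'] = binned['age_min'] + bin_width

# aggregate: count rows per bin
counts = binned.groupby(['age_min', 'age_max']).size().reset_index(name='count')

# joinaggregate: attach the grand total to every bin row
counts['total'] = counts['count'].sum()

# calculate: each bin's share of the total
counts['pct'] = counts['count'] / counts['total']

print(counts)
```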

I'm hoping that we'll be able to streamline the transform API in the future.

The technical posts on this site are licensed under CC BY-SA 4.0. If you reprint them, please indicate the site URL or the original address.
