
Altair bar chart with bars of variable width?

I'm trying to use Altair in Python to make a bar chart where the bars have varying width depending on the data in a column of the source dataframe. The ultimate goal is to get a chart like this one:

[Image: bar chart with variable-width bars]

The height of each bar corresponds to the marginal cost of an energy technology (given as a column in the source dataframe). The bar width corresponds to the capacity of that energy technology (also given as a column in the source dataframe). Colors are ordinal data, also from the source dataframe. The bars are sorted in increasing order of marginal cost. (A plot like this is called a "generation stack" in the energy industry.) This is easy to achieve in matplotlib, as shown in the code below:

import matplotlib.pyplot as plt 

# Make fake dataset
height = [3, 12, 5, 18, 45]
bars = ('A', 'B', 'C', 'D', 'E')

# Choose the width of each bar and their positions
width = [0.1,0.2,3,1.5,0.3]
y_pos = [0,0.3,2,4.5,5.5]

# Make the plot
plt.bar(y_pos, height, width=width)
plt.xticks(y_pos, bars)
plt.show()

(code from https://python-graph-gallery.com/5-control-width-and-space-in-barplots/ )

But is there a way to do this with Altair? I would like to use Altair so that I keep its other great features, such as tooltips and selectors/bindings, since I have lots of other data I want to show alongside the bar chart.

The first 20 rows of my source data look like this:

[Image: first 20 rows of the source dataframe]

(The data does not exactly match the chart shown above.)

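In Altair, the way to do this is to use the rect mark and construct the bars explicitly: sort the rows by marginal cost, then compute each bar's left edge (x0) and right edge (x1) from the cumulative sum of capacity. Here is an example that mimics your data: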

import altair as alt
import pandas as pd
import numpy as np

np.random.seed(0)

df = pd.DataFrame({
    'MarginalCost': 100 * np.random.rand(30),
    'Capacity': 10 * np.random.rand(30),
    'Technology': np.random.choice(['SOLAR', 'THERMAL', 'WIND', 'GAS'], 30)
})

df = df.sort_values('MarginalCost')
df['x1'] = df['Capacity'].cumsum()
df['x0'] = df['x1'].shift(fill_value=0)

alt.Chart(df).mark_rect().encode(
    x=alt.X('x0:Q', title='Capacity'),
    x2='x1',
    y=alt.Y('MarginalCost:Q', title='Marginal Cost'),
    color='Technology:N',
    tooltip=["Technology", "Capacity", "MarginalCost"]
)

[Image: the resulting chart]

To get the same result without preprocessing the data in pandas, you can use Altair's transform syntax:

df = pd.DataFrame({
    'MarginalCost': 100 * np.random.rand(30),
    'Capacity': 10 * np.random.rand(30),
    'Technology': np.random.choice(['SOLAR', 'THERMAL', 'WIND', 'GAS'], 30)
})

alt.Chart(df).transform_window(
    x1='sum(Capacity)',
    sort=[alt.SortField('MarginalCost')]
).transform_calculate(
    x0='datum.x1 - datum.Capacity'
).mark_rect().encode(
    x=alt.X('x0:Q', title='Capacity'),
    x2='x1',
    y=alt.Y('MarginalCost:Q', title='Marginal Cost'),
    color='Technology:N',
    tooltip=["Technology", "Capacity", "MarginalCost"]
)
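
Both versions above rely on the same edge arithmetic: after sorting by marginal cost, each bar's right edge is the running total of capacity, and its left edge is the previous bar's right edge (equivalently, the right edge minus the bar's own capacity). As a rough sanity check, here is a minimal plain-Python sketch of that arithmetic. The `plants` list and the `bar_edges` helper are made up for illustration and are not part of Altair or pandas.

```python
from itertools import accumulate

# Hypothetical toy data: (technology, marginal cost, capacity)
plants = [
    ("GAS", 45.0, 3.0),
    ("WIND", 5.0, 2.0),
    ("SOLAR", 3.0, 1.5),
    ("THERMAL", 18.0, 4.0),
]


def bar_edges(rows):
    """Sort rows by marginal cost and return (tech, cost, x0, x1) tuples."""
    ordered = sorted(rows, key=lambda r: r[1])
    # Right edges: running total of capacity
    # (the role of cumsum() in pandas and of the window sum in Altair).
    x1 = list(accumulate(r[2] for r in ordered))
    # Left edges: the previous bar's right edge, starting at 0
    # (the role of shift(fill_value=0); same as x1 minus the bar's capacity).
    x0 = [0.0] + x1[:-1]
    return [(r[0], r[1], left, right) for r, left, right in zip(ordered, x0, x1)]


for tech, cost, left, right in bar_edges(plants):
    print(f"{tech:8s} cost={cost:5.1f}  x: {left:4.1f} -> {right:4.1f}")
```

Each bar starts where the previous one ends, so the bars tile the x-axis without gaps and the last right edge equals the total capacity.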

