I have a data series consisting of monthly sales for individual fiscal years, stored in a pandas dataframe. Each fiscal year starts on the first day of March and ends on the last day of February in the following year. I am using a plotly facet plot to show the months of the year vertically aligned, so that March 2021 is below March 2020, and so on.
Despite using a categorical variable for the x-axis, the ordering is slightly off. I have tried sorting using a 'yearmon' variable with unique values, but that doesn't work either. Specifically, in the plot below the values for Jan and Feb in 2018 are blank, and Jan and Feb 2021 are also out of place. How can I get the facet to show contiguous data without these problems?
Edit: I have a feeling it is related to the ordering of the categories, but haven't managed to pin it down yet.
import pandas as pd
import numpy as np
import plotly.express as px
import chart_studio.plotly as py
rng = np.random.default_rng(12345)
df = pd.DataFrame(rng.integers(80, 100, size=(36, 1)), columns=list('A'))
df.index = pd.date_range("2018-03-01", periods=36, freq="M")
df['year'] = df.index.strftime('%Y')
df['month'] = df.index.strftime('%b')
df['monthindex'] = df.index.strftime('%m')
df['yearmon'] = df['year']+df['monthindex']
month_categories = ['Mar','Apr','May','Jun','Jul','Aug','Sep','Oct','Nov','Dec','Jan','Feb']
df["month"] = pd.Categorical(df["month"], categories = month_categories)
df = df.sort_values(by = "yearmon")
fig = px.bar(df, x = 'month', y = 'A', facet_col='year', facet_col_wrap=1)
py.image.save_as(fig, 'plotly.png', width=1000, height=500)
UPDATE
Using @vestland's code below as a base, I have tweaked the start date and the fiscal year assignment as per my comment below, because fiscal years are often not aligned with the calendar year. Also, the length of the data series is arbitrary - it might be a few months, it might be a decade - and so are the start and end months. Finally, I would like the x-axis to begin and end with the first and last months of the fiscal year, so in this case (March and February) 'Mar' should be the first tick mark on the left, and 'Feb' the last one on the right. My apologies if this was not sufficiently clear.
import pandas as pd
import numpy as np
import plotly.express as px
import chart_studio.plotly as py
rng = np.random.default_rng(12345)
df = pd.DataFrame(rng.integers(80, 100, size=(36, 1)), columns=list('A'))
df.index = pd.date_range("2018-01-01", periods=36, freq="M")
df['year'] = df.index.strftime('%Y')
df['month'] = df.index.strftime('%b')
df['monthindex'] = df.index.strftime('%m')
df['yearmon'] = df['year']+df['monthindex']
month_categories = ['Mar','Apr','May','Jun','Jul','Aug','Sep','Oct','Nov','Dec','Jan','Feb']
df["month"] = pd.Categorical(df["month"], categories = month_categories)
df = df.sort_values(by = "yearmon")
df['fiscal_year'] = [2017]*2+[2018]*12+[2019]*12+[2020]*10
fig = px.bar(df, x = 'month', y = 'A', facet_col='fiscal_year', facet_col_wrap=1)
fig.show()
If I understand correctly, you seem to be doing everything right apart from one minor detail. That is a bit surprising, so there's a fair chance I've misunderstood the premise of your question. Anyway...
"Specifically, in the plot below the values for Jan and Feb in 2018 are blank"
That's because no such dates exist in df, as df.head() shows:
             A  year month monthindex yearmon
2018-03-31  93  2018   Mar         03  201803
2018-04-30  84  2018   Apr         04  201804
2018-05-31  95  2018   May         05  201805
2018-06-30  86  2018   Jun         06  201806
2018-07-31  84  2018   Jul         07  201807
And if I understand your intentions correctly, you would in fact like to associate January and February of 2019 with the first x-axis. Despite your thorough effort, no such association has been made. I'm not quite sure how you would do that, but if you make sure to set up something like this:
df['fiscal_year'] = [2018]*12+[2019]*12+[2020]*12
Then you can run
fig = px.bar(df, x = 'month', y = 'A', facet_col='fiscal_year',facet_col_wrap=1)
And get a plot in which January and February of 2019 now appear on the x-axis of 2018, and so on for the rest of the years. I hope this is what you were looking for. Don't hesitate to let me know if not.
import pandas as pd
import numpy as np
import plotly.express as px
import chart_studio.plotly as py
rng = np.random.default_rng(12345)
df = pd.DataFrame(rng.integers(80, 100, size=(36, 1)), columns=list('A'))
df.index = pd.date_range("2018-03-01", periods=36, freq="M")
df['year'] = df.index.strftime('%Y')
df['month'] = df.index.strftime('%b')
df['monthindex'] = df.index.strftime('%m')
df['yearmon'] = df['year']+df['monthindex']
month_categories = ['Mar','Apr','May','Jun','Jul','Aug','Sep','Oct','Nov','Dec','Jan','Feb']
df["month"] = pd.Categorical(df["month"], categories = month_categories)
df = df.sort_values(by = "yearmon")
df['fiscal_year'] = [2018]*12+[2019]*12+[2020]*12
fig = px.bar(df, x = 'month', y = 'A', facet_col='fiscal_year', facet_col_wrap=1)
fig.show()
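The hard-coded fiscal_year list above only fits a series of exactly this length and start date. As a minimal sketch, assuming a fiscal year is labelled by the calendar year in which it starts (which matches the lists used in this thread), the column could instead be derived from the date index. FISCAL_START_MONTH is a name introduced here for illustration, and the "MS" (month start) frequency is used only so the example runs on a wide range of pandas versions; the month of each row is the same as with month-end dates.

```python
import numpy as np
import pandas as pd

# Assumption: the fiscal year starts in March, as in the question.
FISCAL_START_MONTH = 3

rng = np.random.default_rng(12345)
df = pd.DataFrame(rng.integers(80, 100, size=(36, 1)), columns=list("A"))
# Series starting in January 2018, as in the question's update.
df.index = pd.date_range("2018-01-01", periods=36, freq="MS")

# Months before the fiscal start belong to the fiscal year that began
# in the previous calendar year; all other months keep their own year.
df["fiscal_year"] = np.where(
    df.index.month >= FISCAL_START_MONTH,
    df.index.year,
    df.index.year - 1,
)

print(df["fiscal_year"].value_counts().sort_index())
```

Because the assignment depends only on each row's date, it should work for a series of arbitrary length and arbitrary start and end months.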
The issue in this case appears to be that plotly does not respect the order of the categories in the pandas data series used for the x-axis unless specifically instructed to do so, as pointed out on the plotly forum and described in the plotly documentation. Using category_orders in the px.bar call allows us to override the default plotly assumption and create an x-axis that runs from the first month of the fiscal year specified to the last month of the fiscal year.
import pandas as pd
import numpy as np
import plotly.express as px
import chart_studio.plotly as py
rng = np.random.default_rng(12345)
df = pd.DataFrame(rng.integers(80, 100, size=(36, 1)), columns=list('A'))
df.index = pd.date_range("2018-01-01", periods=36, freq="M")
df['year'] = df.index.strftime('%Y')
df['month'] = df.index.strftime('%b')
df['monthindex'] = df.index.strftime('%m')
df['yearmon'] = df['year']+df['monthindex']
month_categories = ['Mar','Apr','May','Jun','Jul','Aug','Sep','Oct','Nov','Dec','Jan','Feb']
df["month"] = pd.Categorical(df["month"], categories = month_categories)
df = df.sort_values(by = "yearmon")
df['fiscal_year'] = [2017]*2+[2018]*12+[2019]*12+[2020]*10
fig = px.bar(df, x = 'month', y = 'A',
             facet_col='fiscal_year',
             facet_col_wrap=1,
             category_orders={ # replaces default order by column name
                 "month": ['Mar','Apr','May','Jun','Jul','Aug','Sep','Oct','Nov','Dec','Jan','Feb']
             })
fig.show()
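Since the first month of the fiscal year may differ between data sets, the month list does not have to be typed out by hand. The following is a rough sketch of one way to build it by rotating the calendar months; fiscal_month_order and CALENDAR_MONTHS are hypothetical names introduced here, not part of pandas or plotly, and the abbreviations are assumed to match what strftime('%b') produces in an English locale.

```python
# Calendar-ordered month abbreviations (assumed to match strftime('%b')
# output in an English locale).
CALENDAR_MONTHS = ['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun',
                   'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec']


def fiscal_month_order(start_month):
    """Return the month abbreviations rotated so that `start_month`
    (1-12) comes first and the month before it comes last."""
    if not 1 <= start_month <= 12:
        raise ValueError("start_month must be between 1 and 12")
    shift = start_month - 1
    return CALENDAR_MONTHS[shift:] + CALENDAR_MONTHS[:shift]


# Fiscal year starting in March, as in the question.
month_order = fiscal_month_order(3)
print(month_order)
```

The resulting list can then be passed as category_orders={"month": month_order} in the px.bar call above, and also as the categories argument of pd.Categorical.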