I have a feeling there is a very simple way of doing this. I'm trying to plot a timeline of tasks running in an environment, including two plots on the same diagram:
broken_barh
In the example there were 6 tasks running (A-F), for various lengths, with different start times. They are plotted exactly as I need (1), in a Gantt-like chart, with time on the X axis.
import numpy as np
import pandas as pd
# Jupyter magic; remove this line when running as a plain script
%matplotlib inline
import matplotlib as mpl
from matplotlib import pyplot as plt

cols = ['ID', 'From', 'To']
df = pd.DataFrame([['A', 736758.993, 736758.995], ['B', 736758.995, 736758.998],
                   ['C', 736758.994, 736758.996], ['D', 736758.996, 736758.997],
                   ['E', 736758.996, 736758.997], ['F', 736758.995, 736758.996]],
                  columns=cols)
df['Diff'] = df['To'] - df['From']

fig, ax = plt.subplots()
# one bar per task: x is (start, duration), y is centred on the row index
for i, row in df.iterrows():
    values = [(row['From'], row['Diff'])]
    ax.broken_barh(values, (i - 0.4, 0.8), color=np.random.rand(3))
ax.xaxis_date()
To this I would like to add (2) a curve showing the number of active tasks at each time (1 between 23:51 and 23:52, 2 between 23:52 and 23:53, etc., peaking around 23:54).
The problem is that I cannot just draw a histogram of the start times, since the different tasks overlap in time. Do you know a decent way to create such a histogram?
I am pretty sure there are cleaner ways to approach this. The float math problems in particular were pretty annoying when trying to create the histogram. The first part is a simple one-liner, though: as suggested, just use hlines and increase the linewidth to create your bar chart.
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.cm as cm

df = pd.DataFrame([['A', 736758.993, 736758.995], ['B', 736758.995, 736758.998],
                   ['C', 736758.994, 736758.996], ['D', 736758.994, 736758.997],
                   ['E', 736758.997, 736758.998], ['F', 736758.995, 736758.999]],
                  columns=['ID', 'From', 'To'])

# create two subplots with shared x axis
fig, (ax1, ax2) = plt.subplots(2, 1, sharex=True)

# plot 1 - Gantt chart for individual IDs
ax1.hlines(df.ID, df.From, df.To, colors=cm.inferno(df.index / len(df)), linewidth=20)

# plot 2 - make table of time series for each ID - multiply by 1000 to avoid float problems
hist_count = df.apply(lambda row: pd.Series(np.arange(1000 * row["From"], 1000 * row["To"])), axis=1)
hist_count = pd.melt(hist_count)["value"].dropna().astype(int)

# find borders for bins
min_time = hist_count.min(axis=0)
max_time = hist_count.max(axis=0)

# plot 2 histogram - add 0.0001 to prevent arbitrary binning due to float problems
ax2.hist(hist_count / 1000 + 0.0001, range=(min_time / 1000, (max_time + 1) / 1000), bins=max_time - min_time + 1)
ax2.xaxis_date()
plt.show()
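One possibly cleaner way to get the active-task count, sketched here as an alternative and not part of the code above, is to treat every start as a +1 event and every end as a -1 event, then take a running sum over the sorted event times. This avoids expanding each task into a series of timestamps and needs no binning. The helper name active_task_counts is made up for illustration, and the sketch assumes that boundaries meant to coincide are stored as exactly equal floats, which holds for the literals used here.

```python
def active_task_counts(intervals):
    """Return (times, counts); counts[i] tasks are active from times[i] up to the next time."""
    # Net change in the number of running tasks at each boundary:
    # +1 for every task that starts there, -1 for every task that ends there.
    events = {}
    for start, end in intervals:
        events[start] = events.get(start, 0) + 1
        events[end] = events.get(end, 0) - 1

    times = sorted(events)
    counts = []
    running = 0
    for t in times:
        running += events[t]  # running total of active tasks from time t onward
        counts.append(running)
    return times, counts


# (From, To) pairs of tasks A to F, copied from the DataFrame in the answer
intervals = [
    (736758.993, 736758.995), (736758.995, 736758.998),
    (736758.994, 736758.996), (736758.994, 736758.997),
    (736758.997, 736758.998), (736758.995, 736758.999),
]
times, counts = active_task_counts(intervals)
```

With the figure from the answer, the result could then be drawn on the second axis with ax2.step(times, counts, where="post") in place of the histogram, so that each count holds until the next boundary. For this data the count peaks at 4 active tasks.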