How to add error bars to a grouped bar plot?
I would like to add error bars to my plot so that I can show the min and max of each bar. Can anyone please help me? Thanks in advance.
The min and max values are as follows:
Delay = (53.46 (min 0, max 60), 36.22 (min 12, max 70), 83 (min 21, max 54), 17 (min 12, max 70))
Latency = (38 (min 2, max 70), 44 (min 12, max 87), 53 (min 9, max 60), 10 (min 11, max 77))
import matplotlib.pyplot as plt
import pandas as pd
from pandas import DataFrame
from matplotlib.dates import date2num
import datetime
Delay = (53.46, 36.22, 83, 17)
Latency = (38, 44, 53, 10)
index = ['T=0', 'T=26', 'T=50','T=900']
df = pd.DataFrame({'Delay': Delay, 'Latency': Latency}, index=index)
ax = df.plot.bar(rot=0)
plt.xlabel('Time')
plt.ylabel('(%)')
plt.ylim(0, 101)
plt.savefig('TestX.png', dpi=300, bbox_inches='tight')
plt.show()
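df.plot.bar returns a single matplotlib.axes.Axes here; with subplots=True it returns an ndarray with one matplotlib.axes.Axes per column.
In this figure, ax.patches contains 8 matplotlib.patches.Rectangle objects, one for each segment of each bar.
Each Rectangle provides its height, width, and x position, which are used to draw a line with plt.vlines.
The height is used to look up the correct min and max in the dict z.
The patches do not record which column they belong to (Delay & Latency).
import pandas as pd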
import matplotlib.pyplot as plt
# create dataframe
Delay = (53.46, 36.22, 83, 17)
Latency = (38, 44, 53, 10)
index = ['T=0', 'T=26', 'T=50','T=900']
df = pd.DataFrame({'Delay': Delay, 'Latency': Latency}, index=index)
# dicts with errors
Delay_error = {53.46: {'min': 0,'max': 60}, 36.22: {'min': 12,'max': 70}, 83: {'min': 21,'max': 54}, 17: {'min': 12,'max': 70}}
Latency_error = {38: {'min': 2, 'max': 70}, 44: {'min': 12,'max': 87}, 53: {'min': 9,'max': 60}, 10: {'min': 11,'max': 77}}
# combine them; providing all the keys are unique
z = {**Delay_error, **Latency_error}
# plot
ax = df.plot.bar(rot=0)
plt.xlabel('Time')
plt.ylabel('(%)')
plt.ylim(0, 101)
for p in ax.patches:
    x = p.get_x()  # get the bottom left x corner of the bar
    w = p.get_width()  # get width of bar
    h = p.get_height()  # get height of bar
    min_y = z[h]['min']  # use h to get min from dict z
    max_y = z[h]['max']  # use h to get max from dict z
    plt.vlines(x+w/2, min_y, max_y, color='k')  # draw a vertical line
If the two dicts contain non-unique values, so that they cannot be combined, the correct dict can be selected from the bar order instead.
Patches 0-3 are the Delay bars and 4-7 are the Latency bars.
for i, p in enumerate(ax.patches):
    print(i, p)
    x = p.get_x()
    w = p.get_width()
    h = p.get_height()
    if i < len(ax.patches)/2:  # select which dictionary to use
        d = Delay_error
    else:
        d = Latency_error
    min_y = d[h]['min']
    max_y = d[h]['max']
    plt.vlines(x+w/2, min_y, max_y, color='k')
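As an alternative to keying on bar height or on patch index, the bars can be walked one column at a time through ax.containers. The following is a minimal sketch, assuming ax.containers holds one BarContainer per plotted column, in column order. The errors dict and its layout are hypothetical names introduced for illustration, and the Agg backend is selected only so the script runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line for interactive use
import matplotlib.pyplot as plt
import pandas as pd

index = ['T=0', 'T=26', 'T=50', 'T=900']
df = pd.DataFrame({'Delay': (53.46, 36.22, 83, 17),
                   'Latency': (38, 44, 53, 10)}, index=index)

# hypothetical layout: one list of (min, max) pairs per column, in row order
errors = {
    'Delay': [(0, 60), (12, 70), (21, 54), (12, 70)],
    'Latency': [(2, 70), (12, 87), (9, 60), (11, 77)],
}

ax = df.plot.bar(rot=0)
plt.xlabel('Time')
plt.ylabel('(%)')
plt.ylim(0, 101)

drawn = []  # (column, x position, min, max) for each line drawn
# assumption: one BarContainer per column, in the same order as df.columns
for col, container in zip(df.columns, ax.containers):
    for patch, (min_y, max_y) in zip(container.patches, errors[col]):
        x_mid = patch.get_x() + patch.get_width() / 2  # horizontal center of the bar
        ax.vlines(x_mid, min_y, max_y, color='k')
        drawn.append((col, x_mid, min_y, max_y))
```

Because the lookup goes by column name and row position, duplicate bar heights within or across columns do not matter.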
Some zipping and stacking will suffice; see bar_min_maxs below. Simplifying and slightly generalizing Trenton's code:
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
# create dataframe
Delay = (53.46, 36.22, 83, 17)
Latency = (38, 44, 53, 10)
index = ['T=0', 'T=26', 'T=50','T=900']
df = pd.DataFrame({'Delay': Delay, 'Latency': Latency,
                   'Delay_min': (0, 12, 21, 12),  # supply min and max
                   'Delay_max': (60, 70, 54, 70),
                   'Latency_min': (2, 12, 9, 11),
                   'Latency_max': (70, 87, 60, 77)},
                  index=index)
# plot
ax = df[['Delay', 'Latency']].plot.bar(rot=0)
plt.xlabel('Time')
plt.ylabel('(%)')
plt.ylim(0, 101)
# bar_min_maxs[i] is bar/patch i's min, max
bar_min_maxs = np.vstack((list(zip(df['Delay_min'], df['Delay_max'])),
                          list(zip(df['Latency_min'], df['Latency_max']))))
assert len(bar_min_maxs) == len(ax.patches)
for patch, (min_y, max_y) in zip(ax.patches, bar_min_maxs):
    plt.vlines(patch.get_x() + patch.get_width()/2,
               min_y, max_y, color='k')
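In the data as posted, not every bar height lies inside its own min/max range. plt.vlines draws whatever range it is given, so those rows still plot, but the line will not reach the top of the bar. A quick check along these lines can flag them; the variable name out_of_range is introduced here purely for illustration:

```python
import pandas as pd

index = ['T=0', 'T=26', 'T=50', 'T=900']
df = pd.DataFrame({'Delay': (53.46, 36.22, 83, 17),
                   'Latency': (38, 44, 53, 10),
                   'Delay_min': (0, 12, 21, 12),
                   'Delay_max': (60, 70, 54, 70),
                   'Latency_min': (2, 12, 9, 11),
                   'Latency_max': (70, 87, 60, 77)},
                  index=index)

out_of_range = []  # (row label, column) pairs whose value is outside [min, max]
for col in ['Delay', 'Latency']:
    # element-wise check: min <= value <= max (inclusive on both ends)
    inside = df[col].between(df[col + '_min'], df[col + '_max'])
    out_of_range += [(row, col) for row in inside[~inside].index]

print(out_of_range)  # [('T=50', 'Delay'), ('T=900', 'Latency')]
```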
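If the error bars are instead given as a margin of error rather than a min and max, i.e., each error bar is centered at the bar's height and has length 2 x the margin of error, then here is code for the plot: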
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
# create dataframe
Delay = (53.46, 36.22, 83, 17)
Latency = (38, 44, 53, 10)
index = ['T=0', 'T=26', 'T=50','T=900']
df = pd.DataFrame({'Delay': Delay, 'Latency': Latency,
                   'Delay_moe': (5, 15, 25, 35),  # supply margin of error
                   'Latency_moe': (10, 20, 30, 40)},
                  index=index)
# plot
ax = df[['Delay', 'Latency']].plot.bar(rot=0)
plt.xlabel('Time')
plt.ylabel('(%)')
plt.ylim(0, 101)
# bar_moes[i] is bar/patch i's margin of error, i.e., half the length of an
# errorbar centered at the bar's height
bar_moes = np.ravel(df[['Delay_moe', 'Latency_moe']].values.T)
assert len(bar_moes) == len(ax.patches)
for patch, moe in zip(ax.patches, bar_moes):
    height = patch.get_height()  # of bar
    min_y, max_y = height - moe, height + moe
    plt.vlines(patch.get_x() + patch.get_width()/2,
               min_y, max_y, color='k')
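For the symmetric margin-of-error case, pandas can also draw the error bars itself through the yerr argument of the plot call, provided the error DataFrame uses the same column names as the values. The following is a minimal sketch of that route, not a replacement for the loop above. The names values and errors are introduced here for illustration, and the Agg backend is selected only so the script runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line for interactive use
import matplotlib.pyplot as plt
import pandas as pd

index = ['T=0', 'T=26', 'T=50', 'T=900']
values = pd.DataFrame({'Delay': (53.46, 36.22, 83, 17),
                       'Latency': (38, 44, 53, 10)}, index=index)

# margins of error; column names must match those of `values`
errors = pd.DataFrame({'Delay': (5, 15, 25, 35),
                       'Latency': (10, 20, 30, 40)}, index=index)

# each error bar spans height - moe to height + moe
ax = values.plot.bar(yerr=errors, capsize=3, rot=0)
ax.set_xlabel('Time')
ax.set_ylabel('(%)')
ax.set_ylim(0, 101)
```

This avoids iterating over ax.patches, at the cost of less control over how each line is drawn.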
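A small statistical note: if you're interested in the difference between the two groups (Delay and Latency at each T=t), add a plot of the difference, with error bars for the difference. A plot like the one above is not sufficient for analyzing the difference directly; for example, if the two error bars overlap at T=0, that does not imply that the difference between Delay and Latency is not statistically significant at whatever level was used. (If they do not overlap, though, the difference is statistically significant.)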
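A plot of the difference might be sketched as follows. This rests on a loud assumption that is not in the original posts: the Delay and Latency estimates are treated as independent and their margins of error as computed at the same confidence level, so the margin of error of the difference is taken as the square root of the sum of squares. If the two measurements are paired or correlated, this formula does not apply. The margins reuse the example values from the code above.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line for interactive use
import matplotlib.pyplot as plt
import numpy as np

index = ['T=0', 'T=26', 'T=50', 'T=900']
delay = np.array([53.46, 36.22, 83, 17])
latency = np.array([38, 44, 53, 10])
delay_moe = np.array([5, 15, 25, 35])
latency_moe = np.array([10, 20, 30, 40])

diff = delay - latency
# ASSUMPTION: independent estimates at the same confidence level,
# so the margins of error combine in quadrature
diff_moe = np.sqrt(delay_moe**2 + latency_moe**2)

fig, ax = plt.subplots()
ax.bar(index, diff, yerr=diff_moe, capsize=4)
ax.axhline(0, color='k', linewidth=0.8)  # reference line at zero difference
ax.set_xlabel('Time')
ax.set_ylabel('Delay - Latency')
```

An error bar that crosses the zero line then corresponds to a difference that is not distinguishable from zero at that level, under the stated assumption.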