How to add error bars to a grouped bar plot?

Question

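I want to add error bars to my plot so that I can show the min and max value for each bar. Could anyone help me with this? Thanks in advance.

The min and max values are as follows:

Delay = (53.46 (min 0, max 60), 36.22 (min 12, max 70), 83 (min 21, max 54), 17 (min 12, max 70))
Latency = (38 (min 2, max 70), 44 (min 12, max 87), 53 (min 9, max 60), 10 (min 11, max 77))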

import matplotlib.pyplot as plt
import pandas as pd
from pandas import DataFrame
from matplotlib.dates import date2num
import datetime

Delay = (53.46, 36.22, 83, 17)
Latency = (38, 44, 53, 10)
index = ['T=0', 'T=26', 'T=50','T=900']
df = pd.DataFrame({'Delay': Delay, 'Latency': Latency}, index=index)
ax = df.plot.bar(rot=0)
plt.xlabel('Time')
plt.ylabel('(%)')
plt.ylim(0, 101)
plt.savefig('TestX.png', dpi=300, bbox_inches='tight')
plt.show()

Answer 1

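To draw in the correct location on a bar plot, the patch data for each bar must be extracted.
df.plot.bar returns a single matplotlib.axes.Axes here (an ndarray with one Axes per column is returned only when subplots=True).
- For this figure, ax.patches contains 8 matplotlib.patches.Rectangle objects, one for each bar in each group.
  - Using the methods of these objects, the height, width, and x position of each bar can be extracted and used to draw a line with plt.vlines.
The height of each bar is used to look up the correct min and max in the dict z.
- Unfortunately, the patch data does not contain the bar label (e.g. Delay or Latency).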

import pandas as pd
import matplotlib.pyplot as plt

# create dataframe
Delay = (53.46, 36.22, 83, 17)
Latency = (38, 44, 53, 10)
index = ['T=0', 'T=26', 'T=50','T=900']
df = pd.DataFrame({'Delay': Delay, 'Latency': Latency}, index=index)

# dicts with errors
Delay_error = {53.46: {'min': 0,'max': 60}, 36.22: {'min': 12,'max': 70}, 83: {'min': 21,'max': 54}, 17: {'min': 12,'max': 70}}
Latency_error = {38: {'min': 2, 'max': 70}, 44: {'min': 12,'max': 87}, 53: {'min': 9,'max': 60}, 10: {'min': 11,'max': 77}}

# combine them; providing all the keys are unique
z = {**Delay_error, **Latency_error}

# plot
ax = df.plot.bar(rot=0)
plt.xlabel('Time')
plt.ylabel('(%)')
plt.ylim(0, 101)

for p in ax.patches:
    x = p.get_x()  # get the bottom left x corner of the bar
    w = p.get_width()  # get width of bar
    h = p.get_height()  # get height of bar
    min_y = z[h]['min']  # use h to get min from dict z
    max_y = z[h]['max']  # use h to get max from dict z
    plt.vlines(x+w/2, min_y, max_y, color='k')  # draw a vertical line

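If the two dicts share keys (non-unique values) and therefore cannot be combined, the correct dict can be selected based on the order in which the bars are plotted.
All the bars for a single label are plotted first.
- In this case, indices 0-3 are the Delay bars and indices 4-7 are the Latency bars.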

for i, p in enumerate(ax.patches):
    print(i, p)
    x = p.get_x()
    w = p.get_width()
    h = p.get_height()
    
    if i < len(ax.patches)/2:  # select which dictionary to use
        d = Delay_error
    else:
        d = Latency_error
        
    min_y = d[h]['min']
    max_y = d[h]['max']
    plt.vlines(x+w/2, min_y, max_y, color='k')
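
As a variation on the loops above, and not part of the original answer: the look-up by bar height can be avoided by walking ax.containers, which holds one group of bars per plotted column, in column order. The sketch below assumes a hypothetical `ranges` dict keyed by column name, with (min, max) pairs in row order, and selects the Agg backend only so that it runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs headless
import matplotlib.pyplot as plt
import pandas as pd

index = ['T=0', 'T=26', 'T=50', 'T=900']
df = pd.DataFrame({'Delay': (53.46, 36.22, 83, 17),
                   'Latency': (38, 44, 53, 10)}, index=index)

# hypothetical layout: (min, max) per row, keyed by column name
ranges = {
    'Delay':   [(0, 60), (12, 70), (21, 54), (12, 70)],
    'Latency': [(2, 70), (12, 87), (9, 60), (11, 77)],
}

ax = df.plot.bar(rot=0)
ax.set_ylim(0, 101)

centers = {}  # x position of each drawn line, kept for inspection
# one container of bars per column, in the order the columns were plotted
for column, container in zip(df.columns, ax.containers):
    for bar, (min_y, max_y) in zip(container, ranges[column]):
        x_center = bar.get_x() + bar.get_width() / 2
        ax.vlines(x_center, min_y, max_y, color='k')
        centers.setdefault(column, []).append(x_center)
```

Because the min and max are matched by column and row position, this still works when two bars have the same height or when Delay and Latency share a value.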

Answer 2

Some zipping and stacking is enough; see bar_min_maxs below. Simplifying and slightly generalizing Trenton's code:

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# create dataframe
Delay = (53.46, 36.22, 83, 17)
Latency = (38, 44, 53, 10)
index = ['T=0', 'T=26', 'T=50','T=900']
df = pd.DataFrame({'Delay': Delay, 'Latency': Latency,
                   'Delay_min':   (0,  12, 21, 12),  # supply min and max
                   'Delay_max':   (60, 70, 54, 70),
                   'Latency_min': (2,  12, 9,  11),
                   'Latency_max': (70, 87, 60, 77)},
                  index=index)

# plot
ax = df[['Delay', 'Latency']].plot.bar(rot=0)
plt.xlabel('Time')
plt.ylabel('(%)')
plt.ylim(0, 101)

# bar_min_maxs[i] is bar/patch i's min, max
bar_min_maxs = np.vstack((list(zip(df['Delay_min'], df['Delay_max'])),
                          list(zip(df['Latency_min'], df['Latency_max']))))
assert len(bar_min_maxs) == len(ax.patches)

for patch, (min_y, max_y) in zip(ax.patches, bar_min_maxs):
    plt.vlines(patch.get_x() + patch.get_width()/2,
               min_y, max_y, color='k')

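If the error bars are expressed as margins of error rather than mins and maxes, i.e., each error bar is centered at the bar's height and has length 2 x margin of error, then here is code to plot them: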

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# create dataframe
Delay = (53.46, 36.22, 83, 17)
Latency = (38, 44, 53, 10)
index = ['T=0', 'T=26', 'T=50','T=900']
df = pd.DataFrame({'Delay': Delay, 'Latency': Latency,
                   'Delay_moe':   (5,  15, 25, 35),  # supply margin of error
                   'Latency_moe': (10, 20, 30, 40)},
                  index=index)

# plot
ax = df[['Delay', 'Latency']].plot.bar(rot=0)
plt.xlabel('Time')
plt.ylabel('(%)')
plt.ylim(0, 101)

# bar_moes[i] is bar/patch i's margin of error, i.e., half the length of an
# errorbar centered at the bar's height
bar_moes = np.ravel(df[['Delay_moe', 'Latency_moe']].values.T)
assert len(bar_moes) == len(ax.patches)

for patch, moe in zip(ax.patches, bar_moes):
    height = patch.get_height() # of bar
    min_y, max_y = height - moe, height + moe
    plt.vlines(patch.get_x() + patch.get_width()/2,
               min_y, max_y, color='k')
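
For symmetric margins of error, pandas can also draw the error bars itself through the yerr argument of DataFrame.plot.bar, provided the error DataFrame uses the same column names as the plotted one. A minimal sketch, assuming the same margins of error as above; storing them in a separate `moe` DataFrame and using capsize=4 are illustrative choices, and the Agg backend is selected only so the code runs without a display:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs headless
import matplotlib.pyplot as plt
import pandas as pd

index = ['T=0', 'T=26', 'T=50', 'T=900']
values = pd.DataFrame({'Delay': (53.46, 36.22, 83, 17),
                       'Latency': (38, 44, 53, 10)}, index=index)

# margins of error, with column names matching `values`
moe = pd.DataFrame({'Delay': (5, 15, 25, 35),
                    'Latency': (10, 20, 30, 40)}, index=index)

# pandas pairs each column of `values` with the same-named column of `moe`
ax = values.plot.bar(rot=0, yerr=moe, capsize=4)
ax.set_xlabel('Time')
ax.set_ylabel('(%)')
ax.set_ylim(0, 101)
```

This covers the symmetric case only; for arbitrary min and max values, drawing the lines manually as in the code above remains the more direct route.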

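A small statistical note: if the difference between the two groups (Delay and Latency at each T=t) is of interest, then add a plot of the difference, with an error bar for the difference. A plot like the one above is not sufficient for analyzing differences directly; for example, if the two error bars overlap at T=0, this does not imply that the difference between Delay and Latency is not statistically significant at whatever level was used. (If they do not overlap, though, the difference is statistically significant.)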
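
To make the point about overlap concrete, here is a small numeric sketch. It rests on assumptions that are not in the thread: the two estimates are independent, both margins of error use the same confidence level, and the margin of the difference is then the root sum of squares of the two margins. The values and the helper `diff_margin` are hypothetical.

```python
import math

def diff_margin(moe_a, moe_b):
    """Margin of error of a difference, ASSUMING independent estimates."""
    return math.hypot(moe_a, moe_b)

# hypothetical estimates and margins of error
a, moe_a = 50.0, 10.0   # interval [40, 60]
b, moe_b = 35.0, 10.0   # interval [25, 45]

# the two individual error bars overlap ...
overlap = (a - moe_a) <= (b + moe_b) and (b - moe_b) <= (a + moe_a)

# ... yet the interval for the difference does not contain zero
diff = a - b
moe_diff = diff_margin(moe_a, moe_b)
excludes_zero = abs(diff) > moe_diff

print(overlap, round(moe_diff, 2), excludes_zero)
```

Here the bars overlap between 40 and 45, but the difference of 15 exceeds its margin of about 14.14, so under these assumptions the difference would still count as significant.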

2 answers

Answer 1: 1, accepted, 2020-09-13 01:29:27
Answer 2: 0, 2022-11-19 01:39:32
