How to boxplot multiple dictionaries on the same plot
For a change point detection task, I am testing my own algorithm against a baseline, and I would like to plot the results from the two algorithms on the same boxplot.
My results (F-score values) are stored in a dictionary where the keys are combinations of two parameters, a and b (both with 4 distinct values):
resultsOwnAlgorithm = {'a1, b1': [0.8, 0.7, 0.6, ...], 'a1, b2': [...], ..., 'a2, b1': [...], ...}
resultsBaseline = {'a1, b1': [0.7, 0.6, ...], 'a1, b2': [...], ..., 'a2, b1': [...], ...}
For now, I have a function to plot them individually. I create 4 subplots where a is fixed and b varies; see image (values are random, just to create an example image). The function looks like this:
def plotResults(results, keys, test):
    fig, axs = plt.subplots(2,2,figsize=(10,10))
    for ax in axs.flat:
        ax.set_ylim(0,1)
        ax.set_xticks(range(len(abrs)))
        ax.set_xticklabels(abrs)
    count = 0
    for i in (0,1):
        for j in (0,1):
            axs[i,j].set_title(str(test) + ', mean shift: ' + str(keys[count][0][0:2]).strip('x,') + ', iters=' + str(iterations), fontweight ="bold")
            l = keys[count]
            k = {k:results[k] for k in l if k in results}
            label, data = k.keys(), k.values()
            axs[i,j].boxplot(data,showfliers=False,patch_artist=True)
            axs[i,j].set_xticks(range(1, len(label) + 1))
            axs[i,j].set_xticklabels(label)
            count+=1
where results is either resultsOwnAlgorithm or resultsBaseline, keys holds the dictionary keys (the different combinations of a and b), and test is only used to put the name of the algorithm being plotted in the title.
My question is: how do I plot them side by side on the same plot?
There are a few errors in your plotting function, so I couldn't get it to work without making big assumptions, such as what abrs is and what iterations is. You should fix them before continuing with your work, because this function is likely picking them up from the global scope (assuming a Jupyter notebook), and that will lead to bugs later on, as I've painfully experienced before.
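One way to avoid relying on notebook globals is to pass everything the function needs as arguments. The following is only a minimal sketch of that idea, not a corrected version of the original function: the name plot_results, the key_groups argument (a list of four key lists, one per subplot), and the random placeholder scores are all assumptions made for illustration.

```python
import matplotlib.pyplot as plt
import numpy as np


def plot_results(results, key_groups, test, iterations):
    """Draw one boxplot panel per key group, using only the arguments passed in."""
    fig, axs = plt.subplots(2, 2, figsize=(10, 10))
    for ax, group in zip(axs.flat, key_groups):
        # Keep only the keys that are actually present in the results.
        present = [k for k in group if k in results]
        ax.boxplot([results[k] for k in present], showfliers=False, patch_artist=True)
        # Tick labels come from the keys themselves, not from a global list.
        ax.set_xticks(range(1, len(present) + 1))
        ax.set_xticklabels(present)
        ax.set_ylim(0, 1)
        ax.set_title(f"{test}, iters={iterations}", fontweight="bold")
    return fig, axs


# Placeholder data: random values standing in for real F-scores.
rng = np.random.default_rng(0)
key_groups = [[f"a{i}, b{j}" for j in range(1, 5)] for i in range(1, 5)]
results = {k: rng.uniform(0.4, 1.0, size=20) for group in key_groups for k in group}

fig, axs = plot_results(results, key_groups, test="own algorithm", iterations=20)
```

Because nothing is read from the enclosing scope, the function behaves the same in a fresh kernel as in a long-running notebook session. Call plt.show() or fig.savefig(...) afterwards to view the figure.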
Anyway, your problem can be tackled first by adapting your code to use seaborn. See the example "Draw a boxplot with nested grouping by two categorical variables" in the seaborn documentation.
The method that can be more easily adapted to your use case is this: generate a set of x values, one for each boxplot group. Then shift each boxplot slightly to the left or right of its group's x value, depending on where you want to place it. After that you have to fix the ticks and so on, but you already know how to do that. Here's an example that keeps as much of your structure as possible.
import matplotlib.pyplot as plt
import numpy as np

np.random.seed(42)
resultsOwnAlgorithm = {'a1, b1': np.random.normal(scale=2, size=20), 'a2, b2': np.random.normal(scale=1.5, size=20)}
resultsBaseline = {'a1, b1': np.random.normal(scale=2, size=20), 'a2, b2': np.random.normal(scale=1.5, size=20)}

# one x value per boxplot group
x_vals = np.arange(0, len(resultsOwnAlgorithm))
xs = {key:val for key, val in zip(resultsOwnAlgorithm.keys(), x_vals)}
shift = 0.1

fig, ax = plt.subplots()
for key in resultsOwnAlgorithm.keys():
    # own algorithm slightly to the left, baseline slightly to the right
    ax.boxplot(resultsOwnAlgorithm[key], positions=[xs[key] - shift], boxprops=dict(color='r'))
    ax.boxplot(resultsBaseline[key], positions=[xs[key] + shift], boxprops=dict(color='b'))
ax.set_xticks(x_vals)
ax.set_xticklabels(resultsOwnAlgorithm.keys())
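The same shifting idea could be carried over to the 2x2 subplot layout from the question. This is a rough sketch under a few assumptions: plot_side_by_side, key_groups, shift, and width are hypothetical names, the scores are random placeholders, and each subplot is assumed to show one value of a with the four values of b along the x axis.

```python
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.patches import Patch


def plot_side_by_side(own, baseline, key_groups, shift=0.2, width=0.3):
    """One subplot per key group; both algorithms share each x position."""
    fig, axs = plt.subplots(2, 2, figsize=(10, 10), sharey=True)
    series = ((own, -shift, "tab:red"), (baseline, shift, "tab:blue"))
    for ax, group in zip(axs.flat, key_groups):
        x_vals = np.arange(len(group))
        for results, offset, color in series:
            # Shift every box of this algorithm left or right of the group centre.
            ax.boxplot(
                [results[k] for k in group],
                positions=x_vals + offset,
                widths=width,
                showfliers=False,
                patch_artist=True,
                boxprops=dict(facecolor=color),
            )
        # boxplot puts ticks at the shifted positions, so reset them to the centres.
        ax.set_xticks(x_vals)
        ax.set_xticklabels(group)
        ax.set_ylim(0, 1)
    # Proxy patches give the figure a single shared legend.
    handles = [
        Patch(facecolor="tab:red", label="own algorithm"),
        Patch(facecolor="tab:blue", label="baseline"),
    ]
    fig.legend(handles=handles, loc="upper right")
    return fig, axs


# Placeholder data: random values standing in for real F-scores.
rng = np.random.default_rng(0)
key_groups = [[f"a{i}, b{j}" for j in range(1, 5)] for i in range(1, 5)]
all_keys = [k for group in key_groups for k in group]
own = {k: rng.uniform(0.5, 1.0, size=20) for k in all_keys}
baseline = {k: rng.uniform(0.3, 0.9, size=20) for k in all_keys}

fig, axs = plot_side_by_side(own, baseline, key_groups)
```

The shift and width values are a matter of taste; the only constraint is that width stays below twice the shift so the two boxes in a group do not overlap.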
The easiest solution is probably to combine all the dictionaries into a single pandas.DataFrame. This will make the data easy to analyze and plot.
Combine the list of DataFrames with pd.concat, and reset the index.
Reshape the DataFrame to long form using pd.DataFrame.melt.
Seaborn is a high-level API for matplotlib that makes it easy to plot long-form data and separate groups with the hue parameter.
Use the figure-level plot sns.catplot with kind='box', or use the axes-level plot sns.boxplot.
Seaborn scales more easily than using multiple calls to matplotlib.axes.Axes.boxplot, where positions= must be specified for each extra group of boxplots.
Tested in python 3.10, pandas 1.4.2, matplotlib 3.5.1, seaborn 0.11.2
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

# create sample dictionaries
np.random.seed(2022)
custom = {f'a{v}, b{v}': np.random.normal(scale=v, size=100) for v in range(1, 5)}
baseline = {f'a{i}, b{i}': np.random.normal(scale=v, size=100) for i, v in enumerate(np.arange(1.5, 5.5), 1)}

# create and shape dataframe
dfs = list()
for d, _id in zip([baseline, custom], ['baseline', 'custom']):
    df = pd.DataFrame(d)
    df['Algorithm'] = _id  # tag each row with the algorithm it came from
    dfs.append(df)
dfs = pd.concat(dfs).reset_index(drop=True)
dfm = dfs.melt(id_vars='Algorithm', var_name='Parameters', value_name='Score')

# plot
g = sns.catplot(kind='box', data=dfm, x='Parameters', y='Score', hue='Algorithm', height=6, aspect=2)
plt.show()
dfs.head()

     a1, b1    a2, b2    a3, b3    a4, b4 Algorithm
0  0.834463 -1.092923  4.875117 -4.946214  baseline
1  1.338891  0.225008 -0.305499  0.570333  baseline
2  0.261615  2.128844  2.194177  0.494803  baseline
3  0.273740 -2.395624 -3.495572  0.006312  baseline
4 -0.997368  0.984808 -3.956302  0.206667  baseline

dfm.head()

  Algorithm Parameters     Score
0  baseline     a1, b1  0.834463
1  baseline     a1, b1  1.338891
2  baseline     a1, b1  0.261615
3  baseline     a1, b1  0.273740
4  baseline     a1, b1 -0.997368
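For the axes-level alternative, sns.boxplot can draw onto an Axes you create yourself, which is handy when the plot has to sit inside an existing figure. The sketch below is self-contained and rebuilds similar sample data with a NumPy Generator, so the numbers will not match the tables above; the figure size and title are arbitrary choices.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

# Sample data only: random values, not real scores.
rng = np.random.default_rng(2022)
custom = {f"a{v}, b{v}": rng.normal(scale=v, size=100) for v in range(1, 5)}
baseline = {f"a{v}, b{v}": rng.normal(scale=v + 0.5, size=100) for v in range(1, 5)}

# Build one wide frame per algorithm, stack them, then melt to long form.
frames = [
    pd.DataFrame(d).assign(Algorithm=name)
    for d, name in ((baseline, "baseline"), (custom, "custom"))
]
dfm = pd.concat(frames, ignore_index=True).melt(
    id_vars="Algorithm", var_name="Parameters", value_name="Score"
)

# Axes-level plot: seaborn draws on the Axes passed via ax=.
fig, ax = plt.subplots(figsize=(12, 6))
sns.boxplot(data=dfm, x="Parameters", y="Score", hue="Algorithm", ax=ax)
ax.set_title("Baseline vs. custom algorithm")
```

Since ax is an ordinary matplotlib Axes, the usual methods such as ax.set_ylim or ax.set_xticklabels remain available afterwards.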