Plotting error bars from a dataframe using Seaborn FacetGrid
I want to plot error bars from a column in a pandas dataframe on a Seaborn FacetGrid.
import matplotlib.pyplot as plt
import numpy as np  # needed for np.random.randn below
import pandas as pd
import seaborn as sns
df = pd.DataFrame({'A' : ['foo', 'bar', 'foo', 'bar']*2,
                   'B' : ['one', 'one', 'two', 'three',
                          'two', 'two', 'one', 'three'],
                   'C' : np.random.randn(8),
                   'D' : np.random.randn(8)})
df
Example dataframe
A B C D
0 foo one 0.445827 -0.311863
1 bar one 0.862154 -0.229065
2 foo two 0.290981 -0.835301
3 bar three 0.995732 0.356807
4 foo two 0.029311 0.631812
5 bar two 0.023164 -0.468248
6 foo one -1.568248 2.508461
7 bar three -0.407807 0.319404
This code works for fixed-size error bars:
g = sns.FacetGrid(df, col="A", hue="B", size =5)
g.map(plt.errorbar, "C", "D",yerr=0.5, fmt='o');
But I can't get it to work using values from the dataframe:
df['E'] = abs(df['D']*0.5)
g = sns.FacetGrid(df, col="A", hue="B", size =5)
g.map(plt.errorbar, "C", "D", yerr=df['E']);
or
g = sns.FacetGrid(df, col="A", hue="B", size =5)
g.map(plt.errorbar, "C", "D", yerr='E');
Both produce screeds of errors.
EDIT:
After lots of matplotlib doc reading and assorted Stack Overflow answers, here is a pure matplotlib solution:
# define a color palette index based on column 'B'
# (.codes is the current name of the old .labels attribute)
df['cind'] = pd.Categorical(df['B']).codes
# the sorted categories in column 'A'
cats = df['A'].unique()
cats.sort()
# get the seaborn colour palette and convert to array
cp = sns.color_palette()
cpa = np.array(cp)
# draw a subplot for each category in column "A"
fig, axs = plt.subplots(nrows=1, ncols=len(cats), sharey=True)
for i, ax in enumerate(axs):
    df_sub = df[df['A'] == cats[i]]
    col = cpa[df_sub['cind']]
    ax.scatter(df_sub['C'], df_sub['D'], c=col)
    # fmt='none' draws only the error bars (fmt=None was deprecated in later matplotlib releases)
    eb = ax.errorbar(df_sub['C'], df_sub['D'], yerr=df_sub['E'], fmt='none')
    # eb.lines is (data line, cap lines, bar line collections); recolor the bars per point
    _, _, (bars,) = eb.lines
    bars.set_color(col)
Other than the labels and axis limits, it's OK. It plots a separate subplot for each category in column 'A', colored by the category in column 'B'. (Note that the random data is different from that above.)
I'd still like a pandas/seaborn solution if anyone has any ideas.
When using FacetGrid.map, anything that refers to the data DataFrame must be passed as a positional argument. This will work in your case because yerr is the third positional argument for plt.errorbar, though to demonstrate I'm going to use the tips dataset:
from scipy import stats
tips_all = sns.load_dataset("tips")
tips_grouped = tips_all.groupby(["smoker", "size"])
tips = tips_grouped.mean()
tips["CI"] = tips_grouped.total_bill.apply(stats.sem) * 1.96
tips.reset_index(inplace=True)
I can then plot using FacetGrid and errorbar:
g = sns.FacetGrid(tips, col="smoker", size=5)
g.map(plt.errorbar, "size", "total_bill", "CI", marker="o")
However, keep in mind that there are seaborn plotting functions for going from a full dataset to plots with error bars (using bootstrapping), so for a lot of applications this may not be necessary. For example, you could use factorplot:
sns.factorplot("size", "total_bill", col="smoker",
data=tips_all, kind="point")
Or lmplot:
sns.lmplot("size", "total_bill", col="smoker",
data=tips_all, fit_reg=False, x_estimator=np.mean)
You aren't showing what df['E'] actually is, or whether it is a list of the same length as df['C'] and df['D'].
The yerr keyword argument (kwarg) takes either a single value, which is applied to every element in the lists for keys C and D from the dataframe, or a list of values the same length as those lists.
So C, D, and E must all be associated with lists of the same length, or C and D must be lists of the same length and E must be associated with a single float or int. If that single float or int is inside a list, you must extract it, like df['E'][0].
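As a rough illustration of those two cases, here is a minimal matplotlib-only sketch. The data values, the variable names, and the bar_heights helper are made up for the example, and it assumes a non-interactive backend.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so this runs without a display
import matplotlib.pyplot as plt
import numpy as np

# Illustrative data, not taken from the question
x = np.arange(4)
y = np.array([1.0, 2.0, 1.5, 3.0])

fig, (ax_scalar, ax_array) = plt.subplots(ncols=2, sharey=True)

# Case 1: a single number, applied to every point
eb_scalar = ax_scalar.errorbar(x, y, yerr=0.5, fmt="o")

# Case 2: one value per point, same length as x and y
per_point = np.array([0.1, 0.2, 0.3, 0.4])
eb_array = ax_array.errorbar(x, y, yerr=per_point, fmt="o")


def bar_heights(container):
    """Hypothetical helper: full height of each vertical error bar."""
    (bars,) = container.lines[2]  # LineCollection holding the y error bars
    return np.array([seg[1, 1] - seg[0, 1] for seg in bars.get_segments()])
```

Each bar extends yerr above and below its point, so bar_heights(eb_scalar) should be constant while bar_heights(eb_array) should vary from point to point.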
Example matplotlib code with yerr: http://matplotlib.org/1.2.1/examples/pylab_examples/errorbar_demo.html
Bar plot API documentation describing yerr: http://matplotlib.org/api/pyplot_api.html#matplotlib.pyplot.bar