
95% confidence interval in pandas and matplotlib

I don't know why this is taking me so long to figure out, but I cannot find a way to plot the confidence interval of my data as an error bar.

I have some data in a Python list.

I found this code in another question; it calculates the 95% confidence interval of some data.

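import numpy as np
import scipy.stats

def mean_confidence_interval(data, confidence=0.95):
    a = 1.0 * np.array(data)
    n = len(a)
    m, se = np.mean(a), scipy.stats.sem(a)
    # h is the half-width of the two-sided t interval around the mean
    h = se * scipy.stats.t.ppf((1 + confidence) / 2., n-1)
    return m, m-h, m+h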

I am using this to get the confidence interval of one bar of my bar chart. The question is: how do I plot the error bar, given that the function returns a triple? Do I just plot the max of these values for each bar?

Edit

I tried to implement what was suggested in the comments. Say my chart has 3 bars. I created a 2x3 list whose first row holds the m-h value of each bar and whose second row holds the m+h value of each bar. Passing this to the chart, however, produces some strange error bars (for example, one bar spans beyond 500, although none of my error values is that large).

[[200.0446804785922, 109.31657288869792, 93.43052190866868], 
[200.0957195214078, 222.0113671113021, 217.6619980913313]]

Using Seaborn and Pandas this is really easy:

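import pandas as pd
import seaborn as sns

# your_list is a placeholder: one (x, y, group) row per observation
pd_df = pd.DataFrame(your_list, columns=['x_data', 'y_data', 'group_categories'])
# ci=95 draws a 95% confidence band around the mean of y at each x;
# seaborn >= 0.12 deprecates ci in favor of errorbar=('ci', 95)
sns.lineplot(data=pd_df, 
             x='x_data', y='y_data', hue='group_categories', ci=95,
             legend="full", palette="Set1")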
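
For the plain matplotlib bar chart in the question, a likely cause of the oversized error bars is that `yerr` is read as distances from the bar height, not as absolute lower and upper bounds. Passing the half-width `h` (or a 2xN array of `[m - lower, upper - m]`) should give the intended bars. Below is a minimal sketch using the bounds posted in the edit. The bar labels and the output file name are made up for illustration, and the means are recovered from the bounds on the assumption that each interval is symmetric, as it is for `mean_confidence_interval`.

```python
import matplotlib.pyplot as plt
import numpy as np

# Bounds from the question's edit: first row is m-h, second row is m+h
bounds = np.array([
    [200.0446804785922, 109.31657288869792, 93.43052190866868],
    [200.0957195214078, 222.0113671113021, 217.6619980913313],
])
lower, upper = bounds

# The t interval is symmetric, so the mean and half-width follow from the bounds
means = (lower + upper) / 2
half_widths = (upper - lower) / 2

labels = ["bar 1", "bar 2", "bar 3"]  # hypothetical labels

fig, ax = plt.subplots()
# yerr takes offsets from each bar's height, so pass h rather than m-h and m+h
ax.bar(labels, means, yerr=half_widths, capsize=5)
ax.set_ylabel("mean with 95% confidence interval")
fig.savefig("ci_bars.png")  # hypothetical file name; use plt.show() interactively
```

If the raw data is still available, the same result comes from keeping `m` and `m+h - m` from `mean_confidence_interval` for each bar instead of reconstructing them from the bounds.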
