How to plot a fitted curve over a categorical boxplot in PyPlot? Why does the result differ from the same plot in Google Sheets?

I have the following CSV data:

Dataset Size,MAPE,MAE,STD MAPE,STD MAE
35000,0.0715392337,23.38300578,0.9078698348,2.80407539
26250,0.06893431034,22.34732326,0.9833948236,1.926517044
17500,0.0756695622,26.0900766,0.6055443674,8.842862631
8750,0.07176532526,23.02646184,0.8284005282,2.190506033
4200,0.08661127364,29.89234607,0.9395831421,7.587818412
2100,0.08072315267,27.20110884,0.03956974712,4.948606892
1050,0.07505202908,27.04025924,0.841966778,4.550482956
700,0.07703248113,26.17923045,0.4468447145,1.523638508
350,0.08695408769,32.35331585,0.7891190087,4.18648457
200,0.09770903032,30.96197823,0.04648972591,3.892800694
170,0.1202382169,41.87828814,0.7257680584,6.70453713
150,0.1960949784,77.20321559,0.5661066006,21.57418682

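From the data above, I would like to generate the following plot using matplotlib or something similar (seaborn, pandas, etc.):

[Figure: example plot generated in Google Sheets]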

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from scipy.optimize import curve_fit


def exponenial_func(x, a, b, c):
    return a * np.exp(-b * x) + c


def myplot(data_file):
    df = pd.read_csv(data_file)
    print(df.head())

    fig, ax = plt.subplots()

    # Exponential fit; note that x is the bar position (0, 1, 2, ...),
    # not the dataset size
    positions = np.arange(len(df), dtype=float)
    popt, pcov = curve_fit(exponenial_func, positions, df['MAPE'],
                           p0=(0, 0.0145, 0.0823))
    xp = np.linspace(0, len(df), 100)
    ax.plot(xp, exponenial_func(xp, *popt), color='g')

    # Bar plot with error bars
    ax.bar([str(s) for s in df['Dataset Size']], df['MAPE'], yerr=df['STD MAPE'])
    plt.title('Accuracy of Model vs. Dataset Size')
    plt.xlabel('Dataset Size')
    plt.ylabel('Mean Absolute Percentage Error')
    fig.tight_layout()
    plt.show()

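The plot I get is as follows:

[Figure: plot generated by the code above]

Why does my code end up with a straight line instead of a curve, even though I am fitting an exponential function to the data? (The Google Sheets plot does the same thing, i.e. it fits an exponential curve to the data.)

After trying a few functions, I think I can say with reasonable confidence that the Google Sheets exponential function has a form close to this: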

def sheetey_exponential_function(x, a, b, c):
    return a * b ** (x + c)
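
Note that `a * b ** (x + c)` equals `(a * b ** c) * exp(ln(b) * x)`, so it describes the same family of curves as the two-parameter form `A * exp(k * x)`. A minimal sketch of fitting that form by linear least squares on `log(MAPE)` is shown below. Both the log-linear fitting method and the use of bar positions as `x` are assumptions made for illustration; nothing in this thread confirms how Google Sheets actually computes its trendline.

```python
import numpy as np

# MAPE column from the CSV in the question, in file order (largest dataset first).
mape = np.array([
    0.0715392337, 0.06893431034, 0.0756695622, 0.07176532526,
    0.08661127364, 0.08072315267, 0.07505202908, 0.07703248113,
    0.08695408769, 0.09770903032, 0.1202382169, 0.1960949784,
])

# Assumption: x is the bar position 0, 1, 2, ... as in the question's code.
positions = np.arange(len(mape), dtype=float)


def sheetey_exponential_function(x, a, b, c):
    return a * b ** (x + c)


# Linear least squares on log(y) = log(A) + k * x.
k, log_A = np.polyfit(positions, np.log(mape), 1)
A = np.exp(log_A)

# The same curve written in both forms: a = A, b = e**k, c = 0.
two_param = A * np.exp(k * positions)
three_param = sheetey_exponential_function(positions, A, np.exp(k), 0.0)
```

Because `c` only rescales `a`, a three-parameter `curve_fit` on this form cannot determine `a` and `c` independently; fixing `c = 0` as above avoids that ambiguity.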

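The problem is that the horizontal axis is not linear; it is in fact inverse-linear. So, if you want your fit to look like an exponential function, you need to replace x with 1/x: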

def exponenial_func(x, a, b, c):
    return a*np.exp(-b/x)+c

The result is as follows: [Figure: plot with the refitted curve]
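
One way to try the 1/x idea is sketched below. It assumes the fit is done against the actual dataset sizes rather than the bar positions, which the answer does not state explicitly. The model is the same family as `a*np.exp(-b/x)+c`, only rescaled (with `a = d * exp(-k)` and `b = -k * x_min`) so that the parameters are of moderate size for the optimizer; the names `inverse_exponential`, `d`, `k`, and `plot_fit`, as well as the starting values in `p0`, are hypothetical choices for this sketch.

```python
import numpy as np
from scipy.optimize import curve_fit

# Columns from the CSV in the question, in file order.
sizes = np.array([35000, 26250, 17500, 8750, 4200, 2100,
                  1050, 700, 350, 200, 170, 150], dtype=float)
mape = np.array([
    0.0715392337, 0.06893431034, 0.0756695622, 0.07176532526,
    0.08661127364, 0.08072315267, 0.07505202908, 0.07703248113,
    0.08695408769, 0.09770903032, 0.1202382169, 0.1960949784,
])

X_MIN = sizes.min()  # smallest dataset size, used only for rescaling


def inverse_exponential(x, d, k, c):
    # Equivalent to a * exp(-b / x) + c with a = d * exp(-k), b = -k * X_MIN.
    return c + d * np.exp(k * (X_MIN / x - 1.0))


# Hypothetical starting values: c near the plateau, d near the excess
# error at the smallest dataset size.
p0 = (0.1, 5.0, 0.075)
popt, pcov = curve_fit(inverse_exponential, sizes, mape, p0=p0)

# Evaluate the fitted curve at each dataset size, i.e. once per bar.
fitted = inverse_exponential(sizes, *popt)


def plot_fit(ax):
    # Draw the bars and the fitted values on the same categorical axis.
    labels = [str(int(s)) for s in sizes]
    ax.bar(labels, mape)
    ax.plot(labels, fitted, color="g")
```

To draw it, create an axes with `fig, ax = plt.subplots()`, call `plot_fit(ax)`, and then `plt.show()`. Since the curve is evaluated once per category, it is drawn as line segments between bar centres rather than as a smooth curve.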
