How to plot a fitted curve over a categorical boxplot in PyPlot? Why does the result differ from the same plot in Google Sheets?

I have the following csv data:

Dataset Size,MAPE,MAE,STD MAPE,STD MAE
35000,0.0715392337,23.38300578,0.9078698348,2.80407539
26250,0.06893431034,22.34732326,0.9833948236,1.926517044
17500,0.0756695622,26.0900766,0.6055443674,8.842862631
8750,0.07176532526,23.02646184,0.8284005282,2.190506033
4200,0.08661127364,29.89234607,0.9395831421,7.587818412
2100,0.08072315267,27.20110884,0.03956974712,4.948606892
1050,0.07505202908,27.04025924,0.841966778,4.550482956
700,0.07703248113,26.17923045,0.4468447145,1.523638508
350,0.08695408769,32.35331585,0.7891190087,4.18648457
200,0.09770903032,30.96197823,0.04648972591,3.892800694
170,0.1202382169,41.87828814,0.7257680584,6.70453713
150,0.1960949784,77.20321559,0.5661066006,21.57418682

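From the data above, I want to generate the following plot using matplotlib or something similar (seaborn, pandas, etc.):

[Figure: example plot generated in Google Sheets]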

from pathlib import Path
from matplotlib import animation
import seaborn as sns
import matplotlib.pyplot as plt
import pandas as pd
import numpy as np
from scipy.optimize import curve_fit

nr_datapoints = 10
def exponenial_func(x, a, b, c):
    return a*np.exp(-b*x)+c
def myplot(data_file):
    df = pd.read_csv(data_file)
    print(df.head())

    fig, ax = plt.subplots()

    # Exponential line fit
    popt, pcov = curve_fit(exponenial_func, np.array([float(i) for i in range(len(df['Dataset Size']))]), df['MAPE'], p0=(0, 0.0145, 0.0823))
    xp = np.linspace(0,len(df['Dataset Size']), 100)  
    plt.plot(xp, exponenial_func(xp, *popt), color = 'g')
    # bar plot with error bars
    ax.bar([str(s) for s in df['Dataset Size']], df['MAPE'], yerr=df['STD MAPE'])
    plt.title('Accuracy of Model vs. Dataset Size')
    plt.xlabel('Dataset Size')
    plt.ylabel('Mean Absolute Percentage Error')
    fig.tight_layout()
    plt.show()

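The plot I get is as follows: [Figure: plot generated by the code above]

Even though I fit an exponential function to the data, why does my code end up with a line instead of a curve? (The Google Sheets plot is supposed to do the same thing, i.e. fit an exponential curve to the data.)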

Having tried a few functions, I think I can safely say that the Google Sheets exponential function has a form close to this:

def sheetey_exponential_function(x, a, b, c):
    return a * b ** (x + c)
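
As a rough illustration of that form: `a * b ** (x + c)` can be rewritten as `A * exp(k * x)` with `A = a * b ** c` and `k = ln(b)`, so it can be fitted with a straight-line fit on `log(y)`. The sketch below assumes the x values are the bar positions 0, 1, 2, ... and fixes `c = 0` because `c` is redundant with `a`. Whether Google Sheets computes its trendline exactly this way is an assumption, not something confirmed here.

```python
import numpy as np

def sheetey_exponential_function(x, a, b, c):
    return a * b ** (x + c)

# MAPE column from the question, in the order the bars are drawn.
mape = np.array([0.0715392337, 0.06893431034, 0.0756695622, 0.07176532526,
                 0.08661127364, 0.08072315267, 0.07505202908, 0.07703248113,
                 0.08695408769, 0.09770903032, 0.1202382169, 0.1960949784])

# Assumption: x is the bar position on the categorical axis, not the dataset size.
x = np.arange(len(mape), dtype=float)

# log(y) = log(A) + k * x is a straight line, so a degree-1 polyfit recovers k and log(A).
k, log_A = np.polyfit(x, np.log(mape), 1)

# Map back to the (a, b, c) form; c is fixed to 0 because it is redundant with a.
a, b, c = np.exp(log_A), np.exp(k), 0.0
trend = sheetey_exponential_function(x, a, b, c)
```

Because the fit is done on `log(y)`, it weights relative errors rather than absolute ones, so it will generally not give the same parameters as `curve_fit` on the raw values.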


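The problem is that the horizontal axis is not linear; in fact it is inverse-linear. So if you want your fit to look like an exponential function, you need to replace x with 1/x: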

def exponenial_func(x, a, b, c):
    return a*np.exp(-b/x)+c

The result looks like this: [figure]
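
A minimal sketch of how the 1/x form might be fitted and lined up with the bars, assuming x is the actual `Dataset Size` column. To avoid depending on an initial guess, it swaps `curve_fit` for a simple grid scan over `b`, solving for `a` and `c` by linear least squares at each candidate (the model is linear in `a` and `c` once `b` is fixed). The helper `fit_inverse_exponential` and the search range for `b` are hypothetical choices made for illustration; `curve_fit` on the same function would be an equally reasonable route.

```python
import numpy as np

def exponenial_func(x, a, b, c):
    # Same form as the function above (name spelled as in the question).
    return a * np.exp(-b / x) + c

# Dataset Size and MAPE columns from the question, in bar order.
sizes = np.array([35000, 26250, 17500, 8750, 4200, 2100,
                  1050, 700, 350, 200, 170, 150], dtype=float)
mape = np.array([0.0715392337, 0.06893431034, 0.0756695622, 0.07176532526,
                 0.08661127364, 0.08072315267, 0.07505202908, 0.07703248113,
                 0.08695408769, 0.09770903032, 0.1202382169, 0.1960949784])

def fit_inverse_exponential(x, y, b_grid):
    """Hypothetical helper: scan b, solve a and c by linear least squares."""
    best = None
    for b in b_grid:
        # For a fixed b the model is y = a * exp(-b / x) + c, linear in (a, c).
        design = np.column_stack([np.exp(-b / x), np.ones_like(x)])
        (a, c), *_ = np.linalg.lstsq(design, y, rcond=None)
        sse = float(np.sum((design @ np.array([a, c]) - y) ** 2))
        if best is None or sse < best[0]:
            best = (sse, a, b, c)
    return best

# Assumed search range: negative b makes the curve rise as the dataset shrinks.
b_grid = np.linspace(-3000.0, -10.0, 300)
sse, a, b, c = fit_inverse_exponential(sizes, mape, b_grid)

# Evaluate at the real sizes, but draw at the categorical bar positions 0..n-1.
fitted = exponenial_func(sizes, a, b, c)
positions = np.arange(len(sizes))
```

Since matplotlib places string categories at positions 0, 1, 2, ..., a call such as `ax.plot(positions, fitted, color='g')` after the `ax.bar(...)` call should put each fitted value over its own bar.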
