Exponential fit in pandas

I have this data:

import numpy as np
import pandas as pd

puf = pd.DataFrame({'id': [1, 2, 3, 4, 5, 6, 7, 8],
                    'val': [850, 1889, 3289, 6083, 10349, 17860, 28180, 41236]})

The data seems to follow an exponential curve. Let's see the plot:

puf.plot('id','val')

[image: line plot of val against id]

I want to fit an exponential curve, $$ y = Ae^{Bx} $$ (A times e to the power B*x), and add it as a column in pandas. First I tried taking the log of the values:

puf['log_val'] = np.log(puf['val'])

Then I tried to use NumPy to fit the equation:

puf['fit'] = np.polyfit(puf['id'],puf['log_val'],1)

But I get an error:

ValueError: Length of values (2) does not match length of index (8)

My expected result is the fitted values as a new column in pandas. I attach an image with the fitted-values column I want (in orange):

[image: the data with the desired fitted values in orange]

I'm stuck on this code and I'm not sure what I'm doing wrong. How can I create a new column with my fitted values?

Note that you asked for an exponential model, yet the results you have are for a log-linear model.

Check out the work below:

For the log-linear model we are fitting E(log(Y)), i.e. the residuals are log(y) - (log(b[0]) + b[1]*x):

import numpy as np
from scipy.optimize import least_squares

# residuals on the log scale: log(y) - (log(b[0]) + b[1]*x)
least_squares(lambda b: np.log(puf['val']) - (np.log(b[0]) + b[1] * puf['id']),
              [1, 1])['x']
# array([5.99531305e+02, 5.51106793e-01])

These are the values that Excel gives.

On the other hand, when fitting an exponential curve the randomness is on Y and not on its logarithm: E(Y) = b[0]*exp(b[1]*x). Hence we have:

# residuals on the original scale: y - b[0]*exp(b[1]*x)
least_squares(lambda b: puf['val'] - b[0] * np.exp(b[1] * puf['id']), [0, 1])['x']
# array([1.08047304e+03, 4.58116127e-01])  # correct results for exponential fit

Depending on your model choice, the values are a little different.
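As a cross-check on the exponential fit, the same least-squares problem can be handed to scipy.optimize.curve_fit. This is only a sketch: the helper name exp_model, the column name fit_exp and the starting values p0 (borrowed from the log-linear fit) are assumptions, not part of the original answer.

```python
import numpy as np
import pandas as pd
from scipy.optimize import curve_fit

puf = pd.DataFrame({'id': [1, 2, 3, 4, 5, 6, 7, 8],
                    'val': [850, 1889, 3289, 6083, 10349, 17860, 28180, 41236]})

def exp_model(x, a, b):
    """Exponential curve y = a * exp(b * x) (hypothetical helper)."""
    return a * np.exp(b * x)

x = puf['id'].to_numpy(dtype=float)
y = puf['val'].to_numpy(dtype=float)

# p0 is an assumed starting point, taken from the log-linear estimates
(a, b), _ = curve_fit(exp_model, x, y, p0=(600.0, 0.55))

# Fitted values as a new column
puf['fit_exp'] = exp_model(x, a, b)
print(a, b)
```

The estimates should land close to the least_squares values quoted above, since both routines minimise the same sum of squared residuals on the original scale.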

Better model? Since both models have the same number of parameters, consider the one that gives you the lower deviance or the better out-of-sample prediction.
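One rough way to compare the two fits in-sample is the residual sum of squares, used here as a stand-in for the deviance (an assumption; a proper comparison would also hold out data). The sketch below plugs in the parameter values quoted above and computes the sum of squares on both the original and the log scale. The helper name sse is hypothetical.

```python
import numpy as np

x = np.array([1, 2, 3, 4, 5, 6, 7, 8], dtype=float)
y = np.array([850, 1889, 3289, 6083, 10349, 17860, 28180, 41236], dtype=float)

# Parameter estimates quoted in the answer above
log_lin = (5.99531305e+02, 5.51106793e-01)
expo = (1.08047304e+03, 4.58116127e-01)

def sse(observed, predicted):
    """Residual sum of squares (hypothetical helper)."""
    return float(np.sum((observed - predicted) ** 2))

pred_log_lin = log_lin[0] * np.exp(log_lin[1] * x)
pred_expo = expo[0] * np.exp(expo[1] * x)

# Original scale: this is what the exponential fit minimises
sse_log_lin = sse(y, pred_log_lin)
sse_expo = sse(y, pred_expo)

# Log scale: this is what the log-linear fit minimises
sse_log_log_lin = sse(np.log(y), np.log(pred_log_lin))
sse_log_expo = sse(np.log(y), np.log(pred_expo))

print(sse_log_lin, sse_expo)
print(sse_log_log_lin, sse_log_expo)
```

Each model wins on the scale it was fitted on, so the in-sample comparison mostly tells you which error structure you believe in; out-of-sample prediction is the more informative check.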

Note that the ideal exponential model is E(Y) = A'B'^X, which for comparison can be written as log(E(Y)) = A + XB, while the log-linear model is E(log(Y)) = A + XB. Note the difference in where the expectation is taken.
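Written out, with A = log A' and B = log B' as in the paragraph above, the two models differ only in where the expectation sits. The error terms and the final inequality (Jensen's inequality for the concave logarithm) are added here for illustration and are not part of the original answer.

```latex
% Exponential model: additive error on the original scale
Y = A' B'^{X} + \varepsilon, \quad E(\varepsilon) = 0
\;\Longrightarrow\; \log E(Y) = A + XB

% Log-linear model: additive error on the log scale
\log Y = A + XB + \eta, \quad E(\eta) = 0
\;\Longrightarrow\; E(\log Y) = A + XB

% Jensen's inequality: the two left-hand sides are not interchangeable
E(\log Y) \le \log E(Y)
```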

From the two models we have:

[image: original data with the exponential and log-linear fits]

Notice how the log-linear model overestimates at the higher values, while the exponential model overestimates at the lower values.

Code for the image:

import numpy as np
import matplotlib.pyplot as plt
from scipy.optimize import least_squares

# log-linear fit: residuals on the log scale
log_lin = least_squares(lambda b: np.log(puf['val']) - (np.log(b[0]) + b[1] * puf['id']),
                        [1, 1])['x']
# exponential fit: residuals on the original scale
expo = least_squares(lambda b: puf['val'] - b[0] * np.exp(b[1] * puf['id']), [0, 1])['x']

exp_fun = lambda x: expo[0] * np.exp(expo[1] * x)
log_lin_fun = lambda x: log_lin[0] * np.exp(log_lin[1] * x)

plt.plot(puf.id, puf.val, label='original')
plt.plot(puf.id, exp_fun(puf.id), label='exponential')
plt.plot(puf.id, log_lin_fun(puf.id), label='log-linear')
plt.legend()
plt.show()

You're getting that error because np.polyfit(puf['id'], puf['log_val'], 1) returns an array of two coefficients, array([0.55110679, 6.39614819]), which does not match the length of your DataFrame (8 rows).

This is what you want:

# y = a * exp(b*x)  ->  ln(y) = ln(a) + b*x
f = np.polyfit(puf['id'], np.log(puf['val']), 1)

where

a = np.exp(f[1])  # 599.5313046712091
b = f[0]          # 0.5511067934637022

giving

puf['fit'] = a * np.exp(b * puf['id'])

   id    val           fit
0   1    850   1040.290193
1   2   1889   1805.082864
2   3   3289   3132.130026
3   4   6083   5434.785677
4   5  10349   9430.290286
5   6  17860  16363.179739
6   7  28180  28392.938399
7   8  41236  49266.644002
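The same column can also be built in one step by evaluating the fitted line with np.polyval and exponentiating. This is a sketch that should be equivalent to the a and b route above; the column name fit is taken from the question.

```python
import numpy as np
import pandas as pd

puf = pd.DataFrame({'id': [1, 2, 3, 4, 5, 6, 7, 8],
                    'val': [850, 1889, 3289, 6083, 10349, 17860, 28180, 41236]})

# Degree-1 fit of ln(val) against id: f[0] is the slope b, f[1] is ln(a)
f = np.polyfit(puf['id'], np.log(puf['val']), 1)

# polyval evaluates ln(a) + b*id; exponentiating maps back to the original scale
puf['fit'] = np.exp(np.polyval(f, puf['id'].to_numpy()))
print(puf)
```

Note that polyfit returns coefficients from the highest degree down, which is why the slope comes first and the intercept second.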
