How to fit a log-normal distribution with SciPy?

Question

I want to fit the log-normal parameters mu and sigma to an existing (measured) log-normal distribution.

The measured log-normal distribution is defined by the following x and y arrays:

x:
4.870000000000000760e-09
5.620000000000000859e-09
6.490000000000000543e-09
7.500000000000000984e-09
8.660000000000001114e-09
1.000000000000000021e-08
1.155000000000000085e-08
1.334000000000000067e-08
1.540000000000000224e-08
1.778000000000000105e-08
2.054000000000000062e-08
2.371000000000000188e-08
2.738000000000000099e-08
3.162000000000000124e-08
3.652000000000000541e-08
4.217000000000000637e-08
4.870000000000000595e-08
5.623000000000000125e-08
6.493999999999999784e-08
7.498999999999999850e-08
8.659999999999999460e-08
1.000000000000000087e-07
1.154800000000000123e-07
1.333500000000000129e-07
1.539900000000000177e-07
1.778300000000000247e-07
2.053499999999999958e-07
2.371399999999999913e-07
2.738399999999999692e-07
3.162300000000000199e-07
3.651700000000000333e-07
4.217000000000000240e-07
4.869700000000000784e-07
8.659600000000001124e-07
1.000000000000000167e-06


y:
1.883186407957446899e+11
3.609524622222222290e+11
7.508596384507042236e+11
2.226776878843930664e+12
4.845941940346821289e+12
7.979258430057803711e+12
1.101088735028901758e+13
1.346205871213872852e+13
1.509035024739884375e+13
1.599175638381502930e+13
1.668097844161849805e+13
1.786208191445086719e+13
2.007139089017341016e+13
2.346096336416185156e+13
2.763042850867051953e+13
3.177726578034682031e+13
3.552045143352600781e+13
3.858765218497110156e+13
4.051697248554913281e+13
4.132681209248554688e+13
4.112713068208092188e+13
4.003871248554913281e+13
3.797625966473988281e+13
3.472541513294797656e+13
3.017757826589595312e+13
2.454670317919075000e+13
1.840085110982658984e+13
1.250047161156069336e+13
7.540309609248554688e+12
3.912091102658959473e+12
1.632974141040462402e+12
4.585002890867052002e+11
1.260128910303030243e+11
7.276263267445255280e+09
1.120399584203921509e+10

Plotted, this looks like this (image not included):

When I now use scipy.stats.lognorm.fit like this:

shape, loc, scale = stats.lognorm.fit(y, floc=0)
mu = np.log(scale)
sigma = shape

y_fit = 1 / x * 1 / (sigma * np.sqrt(2*np.pi)) * np.exp(-(np.log(x)-mu)**2/(2*sigma**2))

The resulting y_fit looks like this:

2.774453764650559735e-92
9.215468156399056736e-92
3.066511893903929907e-91
1.022335884325557513e-90
3.371353425505715432e-90
1.107869289600567113e-89
3.632923945686527959e-89
1.186352074527947499e-88
3.843439346384186221e-88
1.241282395050092616e-87
4.012158206798217088e-87
1.283531486148302474e-86
4.102813367932395623e-86
1.306865297124819703e-85
4.149188517768147925e-85
1.309743071360157226e-84
4.121819150664498056e-84
1.289935574540856462e-83
4.028475776631639341e-83
1.251854680594688466e-82
3.876254948575364474e-82
1.194751160823721531e-81
3.669411018320463915e-81
1.122061051084741563e-80
3.418224619543735425e-80
1.037398725542414359e-79
3.134554301786779178e-79
9.436770981828214504e-79
2.828745744939237710e-78
8.447588129217592353e-78
2.512030904806250195e-77
7.442222461482558402e-77
2.195666296758331429e-76
1.598228276801569301e-74
4.622033883255558750e-74

This is obviously very far away from the original y values. I do realize that I haven't used the initial x values at all. So I assume I need to shift (and maybe also scale) the resulting distribution somehow.

However, I can't wrap my head around how I need to do this. How do I correctly fit a log-normal distribution in Python?

Answer 1

It works out of the box with curve_fit if you scale the data. I am not sure if scaling and re-scaling makes sense, though. (This seems to confirm the ansatz.)
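As background for choosing curve_fit here: scipy.stats.lognorm.fit estimates parameters from raw samples drawn from a distribution, whereas the question provides a tabulated curve (x, y). The sketch below contrasts the two cases on synthetic data. The true parameters, the grid, the starting values and the helper name scaled_lognorm_pdf are assumptions chosen for illustration, not values from the question.

```python
import numpy as np
from scipy import stats
from scipy.optimize import curve_fit

rng = np.random.default_rng(0)
mu_true, sigma_true = 1.0, 0.5  # illustrative values, not from the question

# Case 1: raw samples -> maximum-likelihood estimate with lognorm.fit
samples = rng.lognormal(mean=mu_true, sigma=sigma_true, size=20000)
shape, loc, scale = stats.lognorm.fit(samples, floc=0)
mu_mle, sigma_mle = np.log(scale), shape

# Case 2: tabulated curve (x, y) -> least-squares fit with curve_fit
def scaled_lognorm_pdf(x, a, mu, sigma):
    # hypothetical helper: log-normal density times an amplitude a
    return a * stats.lognorm.pdf(x, s=sigma, scale=np.exp(mu))

x_grid = np.linspace(0.2, 12.0, 60)
y_grid = scaled_lognorm_pdf(x_grid, 3.0, mu_true, sigma_true)

# bounds keep the amplitude non-negative and sigma positive during the search
popt, pcov = curve_fit(
    scaled_lognorm_pdf, x_grid, y_grid,
    p0=[2.0, 0.8, 0.6],
    bounds=([0.0, -np.inf, 1e-6], [np.inf, np.inf, np.inf]),
)

print("from samples:", mu_mle, sigma_mle)
print("from curve:  ", popt)
```

Passing the y values to lognorm.fit, as in the question, treats them as if they were samples, which is why the x values never enter the result.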

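import matplotlib.pyplot as plt
import numpy as np
from scipy.optimize import curve_fit

# x and y are assumed to be NumPy arrays holding the data from the question

def log_fit( x, a, mu, sigma ):
    # log-normal density multiplied by an amplitude a
    return a / x * 1. / (sigma * np.sqrt( 2. * np.pi ) ) * np.exp( -( np.log( x ) - mu )**2 / ( 2. * sigma**2 ) )

# position and height of the maximum, used to normalize both axes
pp = np.argmax( y )

yM = y[ pp ]
xM = x[ pp ]

xR = x/xM
yR = y/yM
print( xM, yM )
# fit the normalized data, where all parameters are of order one
sol, err = curve_fit( log_fit, xR, yR )
print( sol )
# map the fitted parameters back to the original units
scaledSol = [ yM * sol[0] * xM , sol[1] + np.log(xM), sol[2] ]
print( scaledSol )
yF = np.fromiter( ( log_fit( xx, *sol ) for xx in xR ), float )
yFIR = np.fromiter( (  log_fit( xx, *scaledSol ) for xx in x ), float )

# top panel: original data and rescaled fit; bottom panel: normalized data and fit
fig = plt.figure()
ax = fig.add_subplot( 2,1, 1)
bx = fig.add_subplot( 2,1, 2)
ax.plot( x, y )
ax.plot( x, yFIR )
bx.plot( xR, yR )
bx.plot( xR, yF )
plt.show()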

Providing

>> 7.499e-08 41326812092485.55
>> [2.93003525 0.68436895 0.87481153]
>> [9080465.32138486, -15.72154211628693, 0.8748115349982701]

and a plot comparing the data with the fitted curve (image not included).

Anyhow, it does not really look like that is the right fit function.
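On the doubt about scaling and re-scaling: for this model the back-transformation is exact. Substituting x = xM * xR gives log(xR) = log(x) - log(xM), so mu shifts by log(xM), and the 1/x prefactor together with the y normalization turns the amplitude into yM * a * xM. The sketch below checks this identity numerically using the values printed above; the evaluation grid x_grid is an arbitrary assumption.

```python
import numpy as np

def log_fit(x, a, mu, sigma):
    # same model as in the answer: amplitude times log-normal density
    return a / x / (sigma * np.sqrt(2.0 * np.pi)) * np.exp(
        -(np.log(x) - mu) ** 2 / (2.0 * sigma ** 2)
    )

# values printed by the answer's code
xM, yM = 7.499e-08, 41326812092485.55
sol = np.array([2.93003525, 0.68436895, 0.87481153])

# parameters mapped back to the original units
scaled_sol = [yM * sol[0] * xM, sol[1] + np.log(xM), sol[2]]

# arbitrary grid roughly covering the measured x range (an assumption)
x_grid = np.logspace(-8.3, -6.0, 50)

direct = log_fit(x_grid, *scaled_sol)           # model in original units
via_norm = yM * log_fit(x_grid / xM, *sol)      # normalized model, rescaled

print(scaled_sol)
print(np.allclose(direct, via_norm))
```

So the rescaling itself does not distort the fit; any remaining mismatch comes from the choice of model function.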

Answer 2

My equation search turned up a shifted log-normal type equation, "y = a * exp(-0.5 * ((log(x-d)-b)/c)**2)", that gives a good fit with parameters

a =  4.2503194887395930E+13
b = -1.6090252935097830E+01
c =  6.0250205607650253E-01
d = -2.2907054835882373E-08

No scaling needed.
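To try these parameters against the data, the equation can be evaluated directly. This is a minimal sketch that assumes the shift enters as log(x - d), which is the only reading that keeps the logarithm's argument positive over the whole measured x range for the negative d given above. The function name shifted_lognormal and the choice of three check points are illustrative assumptions; the x and y values are copied from the question.

```python
import numpy as np

def shifted_lognormal(x, a, b, c, d):
    # assumed form: y = a * exp(-0.5 * ((log(x - d) - b) / c)**2)
    return a * np.exp(-0.5 * ((np.log(x - d) - b) / c) ** 2)

# parameters quoted in the answer
params = (
    4.2503194887395930e13,   # a
    -1.6090252935097830e01,  # b
    6.0250205607650253e-01,  # c
    -2.2907054835882373e-08, # d
)

# three (x, y) pairs taken from the question's arrays, including the maximum
x_check = np.array([1.0e-08, 7.499e-08, 3.1623e-07])
y_measured = np.array([7.979258430057804e12, 4.132681209248555e13, 3.912091102658959e12])

y_model = shifted_lognormal(x_check, *params)
for xv, ym, yf in zip(x_check, y_measured, y_model):
    print(f"x={xv:.4e}  measured={ym:.4e}  model={yf:.4e}")
```

Comparing the full arrays in the same way, or plotting both curves, shows where a single-peak model follows the data and where it deviates.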

2 answers

Answer 1
1 accepted 2019-04-02 15:50:01

Answer 2
1 2019-04-02 16:34:14
