Fitting a Pareto distribution with (Python) SciPy

Question

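I have a data set that I know follows a Pareto distribution. Can someone point me to how to fit this data set in SciPy? I ran the code below, but I don't know what is being returned to me (a, b, c). Also, after obtaining a, b, c, how do I calculate the variance using them?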

import scipy.stats as ss 
import scipy as sp

a,b,c=ss.pareto.fit(data)

Answer 1

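Be very careful when fitting power laws!! Many reported power laws are actually fitted poorly by a power law. See Clauset et al. for all the details (also available on arXiv if you don't have access to the journal). They have a companion website to the article, which now links to a Python implementation. I don't know whether it uses SciPy, because I used their R implementation when I last worked with it.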
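As a rough illustration of that warning, here is a minimal sketch that compares a fitted Pareto against a fitted lognormal on the same data. This is not the procedure from Clauset et al., only a crude sanity check; the synthetic lognormal sample, the choice of lognormal as the alternative, and fixing `loc` at 0 are all assumptions made for the example. Note also that a KS p-value computed with parameters estimated from the same data is optimistic.

```python
import numpy as np
from scipy import stats

# Synthetic lognormal data, standing in for data "believed" to be Pareto.
rng = np.random.default_rng(0)
data = rng.lognormal(mean=1.0, sigma=1.0, size=2000)

# Fit both candidates by maximum likelihood, with loc fixed at 0 (an assumption).
pareto_params = stats.pareto.fit(data, floc=0)
lognorm_params = stats.lognorm.fit(data, floc=0)

# Total log-likelihood of the data under each fitted model.
ll_pareto = np.sum(stats.pareto.logpdf(data, *pareto_params))
ll_lognorm = np.sum(stats.lognorm.logpdf(data, *lognorm_params))

# Kolmogorov-Smirnov test of the data against the fitted Pareto.
ks_pareto = stats.kstest(data, 'pareto', args=pareto_params)

print('log-likelihood pareto :', ll_pareto)
print('log-likelihood lognorm:', ll_lognorm)
print('KS test against pareto:', ks_pareto)
```

If the alternative has a clearly higher log-likelihood, or the KS test rejects the fitted Pareto, the power-law assumption deserves a closer look before any variance is computed from it.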

Answer 2

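Here is a quickly written version, taking some hints from the reference page that Rupert gave. This is currently work in progress in scipy and statsmodels, and it requires MLE with some fixed or frozen parameters, which is only available in the trunk versions. Standard errors for the parameter estimates and other result statistics are not available yet.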

'''estimating pareto with 3 parameters (shape, loc, scale) with nested
minimization, MLE inside minimizing Kolmogorov-Smirnov statistic

running some examples looks good
Author: josef-pktd
'''

import numpy as np
from scipy import stats, optimize
# The original version used a patched fit method (fit_fr) from
# scikits.statsmodels.sandbox.stats.distributions_patch. The fit method in
# current scipy can hold parameters fixed (floc=...), so that is used here.

true = (0.5, 10, 1.)   # try different values
shape, loc, scale = true
rvs = stats.pareto.rvs(shape, loc=loc, scale=scale, size=1000)

rvsmin = rvs.min()  # for starting value to fmin


def pareto_ks(loc, rvs):
    # fmin passes loc as a length-1 array
    loc = float(np.ravel(loc)[0])
    if loc >= rvs.min():
        return np.inf  # loc must lie below every observation
    # MLE of shape and scale with loc held fixed
    est = stats.pareto.fit(rvs, 1., floc=loc)
    return stats.kstest(rvs, 'pareto', est)[0]

# outer minimization: pick the loc that minimizes the KS statistic
locest = optimize.fmin(pareto_ks, rvsmin*0.7, (rvs,))
est = stats.pareto.fit(rvs, 1., floc=locest[0])
args = (est[0], locest[0], est[2])
print('estimate')
print(args)
print('kstest')
print(stats.kstest(rvs, 'pareto', args))
print('estimation error', np.array(args) - np.array(true))
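On the question's `a, b, c`: `ss.pareto.fit(data)` returns the shape parameter first, followed by `loc` and `scale`, the same (shape, loc, scale) ordering used above. The sketch below shows one way to get the variance from those three values. The synthetic data and the choice to fix `loc` at 0 are assumptions for the example, and the variance is only finite when the shape is greater than 2.

```python
import numpy as np
from scipy import stats

# Synthetic data with known parameters (illustrative values, not from the question).
true_shape, true_loc, true_scale = 3.0, 0.0, 2.0
data = stats.pareto.rvs(true_shape, loc=true_loc, scale=true_scale,
                        size=5000, random_state=0)

# fit returns (shape, loc, scale). Fixing loc at 0 (floc=0) is an assumption
# that gives the classical two-parameter Pareto; drop it to estimate all three.
shape, loc, scale = stats.pareto.fit(data, floc=0)

# Variance of the fitted distribution, as computed by scipy.
var_scipy = stats.pareto.var(shape, loc=loc, scale=scale)

# Closed form for the same quantity (valid for shape > 2); loc does not enter.
var_formula = scale**2 * shape / ((shape - 1)**2 * (shape - 2))

print('shape, loc, scale:', shape, loc, scale)
print('variance (scipy)  :', var_scipy)
print('variance (formula):', var_formula)
```

For a fitted shape of 2 or less, such as the 0.5 used in the simulation above, the variance of the distribution is infinite, so a sample variance computed from the data is not a meaningful estimate.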

Answer 3

Before passing the data to the build() function in OpenTURNS, make sure to convert it this way:

data = [[i] for i in data]

Otherwise the Sample() function may raise an error.

FYI @Tropilio

Answer 4

Assuming your data is formatted as follows:

import openturns as ot
data = [
    [2.7018013],
    [8.53280352],
    [1.15643882],
    [1.03359467],
    [1.53152735],
    [32.70434285],
    [12.60709624],
    [2.012235],
    [1.06747063],
    [1.41394096],
]
sample = ot.Sample(data)  # data is already a list of one-element rows

You can easily fit a Pareto distribution using the ParetoFactory class of the OpenTURNS library:

distribution = ot.ParetoFactory().build(sample)

You can of course print it:

print(distribution)
>>> Pareto(beta = 0.00317985, alpha=0.147365, gamma=1.0283)

Or plot its PDF:

from openturns.viewer import View

pdf_graph = distribution.drawPDF()
pdf_graph.setTitle(str(distribution))
View(pdf_graph, add_legend=False)

More details on ParetoFactory are available in the documentation.

4 answers

Answer 1
6 2010-07-14 10:37:02

Answer 2
5 2010-07-14 23:52:31

Answer 3
1 2021-05-17 17:33:09

Answer 4
0 2020-11-07 00:19:08
