Pareto distribution: R vs. Python - different results

Question

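I am trying to replicate the results of R's fitdist() in Python using scipy.stats (the R code is the reference and I cannot modify it). The results are completely different. Does anyone know why? How can I replicate R's results in Python?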

data = [2457.145, 1399.034, 20000.0, 476743.9, 24059.6, 28862.8]

R code:

library(fitdistrplus)
library(actuar)

fitdist(data, 'pareto', "mle")$estimate

R result:

       shape        scale 
    0.760164 10066.274196

Python code:

st.pareto.fit(data, floc=0, scale=1)

Python result:

(0.4019785013487883, 0, 1399.0339889072732)

Answer 1

The difference arises mainly because the two functions fit distributions with different pdfs.

Python

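In Python, st.pareto.fit() uses the Pareto distribution defined by this pdf (loc and scale then shift and rescale x):

f(x; b) = b / x^(b + 1),  for x >= 1, b > 0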

import scipy.stats as st
data = [2457.145, 1399.034, 20000.0, 476743.9, 24059.6, 28862.8]
print(st.pareto.fit(data, floc = 0, scale = 1))

# (0.4019785013487883, 0, 1399.0339889072732)
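
With loc fixed at 0, this is the Pareto type I distribution, whose maximum likelihood estimate has a well-known closed form: the scale is the sample minimum, and the shape is n divided by the sum of log(x_i / min). The sketch below computes it directly as a sanity check on the numerical fit. The helper name pareto1_mle is my own and is not part of scipy.

```python
import numpy as np

data = np.array([2457.145, 1399.034, 20000.0, 476743.9, 24059.6, 28862.8])

def pareto1_mle(x):
    """Closed-form MLE for Pareto type I with the location fixed at 0."""
    x = np.asarray(x, dtype=float)
    scale = x.min()                            # MLE of the lower bound is the sample minimum
    shape = x.size / np.log(x / scale).sum()   # MLE of the shape given that bound
    return shape, scale

shape, scale = pareto1_mle(data)
print(shape, scale)
```

The values should land close to the fitted ones above (shape about 0.402, scale equal to the sample minimum 1399.034). The numerical optimizer only approaches the minimum from below, which explains the slightly smaller scale it reports.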

R

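Your R code, on the other hand, uses the Pareto distribution with this pdf:

f(x) = shape * scale^shape / (x + scale)^(shape + 1),  for x > 0, shape > 0, scale > 0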

library(fitdistrplus)
library(actuar)
data <- c(2457.145, 1399.034, 20000.0, 476743.9, 24059.6, 28862.8)
fitdist(data, 'pareto', "mle")$estimate

#    shape        scale 
#    0.760164 10066.274196

Making R mirror Python

To make R use the same distribution as st.pareto.fit(), use actuar::dpareto1() by passing 'pareto1' to fitdist():

library(fitdistrplus)
library(actuar)
data <- c(2457.145, 1399.034, 20000.0, 476743.9, 24059.6, 28862.8)
fitdist(data, 'pareto1', "mle")$estimate

#     shape          min 
#   0.4028921 1399.0284977

Making Python mirror R

Here is one way to approximate the R code in Python:

import numpy as np
from scipy.optimize import minimize

data = [2457.145, 1399.034, 20000.0, 476743.9, 24059.6, 28862.8]

def dpareto(x, shape, scale):
    # Same density as actuar::dpareto (Pareto type II)
    return shape * scale**shape / (x + scale)**(shape + 1)

def negloglik(params):
    # params[0] is the shape, params[1] is the scale
    return -np.sum([np.log(dpareto(i, params[0], params[1])) for i in data])

# Minimize the negative log-likelihood, starting from shape = 1, scale = 1
res = minimize(negloglik, (1, 1), method='Nelder-Mead', tol=2.220446e-16)
print(res.x)

# [7.60082820e-01 1.00691719e+04]
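
As a side note, the density that actuar::dpareto uses appears to coincide with scipy.stats.lomax once loc is fixed at 0. The sketch below only checks that the two densities agree at R's estimates from the question. It does not run a fit, and I have not verified that st.lomax.fit(data, floc=0) converges to the same estimates as R.

```python
import numpy as np
from scipy import stats as st

def dpareto(x, shape, scale):
    # Density used by actuar::dpareto, as written in the answer above
    return shape * scale**shape / (x + scale)**(shape + 1)

x = np.array([2457.145, 1399.034, 20000.0, 476743.9, 24059.6, 28862.8])
shape, scale = 0.760164, 10066.274196  # R's estimates from the question

# scipy's Lomax density with loc=0 and the same shape and scale
lomax_pdf = st.lomax.pdf(x, shape, loc=0, scale=scale)

# Compare with a tight relative tolerance; atol=0 because the densities are tiny
same = np.allclose(dpareto(x, shape, scale), lomax_pdf, rtol=1e-10, atol=0)
print(same)
```

If the densities match, st.lomax with floc=0 targets the same likelihood as the hand-written negloglik, so it would be a natural candidate for reproducing R's fit with a built-in distribution.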

Answer 1 (2 votes, accepted, 2021-01-18 23:25:50)

Python

R

制作 R 镜像 Python

制作 Python 镜像 R

帕累托分布：R 与 Python - 不同的结果

问题描述

1 个解决方案

解决方案1 2 已采纳 2021-01-18 23:25:50

Python

R

制作 R 镜像 Python

制作 Python 镜像 R

解决方案1
2 已采纳 2021-01-18 23:25:50