Fitting a distribution in R and Matlab gives very different results

Question

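I am fitting a data set of wave heights (m) to a generalized Pareto distribution, with a threshold chosen at 0.15 m. I am transitioning from Matlab to R, so I wanted to compare the results of the two programs. I know they will not give exactly the same answer, but I expected the results to be reasonably close.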

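Of course, I used the same data and the same threshold for both fits. Both methods estimate the parameters by MLE. In Matlab, the fitdist function does not estimate the threshold; it is assumed to be known (0.15 m in this case) and has to be subtracted from the data before the call (which I have done). In R, the threshold is part of the input.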

The functions I used are:

Matlab: fitdist(data_above_threshold,'Generalized Pareto')
R: fevd(data,threshold =0.15,type = 'GP',method = "MLE") (extRemes package)
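For readers comparing the two calls, the input that Matlab expects can be reproduced in R as follows. This is a minimal sketch: the vector below is only the first six posted values, and it assumes that "above the threshold" means strictly greater than 0.15.

```r
# Illustrative subset of the wave heights (m); replace with the full vector.
x <- c(0.194101337780203, 0.289791483274648, 0.313940773547535,
       0.184577674010614, 0.102266008573448, 0.045464156804826)
threshold <- 0.15

# Excesses over the threshold: keep the values above it, then subtract it.
# This is the quantity a GPD with location 0 is fitted to.
excesses <- x[x > threshold] - threshold
print(excesses)
```

If the threshold is not subtracted, the fitted scale and the implied upper endpoint refer to the raw heights rather than to the excesses, so the two programs would not be estimating the same thing.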

In Matlab I get Shape: -1.0301, Scale: 0.7534, whereas in R I get Shape: -0.5848505, Scale: 0.3620191.
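One quick sanity check on two such parameter sets uses the fact that, for a negative shape, the GPD for the excesses has a finite upper endpoint at -scale/shape. The sketch below computes that endpoint for both reported fits and compares it with the largest excess, taking 0.731394702114536 as the maximum of the posted data set. The names fits, upper_endpoint, and covers_data are illustrative only.

```r
# Reported estimates from the question (Matlab's fitdist vs. R's fevd).
fits <- data.frame(
  source = c("Matlab", "R"),
  shape  = c(-1.0301, -0.5848505),
  scale  = c(0.7534, 0.3620191)
)

# For shape < 0 the GPD for the excesses is supported on [0, -scale/shape].
fits$upper_endpoint <- -fits$scale / fits$shape

# Largest excess in the posted data: max(x) - threshold.
max_excess <- 0.731394702114536 - 0.15

# A valid fit needs the endpoint to be at least as large as the largest excess.
fits$covers_data <- fits$upper_endpoint >= max_excess
print(fits)
```

With these numbers the Matlab endpoint comes out near 0.731, which is close to the largest raw observation rather than the largest excess (about 0.581). That may be a coincidence, but it could be worth checking how data_above_threshold was built before comparing optimizers.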

I attach the data here so that you can check the results yourself:

c(0.194101337780203, 0.289791483274648, 0.313940773547535, 0.184577674010614, 
0.102266008573448, 0.045464156804826, 0.0486387113946889, 0.143761972140945, 
0.267342847246331, 0.242966803074167, 0.069386693178437, 0.0099771715681416, 
0.00736950172646855, 0.056121590070795, 0.0759625562574393, 0.145009118586962, 
0.258045937376017, 0.379926158236833, 0.236844447793717, 0.0861664817248564, 
0.0503393656392586, 0.156233436601121, 0.118138781522764, 4.44089209850063e-16, 
4.44089209850063e-16, 4.44089209850063e-16, 4.44089209850063e-16, 
0.0442170103588082, 4.44089209850063e-16, 4.44089209850063e-16, 
4.44089209850063e-16, 4.44089209850063e-16, 0.0143988726040223, 
0.148183673176825, 0.197729400168618, 0.0143988726040223, 4.44089209850063e-16, 
4.44089209850063e-16, 4.44089209850063e-16, 0.026416829265647, 
0.133558046673528, 0.209180472082052, 0.236050809146251, 0.0976175536382913, 
0.0292512530065965, 0.101812500774896, 0.174260371593558, 0.219271020599832, 
0.463144839271102, 0.579809720448572, 0.533211794147367, 0.354756475417204, 
0.360085192050189, 0.632756755929504, 0.651577329569406, 0.413372358380034, 
0.39353139219339, 0.343985665201597, 0.0630375839987112, 4.44089209850063e-16, 
4.44089209850063e-16, 4.44089209850063e-16, 4.44089209850063e-16, 
4.44089209850063e-16, 0.0486387113946889, 0.0434233717113424, 
4.44089209850063e-16, 4.44089209850063e-16, 4.44089209850063e-16, 
0.0566884748189849, 0.147730165378273, 0.175734271938852, 0.0827651732357175, 
4.44089209850063e-16, 4.44089209850063e-16, 4.44089209850063e-16, 
4.44089209850063e-16, 4.44089209850063e-16, 4.44089209850063e-16, 
0.0385481628769098, 0.0378679011790819, 0.0463711724019298, 0.0200677200859207, 
4.44089209850063e-16, 4.44089209850063e-16, 4.44089209850063e-16, 
4.44089209850063e-16, 4.44089209850063e-16, 0.0956901454944461, 
0.128909591738371, 0.141154302299272, 0.0459176646033779, 4.44089209850063e-16, 
0.0285709913087686, 0.251696828196291, 0.642847304447283, 0.731394702114536, 
0.603732256822183, 0.427997984883332, 0.635251048821539)

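So my question is: if both use MLE to estimate the parameters, why are the results so different? And which fit is more reliable, or which one would you recommend?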

Answer 1

I think the problem is that your data suggest some multimodality, and that Matlab is using suboptimal starting values.

Empirical density plot of the data:

# x is the data vector posted in the question
plot(density(x, bw = "SJ"))

First, let us plot the R fit:

library(SpatialExtremes)
library(extRemes)
plot(fevd(x,threshold =0.15,type = 'GP',method = "MLE"))

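The lower-left panel shows that we have achieved a reasonable fit. (Note that the kernel density estimate in this plot uses a different bandwidth from the one above.)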

Now we use your Matlab results as starting values:

fevd(x,threshold =0.15,type = 'GP',method = "MLE", 
     initial = list(shape = -1.0301, scale = 0.7534))
#fevd(x = x, threshold = 0.15, type = "GP", method = "MLE", 
#    initial = list(shape = -1.0301, scale = 0.7534))
#
#[1] "Estimation Method used: MLE"
#
#
# Negative Log-Likelihood Value:  -18.74149 
#
#
# Estimated parameters:
#     scale      shape 
# 0.6242056 -1.0736348 
#
# Standard Error Estimates:
#      scale       shape 
#0.000000020 0.001368606 
#
# Estimated parameter covariance matrix.
#              scale         shape
#scale  4.000000e-16 -2.598691e-17
#shape -2.598691e-17  1.873082e-06
#
# AIC = -33.48297 
#
# BIC = -30.5515 
#Warning messages:
#1: In log(z) : NaNs produced
#2: In log(z) : NaNs produced
plot(fevd(x,threshold =0.15,type = 'GP',method = "MLE", 
          initial = list(shape = -1.0301, scale = 0.7534)))
#Warning messages:
#1: In log(z) : NaNs produced
#2: In log(z) : NaNs produced

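We can clearly see that the fit is worse; we have a false convergence or local minimum problem. The warnings further reduce our confidence in this fit. I suggest you try Matlab with different starting values.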
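The idea of trying several starting values can be sketched in base R without any package. This is a rough illustration, not how fevd or fitdist work internally: gpd_nll, starts, and the simulated excesses y are all hypothetical. The sketch also excludes shape <= -1, an assumption made here because the GPD likelihood is unbounded for shape < -1, so an unconstrained optimizer can drift toward a degenerate solution whose endpoint sits on the largest observation.

```r
# Negative log-likelihood of the GPD for excesses y (threshold already
# subtracted). par = c(scale, shape). Returns Inf for invalid parameters
# or when any observation falls outside the support.
gpd_nll <- function(par, y) {
  scale <- par[1]
  shape <- par[2]
  if (scale <= 0 || shape <= -1) return(Inf)  # shape <= -1 excluded (assumption)
  if (abs(shape) < 1e-8) {
    # Exponential limit as shape -> 0.
    return(length(y) * log(scale) + sum(y) / scale)
  }
  z <- 1 + shape * y / scale
  if (any(z <= 0)) return(Inf)
  length(y) * log(scale) + (1 / shape + 1) * sum(log(z))
}

# Simulated excesses (hypothetical data): GPD with scale 0.4 and shape -0.5,
# drawn by inverting the CDF, y = scale / shape * (u^(-shape) - 1).
set.seed(1)
u <- runif(200)
y <- 0.4 / -0.5 * (u^0.5 - 1)

# Several feasible starting points; each must give a finite gpd_nll.
starts <- list(c(0.5, 0.1), c(1.0, -0.2), c(2.0, -0.9), c(sd(y), 0))

# Run the optimizer from every start and compare the minima reached.
fits <- lapply(starts, function(s) optim(s, gpd_nll, y = y, method = "Nelder-Mead"))
nll  <- sapply(fits, function(f) f$value)
best <- fits[[which.min(nll)]]

print(round(nll, 4))  # similar values suggest the starts agree
print(best$par)       # c(scale, shape) at the lowest minimum found
```

If different starts end at clearly different minima, the fit is sensitive to initialization, and the solutions should be compared by their likelihood and diagnostic plots rather than taken at face value.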
