简体   繁体   中英

Python - Fitting a GEV distribution from these values

I'm very new with Python and I've looked around on the internet, but couldn't find anything logic that could help me with my problem.

I have values of precipitation in a graph, and now I need to fit a GEV distribution from these values in a graph. Each value equals the maximum value of a year, starting from 1974 to 2017 (so there are 43 values total).

These are the values:

max_precip = [9.4, 38.0, 12.5, 35.3, 17.6, 12.9, 12.4, 19.6, 15.0, 13.2, 12.3, 16.9, 16.9, 29.4, 13.6, 11.1, 8.0, 16.6, 12.0, 13.1, 9.1, 9.7, 21.0, 11.2, 14.4, 18.8, 14.0, 19.9, 12.4, 10.8, 21.6, 15.4, 17.4, 14.8, 22.7, 11.5, 10.5, 11.8, 12.4, 16.6, 11.7, 12.9, 17.8]

I found that I need to use gev.fit, so I thought using the following:

t = np.linspace(1,43,43)
fit = gev.fit(max_precip,loc=3)
pdf = gev.pdf(t, *fit)
plt.plot(t,pdf)
plt.plot(t, max_precip, "o")

But this only prints the points of max_precip in a graph and not the GEV distribution.

Can someone help me? Sorry if this question is already asked, I couldnt find anything like it.

I used these imports:

import csv
import matplotlib.pyplot as plt
import numpy as np

from dateutil.rrule import rrule, YEARLY
import datetime
from matplotlib.dates import DateFormatter
from scipy.stats import genextreme as gev
from scipy.stats import genpareto as gpd
from scipy.optimize import minimize

I've tried to fit your data

import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import genextreme as gev

def main(rvs):
    shape, loc, scale = gev.fit(rvs)
    return shape, loc, scale

if __name__ == '__main__':
    rvs = [9.4, 38.0, 12.5, 35.3, 17.6, 12.9, 12.4, 19.6, 15.0, 13.2, 12.3, 16.9, 16.9, 29.4, 13.6, 11.1, 8.0, 16.6, 12.0, 13.1, 9.1, 9.7, 21.0, 11.2, 14.4, 18.8, 14.0, 19.9, 12.4, 10.8, 21.6, 15.4, 17.4, 14.8, 22.7, 11.5, 10.5, 11.8, 12.4, 16.6, 11.7, 12.9, 17.8]

    shape, loc, scale = main(rvs)

    print(shape)
    print(loc)
    print(scale)

    l = loc + scale / shape

    xx = np.linspace(l+0.00001, l+0.00001+35, num=71)
    yy = gev.pdf(xx, shape, loc, scale)

    hist, bins = np.histogram(rvs, bins=12, range=(-0.5, 23.5), density=True)
    plt.bar(bins[:-1], hist, width = 2, align='edge')

    plt.plot(xx, yy, 'ro')
    plt.show()

but what I've got back are

-0.21989526255575445
12.749780017954315
3.449061347316184

for shape , loc and scale . If you look at GEV distribution as defined in scipy , when shape is negative, the valid interval is [loc + scale/shape...+infinity]. I've computed latter value and it is equal to

-2.935417290135696

should work...

Python3, Anaconda, scipy 1.1, Windows 10 64bit

UPDATE

Ok, I've updated the code and added plotting, looks somewhat reasonable. Is it what are you looking for? Basically, trick is to histogram it and plot density bins overlapping with PDF

在此处输入图片说明

Out of curiosity, I tried the GeneralizedExtremeValue Factory (GEV) available in OpenTURNS

import openturns as ot
sample = ot.Sample([[p] for p in max_precip])

gev = ot.GeneralizedExtremeValueFactory().buildAsGeneralizedExtremeValue(sample)
print (gev)
>>> GeneralizedExtremeValue(mu=12.7497, sigma=3.44903, xi=0.219894) 

I can confirm it gives the same result.

在此处输入图片说明

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM