How to fit a normal cumulative distribution function to data

I have generated some data that is effectively a cumulative distribution. The code below gives an example of X and Y from my data:

X<- c(0.09787761, 0.10745590, 0.11815422, 0.15503521, 0.16887488, 0.18361325, 0.22166727,
0.23526786, 0.24198808, 0.25432602, 0.26387961, 0.27364063, 0.34864672, 0.37734113,
0.39230736, 0.40699061, 0.41063824, 0.42497043, 0.44176913, 0.46076456, 0.47229330,
0.53134509, 0.56903577, 0.58308938, 0.58417653, 0.60061901, 0.60483849, 0.61847521,
0.62735245, 0.64337353, 0.65783302, 0.67232004, 0.68884473, 0.78846000, 0.82793293,
0.82963446, 0.84392010, 0.87090024, 0.88384044, 0.89543314, 0.93899033, 0.94781219,
1.12390279, 1.18756693, 1.25057774)

Y<- c(0.0090, 0.0210, 0.0300, 0.0420, 0.0580, 0.0700, 0.0925, 0.1015, 0.1315, 0.1435,
0.1660, 0.1750, 0.2050, 0.2450, 0.2630, 0.2930, 0.3110, 0.3350, 0.3590, 0.3770, 0.3950,
0.4175, 0.4475, 0.4715, 0.4955, 0.5180, 0.5405, 0.5725, 0.6045, 0.6345, 0.6585, 0.6825,
0.7050, 0.7230, 0.7470, 0.7650, 0.7950, 0.8130, 0.8370, 0.8770, 0.8950, 0.9250, 0.9475,
0.9775, 1.0000)

plot(X,Y)

I would like to obtain the median, mean and some quantile information (say for example 5%, 95%) from this data. The way I was thinking of doing this was to fit a defined distribution to it and then integrate to get my quantiles, mean and median values.

The question is how to fit the most appropriate cumulative distribution function to this data (I expect this may well be the Normal Cumulative Distribution Function).

I have seen lots of ways to fit a PDF but I can't find anything on fitting a CDF.

(I realise this may seem a basic question to many of you but it has me struggling!!)

Thanks in advance

Perhaps you could use nlm to find the parameters that minimize the squared differences between your observed Y values and the values expected under a normal distribution. Here is an example using your data:

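# Objective: sum of squared differences between observed Y and a normal CDF.
fn <- function(x) {
   mu <- x[1]
   sigma <- exp(x[2])   # optimise log(sigma) so that sigma stays positive
   sum((Y - pnorm(X, mu, sigma))^2)
}
est <- nlm(fn, c(1, 1))$estimate   # est[1] = mu, est[2] = log(sigma)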

plot(X,Y)
curve(pnorm(x, est[1], exp(est[2])), add=TRUE)

Unfortunately, I don't know an easy way with this method to constrain sigma > 0 without applying the exp transformation to the parameter. Still, the fit seems reasonable.
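One possible way to bound sigma directly, offered as a sketch and not as part of the nlm approach above, is to use optim with method = "L-BFGS-B", which accepts box constraints. The names sse and fit, the starting values c(0.5, 0.3), and the lower bound 1e-6 are assumptions chosen for illustration. Once mu and sigma are estimated, the mean and median of a normal distribution both equal mu, and other quantiles come from qnorm, so no integration is needed.

```r
# Data copied from the question.
X <- c(0.09787761, 0.10745590, 0.11815422, 0.15503521, 0.16887488, 0.18361325, 0.22166727,
       0.23526786, 0.24198808, 0.25432602, 0.26387961, 0.27364063, 0.34864672, 0.37734113,
       0.39230736, 0.40699061, 0.41063824, 0.42497043, 0.44176913, 0.46076456, 0.47229330,
       0.53134509, 0.56903577, 0.58308938, 0.58417653, 0.60061901, 0.60483849, 0.61847521,
       0.62735245, 0.64337353, 0.65783302, 0.67232004, 0.68884473, 0.78846000, 0.82793293,
       0.82963446, 0.84392010, 0.87090024, 0.88384044, 0.89543314, 0.93899033, 0.94781219,
       1.12390279, 1.18756693, 1.25057774)
Y <- c(0.0090, 0.0210, 0.0300, 0.0420, 0.0580, 0.0700, 0.0925, 0.1015, 0.1315, 0.1435,
       0.1660, 0.1750, 0.2050, 0.2450, 0.2630, 0.2930, 0.3110, 0.3350, 0.3590, 0.3770, 0.3950,
       0.4175, 0.4475, 0.4715, 0.4955, 0.5180, 0.5405, 0.5725, 0.6045, 0.6345, 0.6585, 0.6825,
       0.7050, 0.7230, 0.7470, 0.7650, 0.7950, 0.8130, 0.8370, 0.8770, 0.8950, 0.9250, 0.9475,
       0.9775, 1.0000)

# Sum of squared differences between the observed Y and a normal CDF.
# par[1] is the mean, par[2] is the standard deviation (not log-transformed).
sse <- function(par) sum((Y - pnorm(X, mean = par[1], sd = par[2]))^2)

# L-BFGS-B supports lower/upper bounds, so sigma is kept above a small
# positive value (1e-6 is an arbitrary illustrative choice).
fit <- optim(c(0.5, 0.3), sse, method = "L-BFGS-B", lower = c(-Inf, 1e-6))
mu <- fit$par[1]
sigma <- fit$par[2]

# For a normal distribution the mean and median both equal mu;
# the 5% and 95% quantiles follow from qnorm.
q <- qnorm(c(0.05, 0.5, 0.95), mean = mu, sd = sigma)
names(q) <- c("5%", "50%", "95%")
print(c(mean = mu, sd = sigma))
print(q)
```

These quantiles describe the fitted normal curve, so they are only as good as the assumption that the data are close to normal.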

