Create cumulative probability density function

I have the following dataframe df in R:

      time
[1]  0.432
[2]  0.451
[3]  0.399
[4]  0.422
...
[25] 0.444

Now I would like to add a column to this data frame (let's call it timep) whose elements are calculated by the following formula:

The item on row i in column timep should equal the number of elements in column time that are less than or equal to the item in column time on row i, divided by the number of rows in the data frame.

In pseudocode: df$timep[i] <- count(df$time <= df$time[i])/length(df)

However, I don't know how to express this correctly in R.

R has a built-in empirical CDF function, ecdf.

Let's say you have a data frame df:

df <- data.frame(time = c(0.432, 0.451, 0.399, 0.422, 0.444))

You can create an empirical cdf with:

P <- ecdf(df$time)

Now, if you pass a value (or a vector of values) to P, it will return the cumulative probability for each value, that is, the fraction of observations less than or equal to it:

df$cdf <- P(df$time)

Out:

   time cdf
1 0.432 0.6
2 0.451 1.0
3 0.399 0.2
4 0.422 0.4
5 0.444 0.8
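
For comparison, the formula from the question can also be written out by hand. The following is only a sketch on the same five-value toy data frame; the column name timep comes from the question, while timep_rank is a name made up here for illustration. Note that length(df) in the pseudocode would return the number of columns of a data frame, so the sketch uses nrow(df) or mean() instead.

```r
# Toy data frame matching the example above
df <- data.frame(time = c(0.432, 0.451, 0.399, 0.422, 0.444))

# Direct translation of the question's formula: for each row i, the share of
# values in `time` that are <= time[i]. mean() of a logical vector is the
# number of TRUEs divided by its length, so no explicit division is needed.
df$timep <- sapply(df$time, function(t) mean(df$time <= t))

# Rank-based alternative (illustrative name): with ties.method = "max",
# each value's rank equals the count of elements <= that value.
timep_rank <- rank(df$time, ties.method = "max") / nrow(df)

# Both hand-written versions should agree with the built-in ecdf()
stopifnot(isTRUE(all.equal(df$timep, ecdf(df$time)(df$time))))
stopifnot(isTRUE(all.equal(df$timep, timep_rank)))
```

The sapply version compares every value against every other value, so it is quadratic in the number of rows; for larger data, ecdf or the rank-based form is likely the better choice.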
