I am trying to use Excel to create a CDF. I have tried a few different methods but can't seem to get the plot right. I have data similar to the below:
Decision Score
Yes 5
Yes 5
Yes 4
Yes 4
Yes 4
Yes 3
No 2
No 2
No 1
I want to be able to create the CDF so that I can say something like "40% of yeses had a score of at least 4."
I believe you want something called an empirical distribution function (EDF), also known as an empirical CDF (ECDF). This website describes how to create one in Excel.
I have copied the text in case the link stops working. It says:
Let's say you had 50 observations from an experiment. To create the EDF:
Step 1: Enter your data into column A of a spreadsheet. Sort into ascending order (smallest to greatest).
Step 2: In column B, type k/n, where:
k is the numbered observation (this is easy, it's just 1, 2, 3, 4, 5…)
n is the number of observations in your sample. For this example, I have 50 observations, so I entered 1/50 first. My entries are 1/50, 2/50, 3/50, 4/50, 5/50, 6/50, 7/50, 8/50, 9/50, 10/50, 11/50, etc.
To plot the EDF simply use column A for x-axis data and column B for y-axis data.
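If it helps to check the spreadsheet against something, here is a minimal Python sketch of the same k/n recipe. It is not part of the quoted instructions; the function name `edf_points` is made up for illustration, and the sample input is just the "Yes" scores from the question.

```python
def edf_points(values):
    """Return (value, k/n) pairs, mirroring columns A and B of the spreadsheet."""
    ordered = sorted(values)  # Step 1: sort ascending
    n = len(ordered)          # n = number of observations
    # Step 2: the k-th sorted observation gets k/n
    return [(x, k / n) for k, x in enumerate(ordered, start=1)]


# "Yes" scores taken from the question (assumed input for this sketch)
yes_scores = [5, 5, 4, 4, 4, 3]
for score, edf in edf_points(yes_scores):
    print(score, round(edf, 7))
```

Plotting the first element of each pair on the x-axis and the second on the y-axis corresponds to using column A and column B in Excel.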
In your case it would be something like this:
Decision Score EDF
Yes 3 0.1666667
Yes 4 0.3333333
Yes 4 0.5
Yes 4 0.6666667
Yes 5 0.8333333
Yes 5 1.0
No 2
No 2
No 1
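One thing to watch: the EDF gives the share of observations at or below a score, while a statement like "X% of yeses had a score of at least 4" needs the share at or above it. A rough Python sketch of that count, assuming the data from the question (the helper name `share_at_least` is hypothetical):

```python
def share_at_least(values, threshold):
    """Fraction of values that are greater than or equal to threshold."""
    return sum(1 for v in values if v >= threshold) / len(values)


# (Decision, Score) rows copied from the question
rows = [("Yes", 5), ("Yes", 5), ("Yes", 4), ("Yes", 4), ("Yes", 4),
        ("Yes", 3), ("No", 2), ("No", 2), ("No", 1)]

yes_scores = [score for decision, score in rows if decision == "Yes"]
print(share_at_least(yes_scores, 4))  # 5 of the 6 yeses score 4 or more
```

In Excel itself, something along the lines of =COUNTIFS(A:A,"Yes",B:B,">=4")/COUNTIF(A:A,"Yes") should give the same number, assuming Decision is in column A and Score is in column B.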