
How can I plot a cumulative distribution function (CDF) for binned data?

I've got discrete data which I present in ranges, for example:

         Marks Freq cumFreq 
1  (37.9,43.1]    4       4    
2  (43.1,48.2]   16      20   
3  (48.2,53.3]   76      96    

I need to plot the CDF for this data. I know that there is

plot(ecdf(x))

but I don't know what to add to it to get what I need.

Here are a few options:

library(ggplot2)
library(scales)
library(dplyr)

## Fake data
set.seed(2)
dat = data.frame(score=c(rnorm(130,40,10), rnorm(130,80,5)))

Here's how to plot the ECDF if you have the raw data:

# Base graphics
plot(ecdf(dat$score))

# ggplot2
ggplot(dat, aes(score)) +
  stat_ecdf(aes(group=1), geom="step")

Here's one way to plot the ECDF if you have only summary data:

First, let's group the data into bins, similar to what you have in your question. We use the cut function to create the bins and then create a new pct column holding each bin's fraction of the total number of scores. We use the dplyr chaining operator ( %>% ) to do it all in one "chain" of functions.

dat.binned = dat %>% count(Marks=cut(score,seq(0,100,5))) %>%
         mutate(pct = n/sum(n))
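For readers who want to see the binning step without dplyr, here is a minimal base R sketch of what I understand the chain above to do. The variable names (score, bins, n, pct, cum_pct) are my own. One difference to keep in mind: table() keeps empty bins with a count of zero, while dplyr's count() drops them by default.

```r
# Same fake data as above
set.seed(2)
score <- c(rnorm(130, 40, 10), rnorm(130, 80, 5))

# Bin into (0,5], (5,10], ..., (95,100]; values outside that range become NA
bins <- cut(score, breaks = seq(0, 100, 5))

# Count per bin (table() ignores NA by default), then convert to fractions
n   <- table(bins)
pct <- n / sum(n)

# Running total of the fractions; the last value should be 1
cum_pct <- cumsum(pct)
tail(cum_pct, 1)
```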

Now we can plot it. cumsum(pct) calculates the cumulative percentages (like cumFreq in your question, but as a fraction of the total). geom_step creates a step plot with these cumulative percentages.

ggplot(dat.binned, aes(Marks, cumsum(pct))) +
  geom_step(aes(group=1)) +
  scale_y_continuous(labels=percent_format()) 

Here's what the plots look like:

[Three plot images, not preserved]
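If all you have is a table like the one in the question, the same idea can be sketched in base graphics by typing in the bin edges and frequencies directly. This assumes the three rows shown are the whole table (so the total is 96); the variable names are hypothetical.

```r
# Bin edges and frequencies copied from the table in the question
lower0 <- 37.9                  # left edge of the first bin
upper  <- c(43.1, 48.2, 53.3)   # right edge of each bin
freq   <- c(4, 16, 76)

# Cumulative fraction at each right edge: 4/96, 20/96, 96/96
cum_frac <- cumsum(freq) / sum(freq)

# Step plot: stays at 0 until the first right edge, then jumps at each edge
x <- c(lower0, upper)
y <- c(0, cum_frac)
plot(x, y, type = "s", xlab = "Marks", ylab = "Cumulative fraction")
```

With type = "s" each segment runs horizontally first and then jumps, so the curve reaches a bin's cumulative value at that bin's right edge, which matches intervals that are closed on the right.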

What about this:

library(ggplot2)
library(scales)
library(dplyr)

set.seed(2)
dat = data.frame(score = c(rnorm(130,40,10), rnorm(130,80,5)))
dat.binned = dat %>% count(Marks = cut(score,seq(0,100,5))) %>%
         mutate(pct = n/sum(n))
ggplot(data = dat.binned, mapping = aes(Marks, cumsum(pct))) +  
  geom_line(aes(group = 1)) + 
  geom_point(data = dat.binned, size = 0.1, color = "blue") +
  labs(x = "Marks", y = "Cumulative percentage") +
  scale_y_continuous(labels = percent_format()) 

[Plot image, not preserved]

The technical posts on this site are licensed under CC BY-SA 4.0. If you reprint, please credit the site URL or the original address.