
R + ggplot2 - Aggregate Data by Intervals

I have a file in which each line holds a numeric value representing an average duration:

12.3
5.4
6
...

Is there a way in R to display the data aggregated into automatic or manual intervals/breaks? Something like this:

[0, 1[ 0
[1, 6[ 1
[6, 20[ 2
...

Next, I want to produce a plot of this data in ggplot2. Could I use these intervals as labels?

You can bin data with the cut() function in base R, or with cut2() from the Hmisc package. Both offer several options for how the data is cut and sliced, all documented in help(cut) and help(cut2) respectively.
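As a minimal sketch of manual breaks, the following uses only the three values shown in the question and the break points 0, 1, 6 and 20 taken from its example intervals. The variable names `durations`, `bins` and `counts` are illustrative. Passing `right = FALSE` makes the intervals closed on the left and open on the right, which base R prints as `[a,b)` rather than `[a,b[`.

```r
# The three values shown in the question
durations <- c(12.3, 5.4, 6)

# Manual breaks; right = FALSE gives left-closed intervals [a, b)
bins <- cut(durations, breaks = c(0, 1, 6, 20), right = FALSE)

# Count how many values fall into each interval
counts <- table(bins)
print(counts)
```

With these break points a value of exactly 20 would become NA, since the last interval is open on the right; adding `include.lowest = TRUE` closes it.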

Once you have binned your data appropriately, plotting with ggplot becomes a trivial exercise:

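library(ggplot2)

# Sample data
set.seed(1)
dat <- data.frame(x = sample(1:100, 1000, TRUE))
# Split x into 5 equal-width bins
dat$cuts <- cut(dat$x, breaks = 5)

# Make bar chart (qplot() is deprecated as of ggplot2 3.4.0)
ggplot(dat, aes(x = cuts)) + geom_bar()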
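On the question of using the intervals as labels: the factor levels produced by cut() become the x-axis labels of the bar chart, and the `labels` argument of cut() lets you set them yourself. The sketch below is one possible way to do this; the break points and label strings are illustrative choices, not part of the original answer.

```r
library(ggplot2)

# Same sample data as in the answer
set.seed(1)
dat <- data.frame(x = sample(1:100, 1000, TRUE))

# Manual breaks with custom labels in the question's [a, b[ notation
# (break points and labels are illustrative assumptions)
dat$cuts <- cut(dat$x,
                breaks = c(0, 25, 50, 75, 101),
                labels = c("[0, 25[", "[25, 50[", "[50, 75[", "[75, 101["),
                right = FALSE)

# The factor levels are used as the x-axis labels
p <- ggplot(dat, aes(x = cuts)) +
  geom_bar() +
  labs(x = "Interval", y = "Count")
print(p)
```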

