
R + ggplot2 - Aggregate Data by Intervals

I have a file in which each line holds a numeric value representing an average duration:

12.3
5.4
6
...

Is there a way in R to display the data in automatic or manual intervals/breaks (aggregate?)? Something like this:

[0,1[ 0
[1, 6[ 1
[6, 20[ 2
...

Next, I also want to produce a plot in ggplot2 showing this data. Could I use these intervals as labels?

You can bin data with the cut() function in base R, or with cut2() from the Hmisc package. There are several options for how to cut and slice your data, all of which are documented in help(cut) and help(cut2) respectively.
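For the manual intervals in the question, a minimal sketch with cut() might look like this. The `durations` vector and the break points are made-up illustration values, not the asker's real data; `right = FALSE` makes each interval closed on the left and open on the right, which corresponds to the `[a, b[` notation above.

```r
# Hypothetical sample values standing in for the contents of the file
durations <- c(12.3, 5.4, 6, 0.5, 19.9, 3)

# Manual breaks; right = FALSE gives left-closed intervals [a, b)
bins <- cut(durations, breaks = c(0, 1, 6, 20), right = FALSE)

# Count how many values fall into each interval
counts <- table(bins)
print(counts)
```

Values outside the outermost breaks would come back as NA, so the breaks should span the full range of the data.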

Once you have binned your data appropriately, plotting with ggplot2 is straightforward:

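library(ggplot2)

# Sample data: 1000 integers drawn from 1..100 with replacement
set.seed(1)
dat <- data.frame(x = sample(1:100, 1000, TRUE))
# Split the range of x into 5 equal-width intervals
dat$cuts <- cut(dat$x, breaks = 5)

# Bar chart of counts per interval (qplot() is deprecated in recent ggplot2)
ggplot(dat, aes(x = cuts)) + geom_bar()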
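On using the intervals as labels: the levels of the factor returned by cut() are what ggplot2 shows on the axis, so one option is to pass your own `labels` to cut(). The sketch below assumes hypothetical break points and label text chosen purely for illustration.

```r
library(ggplot2)

set.seed(1)
dat <- data.frame(x = sample(1:100, 1000, TRUE))

# Hypothetical break points and label text, chosen only for illustration
dat$cuts <- cut(dat$x,
                breaks = c(0, 25, 50, 75, 100),
                labels = c("1-25", "26-50", "51-75", "76-100"))

# The factor levels become the x-axis tick labels
p <- ggplot(dat, aes(x = cuts)) +
  geom_bar() +
  labs(x = "Interval", y = "Count")
print(p)
```

If `labels` is omitted, cut() generates interval labels such as (0,25] automatically, and those appear on the axis instead.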
