R + ggplot2 - Aggregate Data by Intervals
I have a file in which each line holds a numeric value representing an average duration:
12.3
5.4
6
...
Is there a way in R to display the data aggregated into automatic or manual intervals/breaks? Something like this:
[0,1[ 0
[1, 6[ 1
[6, 20[ 2
...
Next, I also want to produce a plot in ggplot2 showing this data. Could I use these intervals as labels?
You can bin data with the cut() function in base R, or use the Hmisc package and cut2(). There are several options for how to cut and slice your data, all of which are documented in help(cut) and help(cut2) respectively.
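As a minimal sketch of manual breaks, the snippet below bins the three values shown in the question into the intervals the question asks for. The vector `durations` is a stand-in for the file's contents (an assumption; the real values could be read with something like scan() on the file), and right = FALSE is used so that each interval is closed on the left, matching the [a, b[ notation.

```r
# Hypothetical sample values standing in for the contents of the file
durations <- c(12.3, 5.4, 6)

# Manual breaks at 0, 1, 6 and 20.
# right = FALSE makes each interval closed on the left and open on the right,
# so 6 falls into [6,20) rather than [1,6).
bins <- cut(durations, breaks = c(0, 1, 6, 20), right = FALSE)

# Count how many values fall into each interval (empty intervals are kept)
counts <- table(bins)
print(counts)
```

Passing a single number instead of a vector (for example breaks = 5) asks cut() to choose that many equal-width intervals automatically.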
Once you have binned your data appropriately, plotting with ggplot becomes a trivial exercise:
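library(ggplot2)
# Sample data: 1000 integers drawn from 1..100 with replacement
set.seed(1)
dat <- data.frame(x = sample(1:100, 1000, TRUE))
# Split x into 5 equal-width intervals; the result is a factor labelled by interval
dat$cuts <- cut(dat$x, breaks = 5)
# Make bar chart of counts per interval
# (qplot() is deprecated in recent ggplot2 releases;
#  ggplot(dat, aes(x = cuts)) + geom_bar() draws the same chart)
qplot(dat$cuts)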
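On the question of using the intervals as labels: cut() returns a factor, so its levels appear as the x-axis labels automatically. The sketch below is one possible way to do this with the question's own intervals, not the only one. The data values, the column names and the axis titles are assumptions for illustration; the labels argument of cut() is used to spell the intervals in the [a, b[ style.

```r
library(ggplot2)

# Hypothetical sample values standing in for the contents of the file
dat <- data.frame(duration = c(12.3, 5.4, 6))

# Bin with manual breaks and supply custom labels in the [a, b[ style
dat$interval <- cut(dat$duration,
                    breaks = c(0, 1, 6, 20),
                    right = FALSE,
                    labels = c("[0, 1[", "[1, 6[", "[6, 20["))

# Bar chart of counts per interval; the factor levels become the axis labels
p <- ggplot(dat, aes(x = interval)) +
  geom_bar() +
  scale_x_discrete(drop = FALSE) +  # keep empty intervals on the axis
  labs(x = "Average duration interval", y = "Count")
print(p)
```

Setting drop = FALSE keeps an interval on the axis even when no value falls into it, which preserves the full sequence of bins in the plot.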