R: Generate histogram from counts of data
Suppose I have vector a:
c(1, 6, 2, 4.1, 1, 2)
And a count vector b:
c(2,3,2,1,1,0)
I'd like to generate vector c:
c(1, 1, 6, 6, 6, 2, 2, 4.1, 1)
To call:
hist(c)
How can I build c, or is there a way to generate the histogram directly from a and b? Note the duplicates in a, as well as the unequal spacing.
I need a vectorized solution: a and b are too large for lapply and friends.
?rep
> rep(a, b)
[1] 1.0 1.0 6.0 6.0 6.0 2.0 2.0 4.1 1.0
>
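To connect this back to the histogram in the question, here is a minimal sketch using the example vectors. The name expanded is my own choice (it avoids shadowing c()), and plot = FALSE is only there so the binning can be inspected without opening a graphics device.

```r
a <- c(1, 6, 2, 4.1, 1, 2)
b <- c(2, 3, 2, 1, 1, 0)

# Repeat a[i] exactly b[i] times; a zero count drops that element.
expanded <- rep(a, b)

# The expanded length always equals the total count.
stopifnot(length(expanded) == sum(b))

# plot = FALSE returns the binning without drawing; drop it to get the plot.
h <- hist(expanded, plot = FALSE)
print(h$breaks)
print(h$counts)
```

Note that the expanded vector has sum(b) elements, so memory use grows with the total count rather than with length(a).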
Edit since I was curious!
a <- sample(1:10, 1e6, replace=TRUE)
b <- sample(1:10, 1e6, replace=TRUE)
> system.time(rep(a, b))
user system elapsed
0.140 0.016 0.156
> system.time(inverse.rle(list(lengths=b, values=a)))
user system elapsed
0.024 0.004 0.028
Just for something different than rep:
> inverse.rle(list(lengths=b,values=a))
[1] 1.0 1.0 6.0 6.0 6.0 2.0 2.0 4.1 1.0
Some benchmarking and a faster solution. rep.int is a faster implementation of rep in the standard use case (from ?rep):
rep.int(a, b)
I wasn't convinced by the benchmarking above. inverse.rle is just a wrapper for rep.int, and rep.int should be faster than rep. I would also expect the wrapper overhead in inverse.rle to make it slower than rep(), which is a primitive function.
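A quick way to check the wrapper claim is to confirm that all three calls return the same vector and then print the function body. This is only a sketch on the small example vectors from the question; the names via_rep, via_rep_int and via_rle are mine.

```r
a <- c(1, 6, 2, 4.1, 1, 2)
b <- c(2, 3, 2, 1, 1, 0)

via_rep     <- rep(a, b)
via_rep_int <- rep.int(a, b)
via_rle     <- inverse.rle(list(lengths = b, values = a))

# All three approaches should build the same expanded vector.
stopifnot(identical(via_rep, via_rep_int),
          identical(via_rep_int, via_rle))

# Printing the function shows its body, so the call it delegates to
# can be read directly in your own R version.
print(inverse.rle)
```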
Some microbenchmarking:
library(microbenchmark)
microbenchmark(rep(a,b), rep.int(a,b),
inverse.rle(list(values = a, lengths =b)))
Unit: milliseconds
                                        expr      min       lq   median       uq      max
1 inverse.rle(list(values = a, lengths = b)) 29.06968 29.26267 29.36191 29.67501 72.80645
2                                  rep(a, b) 25.65125 25.76246 25.84869 26.52348 69.00169
3                              rep.int(a, b) 20.38604 23.31840 23.38940 23.69600 66.40759
There isn't much in it, but rep.int appears to be the winner, which it should be.
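On the other half of the question (going from a and b to the histogram without building c), one possible sketch is to bin a with cut() and sum the counts per bin with xtabs(). This is not from the answers above. The break points 0:6 are an assumption for the example data, and the cut() arguments are chosen to mirror the defaults of hist().

```r
a <- c(1, 6, 2, 4.1, 1, 2)
b <- c(2, 3, 2, 1, 1, 0)

# Hypothetical bin edges; they must cover range(a).
breaks <- 0:6

# Assign each value to a bin, using the same conventions as hist()'s
# defaults (right-closed intervals, lowest edge included).
bins <- cut(a, breaks = breaks, right = TRUE, include.lowest = TRUE)

# Sum the counts per bin; xtabs() keeps empty bins as zeros.
bin_counts <- as.vector(xtabs(b ~ bins))

# Cross-check against the histogram of the expanded vector.
reference <- hist(rep(a, b), breaks = breaks, plot = FALSE)$counts
stopifnot(all(bin_counts == reference))

# barplot(bin_counts, names.arg = levels(bins), space = 0) would draw it.
```

Both cut() and xtabs() are vectorized, and the intermediate objects have length(a) elements rather than sum(b), which may matter when the counts are large.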