R: Generating a histogram from data counts

Question

Suppose I have a vector a:

c(1, 6, 2, 4.1, 1, 2)

And a count vector b:

c(2,3,2,1,1,0)

I'd like to generate the vector c:

c(1, 1, 6, 6, 6, 2, 2, 4.1, 1)

So that I can call:

hist(c)

How can I build c, or is there a way to generate the histogram directly from a and b? Note the duplicates in a, as well as the unequal spacing.

I need a vectorized solution. a and b are too large for lapply and friends.

Answer 1

?rep

> rep(a, b)
[1] 1.0 1.0 6.0 6.0 6.0 2.0 2.0 4.1 1.0
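Putting the pieces together, a minimal sketch of the whole workflow on the question's data might look like the following. The name expanded is an assumption used here instead of c, so that the base function c() is not masked.

```r
a <- c(1, 6, 2, 4.1, 1, 2)
b <- c(2, 3, 2, 1, 1, 0)

# With a vector second argument, rep() repeats a[i] exactly b[i] times;
# a count of 0 drops that element entirely
expanded <- rep(a, b)

# The expanded vector has one entry per counted observation
stopifnot(length(expanded) == sum(b))

# plot = FALSE returns the histogram object without drawing;
# call hist(expanded) to draw it
h <- hist(expanded, plot = FALSE)
```

Every observation ends up in some bin, so sum(h$counts) should equal sum(b).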

Edit, since I was curious:

a <- sample(1:10, 1e6, replace=TRUE)
b <- sample(1:10, 1e6, replace=TRUE)

> system.time(rep(a, b))
   user  system elapsed 
  0.140   0.016   0.156 
> system.time(inverse.rle(list(lengths=b, values=a)))
   user  system elapsed 
  0.024   0.004   0.028

Answer 2

Just for something different from rep:

> inverse.rle(list(lengths=b,values=a))
[1] 1.0 1.0 6.0 6.0 6.0 2.0 2.0 4.1 1.0
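For context, inverse.rle is the inverse of rle, which compresses a vector into runs of equal values. The sketch below, using the question's data, suggests how the two relate. Note that the round trip does not return b exactly, because the zero count leaves no trace in the expanded vector.

```r
a <- c(1, 6, 2, 4.1, 1, 2)
b <- c(2, 3, 2, 1, 1, 0)

# Expand: a plain list with `lengths` and `values` components is enough
expanded <- inverse.rle(list(lengths = b, values = a))

# Compress again: rle() finds runs of adjacent equal values
r <- rle(expanded)
r$values   # 1 6 2 4.1 1 (the trailing 2 with count 0 is gone)
r$lengths  # 2 3 2 1 1
```

If a contained adjacent equal values, rle would also merge them into a single run, so r would not match a and b in that case either.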

Answer 3

Some benchmarking and a faster solution. rep.int is a faster implementation of rep in the standard use case (from ?rep):

rep.int(a, b)

I wasn't convinced by the benchmarking above.

inverse.rle is just a wrapper for rep.int, and rep.int should be faster than rep. I would expect the wrapper overhead of inverse.rle to make it slower than rep(), which is evaluated as a primitive function.
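Before comparing speed, it may be worth confirming that the three calls really produce the same result. A small check on the question's data, as a sketch:

```r
a <- c(1, 6, 2, 4.1, 1, 2)
b <- c(2, 3, 2, 1, 1, 0)

via_rep     <- rep(a, b)
via_rep_int <- rep.int(a, b)
via_inverse <- inverse.rle(list(lengths = b, values = a))

# All three should agree element for element on unnamed input
same <- identical(via_rep, via_rep_int) && identical(via_rep, via_inverse)
print(same)

# Printing the function shows what inverse.rle does internally
print(inverse.rle)
```

One difference to keep in mind is that rep keeps names on its input while rep.int does not, so the results can differ in their attributes when a is a named vector.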

Some microbenchmarking:

library(microbenchmark)

microbenchmark(rep(a,b), rep.int(a,b), 
      inverse.rle(list(values = a, lengths =b)))
Unit: milliseconds
                                        expr      min       lq   median       uq      max
1 inverse.rle(list(values = a, lengths = b)) 29.06968 29.26267 29.36191 29.67501 72.80645
2                                  rep(a, b) 25.65125 25.76246 25.84869 26.52348 69.00169
3                              rep.int(a, b) 20.38604 23.31840 23.38940 23.69600 66.40759

There isn't much in it, but rep.int appears to be the winner, as it should be.
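The question also asks whether the histogram can be built directly from a and b, which none of the answers above cover. One possible sketch, assuming the bin edges are chosen up front (the breaks vector 0:6 is an assumption that suits this toy data, not something from the thread), is to bin a with cut() and sum the counts in b per bin, without ever expanding the vector:

```r
a <- c(1, 6, 2, 4.1, 1, 2)
b <- c(2, 3, 2, 1, 1, 0)

# Assumed bin edges for this toy data; choose edges that cover range(a)
breaks <- 0:6

# Assign each value of a to a bin, using the same interval convention
# as hist(): right-closed bins with the lowest edge included
bins <- cut(a, breaks = breaks, right = TRUE, include.lowest = TRUE)

# Sum the counts in b within each bin; empty bins come back as NA
counts <- as.vector(tapply(b, bins, sum))
counts[is.na(counts)] <- 0

# Cross-check against the expanded-vector approach on the same breaks
reference <- hist(rep(a, b), breaks = breaks, plot = FALSE)$counts
stopifnot(all(counts == reference))
```

The resulting counts can then be drawn with, for example, barplot(counts, names.arg = levels(bins)). This keeps memory proportional to length(a) rather than sum(b), which may matter when the counts are large.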
