
Replicating Values in R without a Loop

I have two vectors. The first (b) is my data. The second (a) holds the end index of the range over which each value in b should repeat.

> b
 [1] 213488 204506 246738 298035 370680 377635 404552 477359 310586 383221 486429 482295 438255 411939 268882

> a
 [1]  214  466  718  968 1221 1473 1724 1977 2228 2479 2732 2983 3235 3487 3738

I want the first element in the b vector (213488) to repeat from 1 to 214, then the second element in the b vector (204506) to repeat from 215 to 466, and so on and so forth. The last element in the b vector (268882) will go from 3738 to 5000.

Is there an easy way to do this without a loop?

Do this:

b = c(213488,204506, 246738, 298035, 370680, 377635, 404552, 477359, 310586, 383221, 486429, 482295, 438255, 411939, 268882)
a = c(214,  466,  718,  968, 1221, 1473, 1724, 1977, 2228, 2479, 2732, 2983, 3235, 3487, 3738)

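c = diff(a)      # run lengths for elements 2..n (the name c does not break later calls to c())
d = c(a[1],c)    # prepend the run length of the first element

rep(b,d)         # repeat b[i] exactly d[i] times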

diff(a) gives the number of times to repeat every element after the first, but it drops the count for the first element, so prepend a[1] to the result.
Then pass the data and the counts to rep().
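
The two steps can also be folded into one by prepending 0 to the end indices before differencing, since diff(c(0, a)) equals c(a[1], diff(a)). A minimal sketch with small made-up values (the names counts and out are illustrative, not from the answer):

```r
# Illustrative data: three values and the end index of each run
b <- c(10, 20, 30)
a <- c(2, 5, 9)

# Prepending 0 makes diff() return the length of every run, including the first
counts <- diff(c(0, a))

# Repeat each value by its run length
out <- rep(b, times = counts)

# The result is as long as the last end index, and position a[i] holds b[i]
stopifnot(length(out) == a[length(a)])
stopifnot(all(out[a] == b))
```

The two stopifnot() checks are a cheap way to confirm that the end indices and the expanded vector agree.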

Example

b = c(1,2,3,4)
a = c(3,8,10,15)
c = diff(a)
d = c(a[1],c)
rep(b,d)
 [1] 1 1 1 2 2 2 2 2 3 3 4 4 4 4 4
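
The question also mentions the last value running out to position 5000, whereas rep(b, d) returns a vector of length a[length(a)] = 3738. A hedged sketch under one assumed reading, namely that the final run should simply be extended so the whole result has length 5000 (total_len, ends, counts and out are illustrative names):

```r
b <- c(213488, 204506, 246738, 298035, 370680, 377635, 404552, 477359,
       310586, 383221, 486429, 482295, 438255, 411939, 268882)
a <- c(214, 466, 718, 968, 1221, 1473, 1724, 1977, 2228, 2479, 2732,
       2983, 3235, 3487, 3738)

total_len <- 5000  # taken from the question; the reading of it is an assumption

# Assumption: keep every end index except the last, and let the last run
# end at total_len instead of a[length(a)]
ends <- c(a[-length(a)], total_len)

counts <- diff(c(0, ends))
out <- rep(b, times = counts)

stopifnot(length(out) == total_len)
```

If the intended layout is different, for example if the last value should start exactly at 3738, only the ends vector needs to change.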

The run length encoding functions rle and inverse.rle will likely be useful for this kind of data. Borrowing from R. Schifini's answer you can create an rle object with

x = list( values=b, lengths=d )
class(x) = "rle"
inverse.rle(x)
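
As a rough illustration of how the two functions relate, the sketch below builds an rle object from small made-up values, expands it with inverse.rle, and compresses it again with rle. The names x, expanded and back are illustrative. The round trip recovers the original runs only when neighbouring values differ, because rle merges adjacent equal values into one run.

```r
# Illustrative values and run lengths
b <- c(1, 2, 3, 4)
d <- c(3, 5, 2, 5)

# Build the rle object by hand; rle() itself stores lengths as integers
x <- structure(list(lengths = as.integer(d), values = b), class = "rle")

# Expand to the full vector, then compress back to runs
expanded <- inverse.rle(x)
back <- rle(expanded)

stopifnot(identical(back$values, b))
stopifnot(identical(back$lengths, as.integer(d)))
```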

Also, Bioconductor's S4Vectors::Rle class stores this type of data and allows all of the vector operations while keeping the data in this compressed form.
