
Generate running combinations of vector values in R

What I need is a list of all pairwise combinations of vector values, but restricted to values that fall within the same sliding window of a given length. It's easier to show than to explain.

Let's say I have a window.size of 3 and this vector:

vector <- c("goofy", "mickey", "donald", "foo", "bar")

This is what I need as output

from   | to
-------+-------
goofy  | mickey
goofy  | donald
mickey | donald
mickey | foo
donald | bar
donald | foo
foo    | bar

As this is going to end up inside a set of Monte Carlo iterations, window.size should be a parameter.

I think it could be easily done with dplyr and tidyr but I was not able to figure out how.

Thanks in advance!

With rollapply (from zoo) and dplyr. The c, do.call and as.data.frame ugliness is needed to convert the output of combn into a data frame that the dplyr functions can work on:

library(zoo)
library(dplyr)

rollapply(vector, 3, combn, 2, simplify = FALSE) %>%
  c() %>%
  do.call(rbind, .) %>%
  as.data.frame() %>%
  distinct() %>%
  setNames(c("from", "to"))

Result:

    from     to
1  goofy mickey
2 mickey donald
3 donald    foo
4  goofy donald
5 mickey    foo
6 donald    bar
7    foo    bar
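
Since the question asks for the window size to be a parameter, the same sliding-window idea can be sketched in base R without zoo or dplyr. This is only a sketch: the helper name running_pairs and its arguments are hypothetical, not part of any package, and it assumes the values in x are distinct (duplicated values would be collapsed by unique(), just as distinct() does above).

```r
# Hypothetical helper: all unordered pairs found inside each sliding window.
# Assumes the values of x are distinct.
running_pairs <- function(x, window.size) {
  stopifnot(window.size >= 2, window.size <= length(x))
  # Start index of every full window
  starts <- seq_len(length(x) - window.size + 1)
  # One 2-column matrix of pairs per window
  per_window <- lapply(starts, function(s) {
    t(combn(x[s:(s + window.size - 1)], 2))
  })
  # Stack the windows and drop pairs seen in more than one window
  pairs <- unique(do.call(rbind, per_window))
  data.frame(from = pairs[, 1], to = pairs[, 2], stringsAsFactors = FALSE)
}

x <- c("goofy", "mickey", "donald", "foo", "bar")
res <- running_pairs(x, window.size = 3)
res
```

With window.size = 3 this should give the same seven pairs as above, ordered window by window rather than combination by combination.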

You could play around with the index logic and the subsetting to arrive at a generalisable form of:

data.frame(
  from = vector[c(rep(1:3, each = 2), 4)],
  to = vector[c(2, rep(3:5, each = 2))]
)

    from     to
1  goofy mickey
2  goofy donald
3 mickey donald
4 mickey    foo
5 donald    foo
6 donald    bar
7    foo    bar

Where the original vector is: c("goofy", "mickey", "donald", "foo", "bar").

EDIT

A bit more generalisable (it works for any vector length n, but the window size is still fixed at 3):

n <- length(vector)
data.frame(
  from = vector[rep(1:(n-1), each = 2)[-2*n + 2]],
  to = vector[rep(2:n, each = 2)[-1]]
)
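
The index trick can probably be pushed further so that the window size becomes a parameter as well. Two positions i < j share a window exactly when j - i <= window.size - 1, so one option is to build all index pairs and filter on that gap. The following is a minimal sketch under that assumption; pairs_by_index is a hypothetical name, and for long vectors the full n x n grid may be wasteful.

```r
# Hypothetical helper: keep index pairs (from, to) whose gap fits in a window.
pairs_by_index <- function(x, window.size) {
  n <- length(x)
  # "to" varies fastest, so rows come out sorted by "from", then by "to"
  idx <- expand.grid(to = seq_len(n), from = seq_len(n))
  gap <- idx$to - idx$from
  keep <- gap >= 1 & gap <= window.size - 1
  data.frame(
    from = x[idx$from[keep]],
    to = x[idx$to[keep]],
    stringsAsFactors = FALSE
  )
}

x <- c("goofy", "mickey", "donald", "foo", "bar")
res_idx <- pairs_by_index(x, window.size = 3)
res_idx
```

Because it works on positions rather than values, this variant does not need a de-duplication step and also copes with repeated values in the vector.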

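You could use lead from the dplyr package. Below, v is the input vector and n is the largest offset between the two elements of a pair, so n = window.size - 1.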

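library(dplyr)
## Input vector from the question
v <- c("goofy", "mickey", "donald", "foo", "bar")
## Example of n = 2 (largest offset, i.e. window.size = 3)
n = 2
res = data.frame()
## For each offset i, pair v with v shifted by i and drop the trailing NA rows
for(i in 1:n){res = na.omit(rbind(res,cbind(v,lead(v,i))))}
names(res) = c("from","to")
res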
    from     to
1  goofy mickey
2 mickey donald
3 donald    foo
4    foo    bar
5  goofy donald
6 mickey    foo
7 donald    bar

## Example of n = 4
n = 4
res = data.frame()
for(i in 1:n){res = na.omit(rbind(res,cbind(v,lead(v,i))))}
names(res) = c("from","to")
res
     from     to
1   goofy mickey
2  mickey donald
3  donald    foo
4     foo    bar
5   goofy donald
6  mickey    foo
7  donald    bar
8   goofy    foo
9  mickey    bar
10  goofy    bar
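
For use inside a simulation loop, the shifting approach could also be wrapped in a function that takes window.size directly. The sketch below swaps dplyr's lead for base R's head and tail, which avoids the NA padding altogether; lagged_pairs is a hypothetical name, and the function assumes window.size >= 2 and a vector of at least two elements.

```r
# Hypothetical helper: pair each element with the ones 1 .. (window.size - 1)
# positions after it, one offset at a time.
lagged_pairs <- function(x, window.size) {
  stopifnot(window.size >= 2, length(x) >= 2)
  # An offset can never exceed length(x) - 1
  max_lag <- min(window.size - 1, length(x) - 1)
  per_lag <- lapply(seq_len(max_lag), function(k) {
    data.frame(
      from = head(x, -k),  # drop the last k elements
      to = tail(x, -k),    # drop the first k elements
      stringsAsFactors = FALSE
    )
  })
  do.call(rbind, per_lag)
}

x <- c("goofy", "mickey", "donald", "foo", "bar")
res_lag <- lagged_pairs(x, window.size = 3)
res_lag
```

Here window.size = 3 corresponds to n = 2 in the loop above, and window.size = 5 to n = 4.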
