Compiling a vector from a recursive function in R

I'm trying to understand how scoping works within recursive functions in R.

The context is this function that is supposed to return all unique combinations of a vector's elements. (The details of the exact desired output aren't really relevant here.)

perm <- function(x) {
    n <- length(x)
    if (n == 1) {
        print(x)
    } else {
        y <- NULL
        for (i in 1:n) {
            y <- paste(x[i], perm(x[-i]), sep = "_")
        }
        print(y)
    }
}

When I print the objects that I want returned (here, print(x) and print(y)), the correct values are printed to the console:

perm(c("a","b","c"))

However, when I try to collect these in a vector, the resulting vector contains far more elements than were printed. I suspect this has something to do with the recursion, but it seems strange given that the print calls fire only when expected. For example, using a global variable to keep track of the output so as to sidestep any scoping issues:

out <- c()
perm <- function(x) {
    n <- length(x)
    if (n == 1) {
        assign('out', c(out, x), envir = .GlobalEnv)
    } else {
        y <- NULL
        for (i in 1:n) {
            y <- paste(x[i], perm(x[-i]), sep = "_")
        }
        assign('out', c(out, y), envir = .GlobalEnv)
    }
}

perm(c("a","b","c"))
out

The first example prints only ten values, whereas in the second example, out is of length 56 and contains values that are not found in the first example (e.g., "c_c"). I know that growing a vector this way is terribly inefficient, but I'm just trying to figure out how the scoping works and why these results are so different. I wouldn't have thought that there would be any scoping issues with the print function, i.e., every time print(x) or print(y) is triggered, I'd expect the output to be printed to the console.

The same thing occurs when I use the superassignment operator instead of assigning to the global environment explicitly, i.e., out <<- c(out, x). Funnily enough, if I just use the print syntax, I can compute large numbers of combinations easily; but when using vector assignments, anything over four elements results in a recursive mess that fries the system.

So I guess the questions are:

  1. Why does it seem to be triggering the vector assignment more frequently than the print function, when they're called in the same place?

  2. Is there a better way to implement this sort of function?

The perm function is based on one of the functions found in this blog post.

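  1. Why does it seem to be triggering the vector assignment more frequently than the print function, when they're called in the same place?

The assignment is not triggered more often; the difference is in what perm itself returns. A function returns the value of the last expression it evaluates. print returns its first argument (invisibly), so the first version returns x or y. assign returns the value it assigned, which here is the entire accumulated vector c(out, x) or c(out, y). The second version therefore returns everything collected so far, and paste then combines x[i] with every element of that vector, which is why out grows so quickly and picks up values such as "c_c". Take: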

b <- assign("a", 2)
b
#R> [1] 2
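The same effect can be seen inside a function. The following is a minimal sketch, not code from the question: the names acc, f_print and f_assign are made up for illustration. It compares what a function returns when its last expression is print with what it returns when its last expression is assign.

```r
acc <- c("old1", "old2")   # hypothetical stand-in for the accumulated global vector

f_print <- function(x) {
  print(x)                 # last expression: returns x (invisibly)
}

f_assign <- function(x) {
  # last expression: returns the assigned value, i.e. all of c(acc, x)
  assign("acc", c(acc, x), envir = .GlobalEnv)
}

r1 <- f_print("new")
#R> [1] "new"
r2 <- f_assign("new")

length(r1)
#R> [1] 1
length(r2)
#R> [1] 3
```

In the recursive case, it is a value like r2 that gets passed to paste, so each level of recursion multiplies the amount of data carried upward.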

Thus, your function should be:

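out <- c()  # reset the global accumulator before each run
perm <- function(x) {
  n <- length(x)
  if (n == 1) {
    assign('out', c(out, x), envir = .GlobalEnv)
    x  # return x itself rather than the value of assign()
  } else {
    y <- NULL
    for (i in 1:n) {
      y <- paste(x[i], perm(x[-i]), sep = "_")
    }
    assign('out', c(out, y), envir = .GlobalEnv)
    y  # return y itself rather than the value of assign()
  }
}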

perm(c("a","b","c"))
#R> [1] "c_b_a"
out
#R> [1] "c"     "b"     "c_b"   "c"     "a"     "c_a"   "b"     "a"     "b_a"   "c_b_a"
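As for the second question, one possible approach is to drop the global variable and build the result from return values alone. The sketch below is not from the original post: perm_all is a hypothetical name, and it assumes the goal is every ordering of the input joined by "_". Note that, unlike perm above, it appends to the result on each iteration instead of overwriting y.

```r
# Hypothetical helper: returns every ordering of x joined by "_",
# using only return values (no global state).
perm_all <- function(x) {
  n <- length(x)
  if (n <= 1) {
    return(x)
  }
  res <- character(0)
  for (i in seq_len(n)) {
    # prefix each ordering of the remaining elements with x[i]
    res <- c(res, paste(x[i], perm_all(x[-i]), sep = "_"))
  }
  res
}

perm_all(c("a", "b", "c"))
#R> [1] "a_b_c" "a_c_b" "b_a_c" "b_c_a" "c_a_b" "c_b_a"
```

Because nothing outside the function is modified, repeated calls do not interfere with each other and there is no accumulator to reset. Growing res with c() is still not the fastest option for long inputs, but it keeps the example close to the original code.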
