简体   繁体   中英

How to use functions of objects declared as data when declaring other objects in Stan/RStan?

Suppose I want to declare a matrix mat with ncol columns, where ncol is a function of some data foo . I could accomplish this with the below data block, which requires I pass Stan a data file with values for ncol .

How can I accomplish this within my Stan program without including ncol in my data file?

data{
  vector[2] foo;
  int ncol; // assume ncol = sum(foo)
  matrix[2, ncol] mat;
}

Below I describe some approaches I've considered, but I'm not sure whether and why they (don't) work.

Approach 1: Operate-and-assign in data block

Operate-and-assign to compute ncol within the data block. However, the Stan Reference Manual states that assignment is not allowed in the data block. Therefore, this shouldn't work:

data{
  vector[2] foo;
  int ncol = sum(foo); // illegal: assignment isn't allowed in data!
  matrix[2, ncol] mat;
}

Approach 2: move operate-and-assignment to the transformed_data block

Assign in the transformed_data block. My understanding is that I can't pass data directly to objects in the transformed_data block, which means I wouldn't be able to pass data to mat if I moved it there (see below). Similarly, I couldn't fix this by keeping mat in data because its dimensions would then depend on transformed_data , which is illegal.

data{
  vector[2] foo;
}

transformed_data{
  int ncol = sum(foo);
  matrix[2, ncol] mat; // can't read and assign to `mat` because it's not in `data`?
}

Approach 3: Operate-but-don't-assign in data block

Compute but do not assign the value of ncol in the declaration of mat . I don't think this would work unless foo is always read in first and am not sure if it's legal anyway.

data{
  vector[2] foo;
  matrix[2, sum(foo)] mat; // possible illegal in data block?
}

Approach 3 succeeds at this while all of the other approaches do not. It is legal because:

  • Objects declared in the data block can be used in subsequent declarations within the data block;
  • It does not assign values to an object declared in the data block. This is a bit subtle, since it does compute values used to declare an object.

In the following, I show that Approach 3 succeeds at doing what I sought to do in the question while the others do not.

Baseline Example: passing rstan extra data

# Code that does not be changed across runs

# Packages
library(rstan)

# Program file name
relpath <- "example_stan_program.stan"

# Function to run the Stan program with specified presets
run_example <- function(model_code, data){
  stan(model_code = model_code,
       data = data,
       iter = 1, warmup = 0, 
       chains = 1, cores = 1, 
       refresh = 0,
       algorithm = "Fixed_param", 
       verbose = F)
}
# Baseline Stan program
baseline <- "
data{ 
  array[2] int foo;
  int bar; // assumes bar = sum(foo)
  matrix[2, bar] mat;
}

transformed data{
  print(foo);
  print(bar);
  print(mat);
}
"

# Baseline data
dat_baseline <- list(
  foo = c(1,1),
  bar = sum(c(1,1)),
  mat = matrix(1:4, nrow = 2, ncol = 2)
)

# Run program
out <- run_example(baseline, dat_baseline)
#> [1,1]
#> 2
#> [[1,3],[2,4]]

Approach 3: compute without assignment in data block

operate <- "
data{ 
  array[2] int foo;
  matrix[2, sum(foo)] mat;
}

transformed data{
  print(foo);
  print(mat);
}
"

# Example Data
dat_operate <- list(
  foo = c(1,1),
  mat = matrix(1:4, nrow = 2, ncol = 2)
)

# Run program (we could also get bar in transformed_data if desired)
out <- run_example(operate, dat_operate)
#> [1,1]
#> [[1,3],[2,4]]

Demonstrating other approaches don't succeed

Approach 1: operate-and-assign in data (error)

operate <- "
data{ 
  array[2] int foo;
  int bar = sum(foo); // cannot assign in data block
  matrix[2, bar] mat;
}

transformed data{
  print(foo);
  print(bar);
  print(mat);
}
"

# Example Data
dat_operate <- list(
  foo = c(1,1),
  mat = matrix(1:4, nrow = 2, ncol = 2)
)

# Run program
out <- run_example(operate, dat_operate)
#> Error in stanc(file = file, model_code = model_code, model_name = model_name, : 0
#> 
#> Syntax error in 'string', line 3, column 12 to column 15, parsing error:
#> 
#> Cannot assign to variables in the `data` or `parameters` blocks; expected ';'
#> after variable declaration.

Approach 2: operate and assign in transformed data (no error but empty matrix)

operate_tr <- "
data{ 
  array[2] int foo;
}

transformed data{
  int bar = sum(foo);
  matrix[2, bar] mat; // empty because values not read in data block

  print(foo);
  print(bar);
  print(mat);
}
"

# Example Data
dat_operate_tr <- list(
  foo = c(1,1),
  mat = matrix(1:4, nrow = 2, ncol = 2)
)

# Run program
out <- run_example(operate_tr, dat_operate_tr)
#> [1,1]
#> 2
#> [[nan,nan],[nan,nan]]

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM