
Implementing an exponential generalized linear model in Stan/RStan

I have the following model:

*(image: specification of the exponential GLM)*

I wish to learn how to implement this model.

Research and Attempt

I've had a look at the following Poisson GLM as an example:

*(image: the Poisson GLM, y_i ~ Poisson(mu_i) with log mu_i = x_i'beta)*

```{stan output.var="PoissonGLMQR"}
  data{
    int<lower=1> n;     // number of observations
    int<lower=1> p;     // number of predictors
    matrix[n, p] x;     // design matrix
    int<lower=0> y[n];  // responses
  }
  
  transformed data{
    matrix [n, p] Q_ast;
    matrix [p, p] R_ast;
    matrix [p, p] R_ast_inverse;
    Q_ast = qr_Q(x)[,1:p] * sqrt(n - 1.0);
    R_ast = qr_R(x)[1:p,] / sqrt(n - 1.0);
    R_ast_inverse = inverse(R_ast);
  }
  
  parameters{
    vector[p] theta;  // regression coefficients for predictors
  }
  
  transformed parameters{
    vector[p] beta;
    vector[n] mu;
    mu = exp(Q_ast*theta);
    beta = R_ast_inverse * theta;
  }
  
  model{
     y  ~ poisson(mu);
  }
```

I understand that this example has used QR reparameterization (see section 9.2 of the Stan reference manual). However, since I'm new to Stan, I'm finding this quite intimidating, and I'm unsure of how to implement the exponential GLM in the same way. Does one even need to use the QR reparameterization for the exponential case?
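
For reference, the algebra behind the QR reparameterization is short; the following summarises the identity the Stan manual uses (with the same sqrt(n - 1) scaling as in the code above):

```latex
% Thin QR decomposition of the design matrix: X = Q R
% Scaled as in the Stan code: Q^* = Q \sqrt{n-1}, \; R^* = R / \sqrt{n-1}, so Q^* R^* = X
\eta = X \beta = Q^* R^* \beta = Q^* \theta,
\qquad \theta = R^* \beta,
\qquad \beta = (R^*)^{-1} \theta
```

Because the columns of Q are orthogonal, the components of theta tend to be far less correlated in the posterior than those of beta, which is why sampling in theta and recovering beta afterwards can help the sampler.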

Anyway, my naive attempt at the exponential GLM is the following adaptation of the Poisson GLM code:

```{stan output.var="ExponentialGLM"}
  data{
    int<lower=1> n;     // number of observations
    int<lower=1> p;     // number of predictors
    matrix[n, p] x;     // design matrix
    real<lower=0> y[n];  // responses
  }
  
  transformed data{
    matrix [n, p] Q_ast;
    matrix [p, p] R_ast;
    matrix [p, p] R_ast_inverse;
    Q_ast = qr_Q(x)[,1:p] * sqrt(n - 1.0);
    R_ast = qr_R(x)[1:p,] / sqrt(n - 1.0);
    R_ast_inverse = inverse(R_ast);
  }
  
  parameters{
    vector[p] theta;  // regression coefficients for predictors
  }
  
  transformed parameters{
    vector[p] beta;
    vector[n] mu;
    mu = exp(Q_ast*theta);
    beta = (R_ast_inverse * theta);
  }
  
  model{
     y  ~ exponential(mu);
  }
```

But I'm totally unsure whether this is even the right way to go about coding such a model in Stan; all I did was adapt the Poisson GLM example to the aforementioned exponential GLM.

I am seeking a demonstration of the exponential GLM and clarifications regarding my misunderstandings above.

Sample Data

    conc  time
 1   6.1   0.8
 2   4.2   3.5
 3   0.5  12.4
 4   8.8   1.1
 5   1.5   8.9
 6   9.2   2.4
 7   8.5   0.1
 8   8.7   0.4
 9   6.7   3.5
10   6.5   8.3
11   6.3   2.6
12   6.7   1.5
13   0.2  16.6
14   8.7   0.1
15   7.5   1.3

How about something like this?

  1. We start by reading in the sample data.

     df <- read.table(text = " conc time
     1   6.1   0.8
     2   4.2   3.5
     3   0.5  12.4
     4   8.8   1.1
     5   1.5   8.9
     6   9.2   2.4
     7   8.5   0.1
     8   8.7   0.4
     9   6.7   3.5
     10  6.5   8.3
     11  6.3   2.6
     12  6.7   1.5
     13  0.2  16.6
     14  8.7   0.1
     15  7.5   1.3", header = TRUE)
  2. We define the Stan model as a simple exponential model with a log link.

     model <- "
     data {
       int N;            // Number of observations
       int K;            // Number of parameters
       real y[N];        // Response
       matrix[N, K] X;   // Model matrix
     }

     parameters {
       vector[K] beta;   // Model parameters
     }

     transformed parameters {
       vector[N] lambda;
       lambda = exp(-X * beta);   // log-link
     }

     model {
       // Priors
       beta[1] ~ cauchy(0, 10);
       for (i in 2:K)
         beta[i] ~ cauchy(0, 2.5);

       y ~ exponential(lambda);
     }
     "

    Here I assume (standard) weakly informative Cauchy priors on the beta model parameters. (Why the rate enters as exp(-X * beta) is derived in the short note after this list.)

  3. We now fit the model to the data.

     library(rstan)
     options(mc.cores = parallel::detectCores())
     rstan_options(auto_write = TRUE)

     X <- model.matrix(time ~ conc, data = df)

     model.stan <- stan(
         model_code = model,
         data = list(
             N = nrow(df),
             K = ncol(X),
             y = df$time,
             X = X),
         iter = 4000)
  4. To compare point estimates, we also fit a Gamma GLM to the data using glm. (The exponential distribution is a Gamma distribution with shape parameter 1, so a Gamma GLM with a log link fits the same mean function.)

     model.glm <- glm(time ~ conc, data = df, family = Gamma(link = "log"));
  5. The Stan model parameter estimates are

    summary(model.stan, "beta")$summary
    #              mean     se_mean         sd       2.5%        25%        50%
    #beta[1]  2.9371457 0.016460000 0.62017934  1.8671652  2.5000356  2.8871936
    #beta[2] -0.3099799 0.002420744 0.09252423 -0.5045937 -0.3708111 -0.3046333
    #               75%      97.5%    n_eff     Rhat
    #beta[1]  3.3193334  4.3078478 1419.629 1.002440
    #beta[2] -0.2461567 -0.1412095 1460.876 1.002236

    The glm parameter estimates are

    summary(model.glm)$coef
    #              Estimate Std. Error   t value     Pr(>|t|)
    #(Intercept)  2.8211487 0.54432115  5.182875 0.0001762685
    #conc        -0.3013355 0.08137986 -3.702827 0.0026556952

    We see good agreement between the Stan posterior means and the Gamma GLM parameter estimates; a quick numerical comparison of the fitted means is sketched after this list.

  6. We plot parameter estimates from both the Stan and glm models.

     library(tidyverse)

     as.data.frame(summary(model.stan, "beta")$summary) %>%
         rownames_to_column("Parameter") %>%
         ggplot(aes(x = `50%`, y = Parameter)) +
         geom_point(size = 3) +
         geom_segment(aes(x = `2.5%`, xend = `97.5%`, yend = Parameter), size = 1) +
         geom_segment(aes(x = `25%`, xend = `75%`, yend = Parameter), size = 2) +
         labs(x = "Median (plus/minus 95% and 50% CIs)") +
         geom_point(
             data = data.frame(summary(model.glm)$coef,
                               row.names = c("beta[1]", "beta[2]")) %>%
                 rownames_to_column("Parameter"),
             aes(x = Estimate, y = Parameter),
             colour = "red", size = 4, shape = 1)

    *(plot: posterior medians with 50% and 95% intervals for beta[1] and beta[2], with the glm estimates overlaid)*

    glm estimates are shown in red.
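
A note on the sign of the linear predictor in step 2: Stan's exponential distribution is parameterised by the rate lambda, and the mean of an Exponential(lambda) variable is 1/lambda, so a log link on the mean implies a negated linear predictor on the rate scale:

```latex
% Exponential GLM with a log link on the mean
y_i \sim \mathrm{Exponential}(\lambda_i), \qquad \mathbb{E}[y_i] = 1 / \lambda_i
% Log link: \log \mathbb{E}[y_i] = x_i^\top \beta, hence
\lambda_i = 1 / \mathbb{E}[y_i] = \exp(-x_i^\top \beta)
```

This is why the model uses lambda = exp(-X * beta) rather than exp(X * beta).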
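
As a follow-up to step 5, here is a minimal sketch (assuming model.stan, model.glm and X exist as created above) comparing the posterior-median mean function from the Stan model with the fitted means from glm:

```r
## Minimal sketch: compare fitted means from the Stan model and from glm.
## Assumes model.stan, model.glm and X exist as created in the steps above.
beta.draws <- as.matrix(model.stan, pars = "beta")  # posterior draws, one column per coefficient
eta <- X %*% t(beta.draws)                          # N x draws matrix of linear predictors
mu.stan <- apply(exp(eta), 1, median)               # posterior median of E[y] = exp(X * beta)
cbind(stan = mu.stan, glm = fitted(model.glm))      # side-by-side comparison of fitted means
```

The two columns should be close, mirroring the agreement of the coefficient estimates above.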


Update (Fit Stan model using QR re-parametrisation)

  1. Stan model with QR re-parametrisation.

     model.QR <- "
     data {
       int N;            // Number of observations
       int K;            // Number of parameters
       real y[N];        // Response
       matrix[N, K] X;   // Model matrix
     }

     transformed data {
       matrix[N, K] Q;
       matrix[K, K] R;
       matrix[K, K] R_inverse;
       Q = qr_Q(X)[, 1:K];
       R = qr_R(X)[1:K, ];
       R_inverse = inverse(R);
     }

     parameters {
       vector[K] theta;
     }

     transformed parameters {
       vector[N] lambda;
       lambda = exp(-Q * theta);   // log-link
     }

     model {
       y ~ exponential(lambda);
     }

     generated quantities {
       vector[K] beta;   // Model parameters
       beta = R_inverse * theta;
     }
     "

    In the QR decomposition we have lambda = exp(-X * beta) = exp(-Q * R * beta) = exp(-Q * theta), where theta = R * beta and therefore beta = R^-1 * theta.

    Note that we do not specify any priors on theta; in Stan these default to flat (i.e. uniform) priors.

  2. Fit Stan model to the data.

     library(rstan)
     options(mc.cores = parallel::detectCores())
     rstan_options(auto_write = TRUE)

     X <- model.matrix(time ~ conc, data = df)

     model.stan.QR <- stan(
         model_code = model.QR,
         data = list(
             N = nrow(df),
             K = ncol(X),
             y = df$time,
             X = X),
         iter = 4000)
  3. Parameter estimates

    summary(model.stan.QR, "beta")$summary
    #              mean     se_mean         sd       2.5%        25%        50%
    #beta[1]  2.9637547 0.009129250 0.64383609  1.8396681  2.5174800  2.9194682
    #beta[2] -0.3134252 0.001321584 0.09477156 -0.5126905 -0.3740475 -0.3093937
    #               75%      97.5%    n_eff      Rhat
    #beta[1]  3.3498585  4.3593912 4973.710 0.9998545
    #beta[2] -0.2478029 -0.1395686 5142.408 1.0003236

    You can see that the parameter estimates from the two Stan models (with and without QR re-parametrisation) agree very well.

    If you were to ask me whether the QR re-parametrisation makes a (big/any) difference, I'd say "probably not in this case". Andrew Gelman and others have often emphasised that using even very weakly informative priors helps with convergence, and that these should be preferred over flat (uniform) priors. I would always try to use weakly informative priors on all parameters and start with a model without QR re-parametrisation; if convergence is poor, I would then try to optimise the model in a next step.
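
If you want to convince yourself of the identities behind the re-parametrisation numerically, here is a minimal sketch using base R's qr() on the model matrix from above (Stan's qr_Q/qr_R and R's qr.Q/qr.R may differ in sign conventions, but each pair satisfies X = Q * R):

```r
## Minimal sketch: numeric check of the QR identities used above.
## Assumes X exists as created by model.matrix(time ~ conc, data = df).
QR <- qr(X)
Q  <- qr.Q(QR)  # N x K matrix with orthonormal columns
R  <- qr.R(QR)  # K x K upper-triangular matrix

all.equal(X, Q %*% R, check.attributes = FALSE)  # X = Q R

beta  <- c(2.8, -0.3)      # arbitrary test coefficients (hypothetical values)
theta <- drop(R %*% beta)  # theta = R beta

# Same linear predictor either way, and beta is recovered as R^-1 theta:
all.equal(drop(X %*% beta), drop(Q %*% theta), check.attributes = FALSE)
all.equal(beta, drop(solve(R) %*% theta), check.attributes = FALSE)
```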
