
How to use brms (R package) to generate the Stan code I need to reproduce the model estimation in pystan?

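I have developed a pipeline that estimates models with the R package brms, and now I need to convert it to Python. As I understand it, the closest thing to brms in Python is pystan, where I have to write my model in Stan syntax. I would like to know whether there is a brms function that generates Stan code that can be used as the model_code argument of pystan.StanModel in Python. I tried using the code generated by the make_stancode function, but it did not work.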

This is the code generated by make_stancode:

life_span_code = """
// generated with brms 2.10.0
functions {

  /* compute monotonic effects
   * Args:
   *   scale: a simplex parameter
   *   i: index to sum over the simplex
   * Returns:
   *   a scalar between 0 and 1
   */
  real mo(vector scale, int i) {
    if (i == 0) {
      return 0;
    } else {
      return rows(scale) * sum(scale[1:i]);
    }
  }
}
data {
  int<lower=1> N;  // number of observations
  vector[N] Y;  // response variable
  int<lower=1> Ksp;  // number of special effects terms
  int<lower=1> Imo;  // number of monotonic variables
  int<lower=2> Jmo[Imo];  // length of simplexes
  // monotonic variables
  int Xmo_1[N];
  // prior concentration of monotonic simplexes
  vector[Jmo[1]] con_simo_1;
  int prior_only;  // should the likelihood be ignored?
}
transformed data {
}
parameters {
  // temporary intercept for centered predictors
  real Intercept;
  // special effects coefficients
  vector[Ksp] bsp;
  // simplexes of monotonic effects
  simplex[Jmo[1]] simo_1;
  real<lower=0> sigma;  // residual SD
}
transformed parameters {
}
model {
  // initialize linear predictor term
  vector[N] mu = Intercept + rep_vector(0, N);
  for (n in 1:N) {
    // add more terms to the linear predictor
    mu[n] += (bsp[1]) * mo(simo_1, Xmo_1[n]);
  }
  // priors including all constants
  target += student_t_lpdf(Intercept | 3, 65, 12);
  target += dirichlet_lpdf(simo_1 | con_simo_1);
  target += student_t_lpdf(sigma | 3, 0, 12)
    - 1 * student_t_lccdf(0 | 3, 0, 12);
  // likelihood including all constants
  if (!prior_only) {
    target += normal_lpdf(Y | mu, sigma);
  }
}
generated quantities {
  // actual population-level intercept
  real b_Intercept = Intercept;
}
"""

This is the code I am using in Python:

## Libraries
import pandas as pd
import pystan
import numpy as np
import random as rd

## Build data for life span example with ordered factors

income_options =  ["below_20", "20_to_40", "40_to_100", "greater_100"]
income_mean = [30, 60, 70, 75]
income_factor = [0, 1, 2, 3]

dict_data = {'income_options' : income_options,
             'income_mean' : income_mean,
             'income_factor' :  income_factor}

map_df = pd.DataFrame(dict_data)

income_rep = rd.sample(income_factor*25, 100)

rand_inc = np.random.normal(loc = 0, scale = 1, size = 100).tolist()


data_df = pd.DataFrame({'income_factor': income_rep,
                        'rand_inc' : rand_inc})

data_df = pd.merge(data_df, map_df, on = 'income_factor')

data_df['ls'] = data_df['income_mean'] + data_df['rand_inc']

N = data_df.shape[0]
Y = data_df['ls'].tolist()
K = 1
X = [1]*N
Ksp = 1
Imo = 1
Xmo_1 = data_df['income_factor'].tolist()
Jmo = len(data_df['income_factor'].unique().tolist())-1
con_simo_1 = [1]*Jmo
prior_only = 0


life_span_data = {'N' : N,
                  'Y' : Y,
                  'K' : K,
                  'X' : X,
                  'Ksp' : Ksp,
                  'Imo' : Imo,
                  'Xmo_1' : Xmo_1,
                  'Jmo' : Jmo,
                  'con_simo_1' : con_simo_1,
                  'prior_only' : prior_only}

life_span_sm = pystan.StanModel(model_code = life_span_code)
life_span_fit = life_span_sm.sampling(data= life_span_data, iter=1000, chains=2)

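This is the error I get:

RuntimeError: Exception: mismatch in number dimensions declared and found in context; processing stage=data initialization; variable name=Jmo; dims declared=(1); dims found=() (in 'unknown file name' at line 24)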
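
The message reports a shape mismatch: the data block declares `Jmo` as a one-dimensional array (`int<lower=2> Jmo[Imo];`, hence `dims declared=(1)`), while the value supplied from Python had no dimensions at all (`dims found=()`). A minimal sketch of that difference, using NumPy's `np.shape` as a stand-in for how dimensions are counted (an assumption for illustration, not pystan's actual internals):

```python
import numpy as np

# A bare Python int has no dimensions; a one-element list has one.
scalar_jmo = 3
array_jmo = [3]

print(np.shape(scalar_jmo))  # ()
print(np.shape(array_jmo))   # (1,)
```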

Thanks for any help.

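It turns out the problem was not in the model code generated by brms, but in the way I defined the values passed to the model. In particular, Jmo must be a list rather than an integer.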

N = data_df.shape[0]
Y = data_df['ls'].tolist()
K = 1
X = [1]*N
Ksp = 1
Imo = 1
Xmo_1 = data_df['income_factor'].tolist()

## The following two lines have changed
Jmo = [len(data_df['income_factor'].unique().tolist())-1]
con_simo_1 = [1, 1, 1]
## End of changes

prior_only = 0

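The rest of the code is the same. I would still appreciate clarification of why some of these values can be declared as integers while others can only be declared as lists.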
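
On that last point, a likely explanation is that each entry in the data dictionary has to match the shape declared in the Stan `data` block. `int<lower=1> N;` is a scalar, so a plain int works; `int<lower=2> Jmo[Imo];` is an array with `Imo` elements, so it needs a sequence even when `Imo` is 1; and `vector[Jmo[1]] con_simo_1;` needs exactly `Jmo[1]` values. The sketch below is one way to check this before sampling. The `declared_shapes` table and the `find_shape_mismatches` helper are hypothetical names, written by hand for this model; they are not part of pystan or brms.

```python
import numpy as np

def find_shape_mismatches(data, declared_shapes):
    """Return {name: (declared, found)} for entries whose shape differs.

    Hypothetical helper: declared_shapes is transcribed by hand from
    the Stan data block; nothing here is read from the model itself.
    """
    mismatches = {}
    for name, declared in declared_shapes.items():
        found = np.shape(data[name])
        if found != declared:
            mismatches[name] = (declared, found)
    return mismatches

# Small stand-in data set (4 observations, income levels 0..3).
Xmo_1 = [0, 1, 2, 3]
Imo = 1
Jmo = [len(set(Xmo_1)) - 1]      # array of length Imo, here [3]
con_simo_1 = [1.0] * Jmo[0]      # sized from Jmo instead of hard-coded

data = {'N': len(Xmo_1), 'Imo': Imo, 'Xmo_1': Xmo_1,
        'Jmo': Jmo, 'con_simo_1': con_simo_1}

# Shapes transcribed from the data block of the generated Stan code.
declared_shapes = {'N': (), 'Imo': (), 'Xmo_1': (data['N'],),
                   'Jmo': (Imo,), 'con_simo_1': (Jmo[0],)}

print(find_shape_mismatches(data, declared_shapes))   # {}

# Passing Jmo as a bare int reproduces the mismatch from the error.
bad = dict(data, Jmo=3)
print(find_shape_mismatches(bad, declared_shapes))    # {'Jmo': ((1,), ())}
```

Deriving `con_simo_1` from `Jmo[0]` keeps the two in step if the number of income levels changes, which the hard-coded `[1, 1, 1]` would not.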

Thanks again.

