In the Pymc3 example for multilevel linear regression (the example is here , with the radon
data set from Gelman et al.'s (2007)), the intercepts (for different counties) and slopes (for apartment with and without basement) each have a Normal prior. How can I model them together with a multivariate normal prior, so that I can examine the correlation between them?
The hierarchical model given in the example is like this:
with pm.Model() as hierarchical_model:
# Hyperpriors for group nodes
mu_a = pm.Normal('mu_a', mu=0., sd=100**2)
sigma_a = pm.HalfCauchy('sigma_a', 5)
mu_b = pm.Normal('mu_b', mu=0., sd=100**2)
sigma_b = pm.HalfCauchy('sigma_b', 5)
# Intercept for each county, distributed around group mean mu_a
# Above we just set mu and sd to a fixed value while here we
# plug in a common group distribution for all a and b (which are
# vectors of length n_counties).
a = pm.Normal('a', mu=mu_a, sd=sigma_a, shape=n_counties)
# Intercept for each county, distributed around group mean mu_a
b = pm.Normal('b', mu=mu_b, sd=sigma_b, shape=n_counties)
# Model error
eps = pm.HalfCauchy('eps', 5)
radon_est = a[county_idx] + b[county_idx] * data.floor.values
# Data likelihood
radon_like = pm.Normal('radon_like', mu=radon_est, sd=eps, observed=data.log_radon)
hierarchical_trace = pm.sample(2000)
And I'm trying to make some change to the priors
with pm.Model() as correlation_model:
# Hyperpriors for group nodes
mu_a = pm.Normal('mu_a', mu=0., sd=100**2)
mu_b = pm.Normal('mu_b', mu=0., sd=100**2)
# here I want to model a and b together
# I borrowed some code from a multivariate normal model
# but the code does not work
sigma = pm.HalfCauchy('sigma', 5, shape=2)
C_triu = pm.LKJCorr('C_triu', n=2, p=2)
C = T.fill_diagonal(C_triu[np.zeros((2,2), 'int')], 1)
cov = pm.Deterministic('cov', T.nlinalg.matrix_dot(sigma, C, sigma))
tau = pm.Deterministic('tau', T.nlinalg.matrix_inverse(cov))
a, b = pm.MvNormal('mu', mu=(mu_a, mu_b), tau=tau,
shape=(n_counties, n_counties))
# Model error
eps = pm.HalfCauchy('eps', 5)
radon_est = a[county_idx] + b[county_idx] * data.floor.values
# Data likelihood
radon_like = pm.Normal('radon_like', mu=radon_est, sd=eps, observed=data.log_radon)
correlation_trace = pm.sample(2000)
Here is the error message I got:
File "<ipython-input-108-ce400c54cc39>", line 14, in <module>
tau = pm.Deterministic('tau', T.nlinalg.matrix_inverse(cov))
File "/home/olivier/anaconda3/lib/python3.5/site-packages/theano/gof/op.py", line 611, in __call__
node = self.make_node(*inputs, **kwargs)
File "/home/olivier/anaconda3/lib/python3.5/site-packages/theano/tensor/nlinalg.py", line 73, in make_node
assert x.ndim == 2
AssertionError
Clearly I've made some mistakes about the covariance matrix, but I'm new to pymc3
and completely new to theano
so have no idea how to fix it. I gather this should be a rather common use case so maybe there have been some examples on it? I just can't find them.
The full replicable code and data can be seen on the example page (link given above). I didn't include it here because it's too long and also I thought those familiar with pymc3
are very likely already quite familiar with it:)
You forgot to add one line when creating the covariance matrix you miss-specified the shape of the MvNormal. Your model should look something like this:
with pm.Model() as correlation_model:
mu = pm.Normal('mu', mu=0., sd=10, shape=2)
sigma = pm.HalfCauchy('sigma', 5, shape=2)
C_triu = pm.LKJCorr('C_triu', n=2, p=2)
C = tt.fill_diagonal(C_triu[np.zeros((2,2), 'int')], 1.)
sigma_diag = tt.nlinalg.diag(sigma) # this line
cov = tt.nlinalg.matrix_dot(sigma_diag, C, sigma_diag)
tau = tt.nlinalg.matrix_inverse(cov)
ab = pm.MvNormal('ab', mu=mu, tau=tau, shape=(n_counties, 2))
eps = pm.HalfCauchy('eps', 5)
radon_est = ab[:,0][county_idx] + ab[:,1][county_idx] * data.floor.values
radon_like = pm.Normal('radon_like', mu=radon_est, sd=eps, observed=data.log_radon)
trace = pm.sample(2000)
Notice that alternatively, you can evaluate the correlation of the intercept and the slope from the posterior of hierarchical_model
. You can use a frequentist method or build another Bayesian model, that takes as the observed data the result of hierarchical_model
. May be this could be faster.
EDIT
If you want to evaluate the correlation of two variables from the posterior you can do something like.
chain = hierarchical_trace[100:]
x_0 = chain['mu_a']
x_1 = chain['mu_b']
X = np.vstack((x_0, x_1)).T
and then you can run the following model:
with pm.Model() as correlation:
mu = pm.Normal('mu', mu=0., sd=10, shape=2)
sigma = pm.HalfCauchy('sigma', 5, shape=2)
C_triu = pm.LKJCorr('C_triu', n=2, p=2)
C = tt.fill_diagonal(C_triu[np.zeros((2,2), 'int')], 1.)
sigma_diag = tt.nlinalg.diag(sigma)
cov = tt.nlinalg.matrix_dot(sigma_diag, C, sigma_diag)
tau = tt.nlinalg.matrix_inverse(cov)
yl = pm.MvNormal('yl', mu=mu, tau=tau, shape=(2, 2), observed=X)
trace = pm.sample(5000, pm.Metropolis())
You can replace x_0 and x_1 according to your needs. For example you may want to do:
x_0 = np.random.normal(chain['mu_a'], chain['sigma_a'])
x_1 = np.random.normal(chain['mu_b'], chain['sigma_b'])
The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.