I have below model for which I seek estimation of parameters using pymc3
import pandas as pd
import pymc3 as pm
import arviz as arviz
myData = pd.DataFrame.from_dict({
'Unnamed: 0': {
0: 1, 1: 2, 2: 3, 3: 4, 4: 5, 5: 6, 6: 7, 7: 8, 8: 9, 9: 10,
10: 11, 11: 12, 12: 13, 13: 14, 14: 15, 15: 16, 16: 17, 17: 18, 18: 19, 19: 20,
20: 21, 21: 22, 22: 23, 23: 24, 24: 25, 25: 26, 26: 27, 27: 28, 28: 29, 29: 30,
30: 31, 31: 32, 32: 33, 33: 34, 34: 35, 35: 36, 36: 37, 37: 38},
'y': {
0: 0.0079235409492941, 1: 0.0086530073429249, 2: 0.0297400780486734, 3: 0.0196358416326437, 4: 0.0023902064076204, 5: 0.0258055591736283, 6: 0.17394835142698, 7: 0.156463554455613, 8: 0.329388185725557, 9: 0.0076443508881763,
10: 0.0162081480398152, 11: 0.0, 12: 0.0015759139941696, 13: 0.420025972703085, 14: 0.0001226236519444, 15: 0.133061480234834, 16: 0.565454216154227, 17: 0.0002819734812997, 18: 0.000559715156383, 19: 0.0270686389659072,
20: 0.918300537689865, 21: 7.8262468302e-06, 22: 0.0073241434191945, 23: 0.0, 24: 0.0, 25: 0.0, 26: 0.0, 27: 0.0, 28: 0.0, 29: 0.0,
30: 0.174071274611405, 31: 0.0432109713717948, 32: 0.0544400838264943, 33: 0.0, 34: 0.0907049925221286, 35: 0.616680102647887, 36: 0.0, 37: 0.0},
'x': {
0: 23.8187587698947, 1: 15.9991138359515, 2: 33.6495930512881, 3: 28.555818797764, 4: -52.2967967248258, 5: -91.3835208788233, 6: -73.9830692708321, 7: -5.16901145289629, 8: 29.8363012310241, 9: 10.6820057903939,
10: 19.4868517164395, 11: 15.4499668436458, 12: -17.0441644773509, 13: 10.7025053739577, 14: -8.6382953428539, 15: -32.8892974839165, 16: -15.8671863161348, 17: -11.237248036145, 18: -7.37978020066205, 19: -3.33500586334862,
20: -4.02629933182873, 21: -20.2413384726948, 22: -54.9094885578775, 23: -48.041459120976, 24: -52.3125732905322, 25: -35.6269065970458, 26: -62.0296155423529, 27: -49.0825017152659, 28: -73.0574478287598, 29: -50.9409090127938,
30: -63.4650928035253, 31: -55.1263264283842, 32: -52.2841103768755, 33: -61.2275334149805, 34: -74.2175990067417, 35: -68.2961107804698, 36: -76.6834643609286, 37: -70.16769103228}
})
with pm.Model() as myModel :
beta0 = pm.Normal('intercept', 0, 1)
beta1 = pm.Normal('x', 0, 1)
mu = beta0 + beta1 * myData['x'].values
pm.Bernoulli('obs', p = pm.invlogit(mu), observed = myData['y'].values)
with myModel :
calc = pm.sample(50000, tune = 10000, step = pm.Metropolis(), random_seed = 1000)
arviz.summary(calc, round_to = 10)
mean sd hdi_3% hdi_97% mcse_mean mcse_sd ess_bulk ess_tail r_hat
intercept -2.537501 0.599667 -3.707061 -1.450243 0.004375 0.003118 18893.344191 22631.772985 1.000070
x 0.033750 0.024314 -0.007871 0.081619 0.000181 0.000133 18550.620475 20113.739639 1.000194
Now I changed above model to this,
mu = beta0 + beta1 * myData['x'].values * 0
With this change I get below result,
mean sd hdi_3% hdi_97% mcse_mean mcse_sd ess_bulk ess_tail r_hat
intercept -2.690874 0.546570 -3.698465 -1.643091 0.003611 0.002565 22980.471424 24806.935727 1.000036
x -0.013861 1.003612 -1.916176 1.826709 0.006874 0.005175 21336.662537 23299.680306 1.000084
I wonder if above estimate is correct. Should not I expect very small estimate for the coefficient beta1
? I see hardly any change for this estimate except just change in sign.
Any pointer is highly appreciated.
" hardly any change for this estimate "
Seems like you are ignoring the sd
, which has a strong change and is behaving as expected. That is, the first version yields 0.034 ± 0.024
(weakly positive); whereas the second correctly reverts to the prior with -0.014 ± 1.00
.
Looking at the input data, none of this seems surprising:
The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.