
Using multiprocessing in emcee library inside a class

I have tried to use the emcee library to run a Markov chain Monte Carlo (MCMC) sampler inside a class, and to make the multiprocessing module work with it as well, but after running the following test code:

import numpy as np
import emcee
import scipy.optimize as op
# Choose the "true" parameters.
m_true = -0.9594
b_true = 4.294
f_true = 0.534

# Generate some synthetic data from the model.
N = 50
x = np.sort(10*np.random.rand(N))
yerr = 0.1+0.5*np.random.rand(N)
y = m_true*x+b_true
y += np.abs(f_true*y) * np.random.randn(N)
y += yerr * np.random.randn(N)

class modelfit():
    def __init__(self):
        self.x = x
        self.y = y
        self.yerr = yerr
        self.m = -0.6
        self.b = 2.0
        self.f = 0.9
    def get_results(self):
        def func(a):
            model = a[0]*self.x + a[1]
            inv_sigma2 = 1.0/(self.yerr**2 + model**2*np.exp(2*a[2]))
            return 0.5*(np.sum((self.y-model)**2*inv_sigma2 + np.log(inv_sigma2)))
        result = op.minimize(func, [self.m, self.b, np.log(self.f)], options={'gtol': 1e-6, 'disp': True})
        m_ml, b_ml, lnf_ml = result["x"]
        return result["x"]
    def lnprior(self, theta):
        m, b, lnf = theta
        if -5.0 < m < 0.5 and 0.0 < b < 10.0 and -10.0 < lnf < 1.0:
            return 0.0
        return -np.inf
    def lnprob(self, theta):
        lp = self.lnprior(theta)
        likelihood = self.lnlike(theta)
        if not np.isfinite(lp):
            return -np.inf
        return lp + likelihood
    def lnlike(self, theta):
        m, b, lnf = theta
        model = m * self.x + b
        inv_sigma2 = 1.0/(self.yerr**2 + model**2*np.exp(2*lnf))
        return -0.5*(np.sum((self.y-model)**2*inv_sigma2 - np.log(inv_sigma2)))
    def run_mcmc(self, nstep):
        ndim, nwalkers = 3, 100
        pos = [self.get_results() + 1e-4*np.random.randn(ndim) for i in range(nwalkers)]
        self.sampler = emcee.EnsembleSampler(nwalkers, ndim, self.lnprob, threads=10)
        self.sampler.run_mcmc(pos, nstep)

test = modelfit()
test.x = x
test.y = y
test.yerr = yerr
test.get_results()
test.run_mcmc(5000)

I got this error message:

File "MCMC_model.py", line 157, in run_mcmc
    self.sampler.run_mcmc(theta0, nstep)
  File "build/bdist.linux-x86_64/egg/emcee/sampler.py", line 157, in run_mcmc
  File "build/bdist.linux-x86_64/egg/emcee/ensemble.py", line 198, in sample
  File "build/bdist.linux-x86_64/egg/emcee/ensemble.py", line 382, in _get_lnprob
  File "build/bdist.linux-x86_64/egg/emcee/interruptible_pool.py", line 94, in map
  File "/vol/aibn84/data2/zahra/anaconda/lib/python2.7/multiprocessing/pool.py", line 558, in get
    raise self._value
cPickle.PicklingError: Can't pickle <type 'instancemethod'>: attribute lookup __builtin__.instancemethod failed

I reckon it has something to do with how I have used multiprocessing inside the class, but I cannot figure out how to keep the structure of my class the way it is and still use multiprocessing.

I would appreciate any tips.

PS: I should mention that the code works perfectly if I remove threads=10 from the last method.

There are a number of SO questions that discuss what's going on:

  1. https://stackoverflow.com/a/21345273/2379433

  2. https://stackoverflow.com/a/28887474/2379433

  3. https://stackoverflow.com/a/21345308/2379433

  4. https://stackoverflow.com/a/29129084/2379433

…including this one, which appears to be your own response to nearly the same question:

  1. https://stackoverflow.com/a/25388586/2379433

However, the difference here is that you are not using multiprocessing directly -- emcee is. Therefore, the pathos.multiprocessing solution (from the links above) is not available to you. Since emcee uses cPickle, you'll have to stick to things that pickle knows how to serialize, and bound instance methods are not among them. Typical workarounds are either to use copy_reg to register the type of object you want to serialize, or to add a __reduce__ method to tell Python how to serialize it. Several of the answers in the links above suggest similar things… but none of them lets you keep the class exactly the way you have written it.
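
For illustration only (this sketch is not part of the original answer, and the helper names _pickle_method and _unpickle_method are made up), a classic Python 2 copy_reg recipe teaches pickle how to handle bound methods, so that cPickle can ship something like self.lnprob to emcee's worker processes:

# Sketch of the copy_reg workaround for Python 2's "can't pickle instancemethod".
# Register this once, before the sampler is created.
import copy_reg
import types

def _pickle_method(method):
    # Decompose a bound method into (function name, instance, class).
    return _unpickle_method, (method.im_func.__name__, method.im_self, method.im_class)

def _unpickle_method(func_name, obj, cls):
    # Re-bind the method to the (already unpickled) instance by name.
    return getattr(obj, func_name)

copy_reg.pickle(types.MethodType, _pickle_method, _unpickle_method)

Note that the instance itself still has to be picklable: in the code above, self.sampler is assigned before run_mcmc is called, and a sampler that holds a multiprocessing pool generally cannot be pickled, so it would likely need to be kept in a local variable instead of on self.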

For the record, you can now create a pathos.multiprocessing pool, and pass it to emcee using the pool argument. However, be aware that the overhead of multiprocessing can actually slow things down, unless your likelihood is particularly time-consuming to compute.
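
As a minimal sketch of that approach (assuming pathos is installed and an emcee version whose EnsembleSampler accepts a pool keyword; the method below is a hypothetical drop-in replacement for the run_mcmc in the question, not code from the original post):

# Sketch only: replace threads=10 with an external pathos pool.
# pathos serializes with dill, which can handle bound methods such as self.lnprob.
from pathos.multiprocessing import ProcessingPool

def run_mcmc(self, nstep):
    ndim, nwalkers = 3, 100
    pos = [self.get_results() + 1e-4*np.random.randn(ndim) for i in range(nwalkers)]
    pool = ProcessingPool(10)  # 10 worker processes
    sampler = emcee.EnsembleSampler(nwalkers, ndim, self.lnprob, pool=pool)
    sampler.run_mcmc(pos, nstep)
    # Attach the sampler only after the run has finished.
    self.sampler = sampler

Keeping the pool and sampler in local variables while the chain runs is deliberate: the workers receive a copy of self when they deserialize self.lnprob, so it is safest to keep the pool out of the instance's attributes during the run.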
