I would like to draw samples from a probability distribution with CDF 1 - e^(-x^2).
Is there a method in Python/SciPy/etc. for sampling from a probability distribution given only its CDF?
To create a custom random variable class given a CDF, you could subclass scipy.stats.rv_continuous and override rv_continuous._cdf. SciPy will then derive the corresponding PDF and other statistical information about your distribution numerically, e.g.:
import matplotlib.pyplot as plt
import numpy as np
from scipy import stats


class MyRandomVariableClass(stats.rv_continuous):
    def __init__(self, xtol=1e-14, seed=None):
        # a=0 sets the lower bound of the support
        super().__init__(a=0, xtol=xtol, seed=seed)

    def _cdf(self, x):
        return 1 - np.exp(-x**2)


if __name__ == "__main__":
    my_rv = MyRandomVariableClass()

    # sample distribution
    samples = my_rv.rvs(size=1000)

    # plot histogram of samples
    fig, ax1 = plt.subplots()
    ax1.hist(list(samples), bins=50)

    # plot PDF and CDF of distribution
    pts = np.linspace(0, 5)
    ax2 = ax1.twinx()
    ax2.set_ylim(0, 1.1)
    ax2.plot(pts, my_rv.pdf(pts), color='red')
    ax2.plot(pts, my_rv.cdf(pts), color='orange')

    fig.tight_layout()
    plt.show()
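
As a quick sanity check on the "automatically generated" PDF, here is a minimal sketch that compares what SciPy derives from _cdf with the derivative worked out by hand, 2x e^(-x^2). The class name SquaredExpCDF and the grid of test points are my own choices for illustration, not part of the answer above.

```python
import numpy as np
from scipy import stats


class SquaredExpCDF(stats.rv_continuous):
    """Distribution on [0, inf) defined only through its CDF (hypothetical name)."""

    def _cdf(self, x):
        return 1 - np.exp(-x**2)


rv = SquaredExpCDF(a=0)

# Evaluate away from the boundary at 0 so the numerical derivative is well behaved.
pts = np.linspace(0.1, 3, 30)
numeric_pdf = rv.pdf(pts)                  # derived by SciPy from _cdf
analytic_pdf = 2 * pts * np.exp(-pts**2)   # d/dx of 1 - exp(-x^2), done by hand

max_abs_error = np.max(np.abs(numeric_pdf - analytic_pdf))
print("largest difference between numeric and analytic PDF:", max_abs_error)
```

If the two curves agree closely, the subclass is doing what you expect; if you already know the PDF in closed form, overriding _pdf as well should avoid the numerical differentiation.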
To add on to the solution by Heike, you could use inverse transform sampling to sample via the CDF: draw a uniform random number u in [0, 1) and return the x for which CDF(x) = u.
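import math
import random

import matplotlib.pyplot as plt


def inverse_cdf(y):
    # Computed analytically: solving y = 1 - exp(-x**2) for x
    # gives x = sqrt(log(1 / (1 - y))), which is the expression below.
    return math.sqrt(math.log(-1 / (y - 1)))


def sample_distribution():
    # random.random() returns a float in [0, 1), so y - 1 is never zero
    uniform_random_sample = random.random()
    return inverse_cdf(uniform_random_sample)


x = [sample_distribution() for i in range(10000)]
plt.hist(x, bins=50)
plt.show()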
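
If you need many samples, the same idea can be vectorized with NumPy instead of looping in Python. This is only a sketch: the function name sample_inverse_transform and the seed are assumptions of mine, and the Kolmogorov-Smirnov test at the end is just one way to check the samples against the target CDF.

```python
import numpy as np
from scipy import stats


def sample_inverse_transform(size, seed=None):
    """Draw `size` samples from the CDF 1 - exp(-x^2) by inverse transform sampling."""
    rng = np.random.default_rng(seed)
    u = rng.random(size)               # uniform on [0, 1)
    # x = sqrt(-log(1 - u)); log1p(-u) computes log(1 - u) accurately for small u
    return np.sqrt(-np.log1p(-u))


samples = sample_inverse_transform(100_000, seed=0)

# Compare the empirical distribution of the samples with the target CDF.
ks = stats.kstest(samples, lambda x: 1 - np.exp(-x**2))
print("sample mean:", samples.mean())
print("KS statistic:", ks.statistic)
```

For this distribution the exact mean is sqrt(pi)/2 (about 0.886), so the sample mean should land close to that.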
I was very curious to see how this worked in SciPy, too. It actually looks like it does something very similar to the above. According to the SciPy docs:
The default method _rvs relies on the inverse of the cdf, _ppf, applied to a uniform random variate. In order to generate random variates efficiently, either the default _ppf needs to be overwritten (eg if the inverse cdf can expressed in an explicit form) or a sampling method needs to be implemented in a custom _rvs method.
And based on the SciPy source code, the _ppf (i.e., the inverse of the CDF) does in fact look to be approximated numerically if not specified explicitly. Very cool!
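
Putting the two answers together, here is a rough sketch of what the docs suggest: keep the rv_continuous subclass but override _ppf with the closed-form inverse, so rvs does not have to invert the CDF numerically for every sample. The class name FastSquaredExp and the helper numeric_ppf are made up for illustration; numeric_ppf uses scipy.optimize.brentq only to show what a numerical inversion could look like and is not a copy of SciPy's internal code.

```python
import numpy as np
from scipy import optimize, stats


class FastSquaredExp(stats.rv_continuous):
    """CDF 1 - exp(-x^2) on [0, inf) with an explicit inverse (hypothetical name)."""

    def _cdf(self, x):
        return 1 - np.exp(-x**2)

    def _ppf(self, q):
        # Closed-form inverse of the CDF: x = sqrt(-log(1 - q))
        return np.sqrt(-np.log1p(-q))


def numeric_ppf(q, upper=10.0):
    """Invert the CDF by root finding on [0, upper] (illustrative helper only)."""
    return optimize.brentq(lambda x: 1 - np.exp(-x**2) - q, 0.0, upper)


fast_rv = FastSquaredExp(a=0)

q = 0.5
closed_form = fast_rv.ppf(q)     # uses the overridden _ppf
root_found = numeric_ppf(q)      # same quantile via numerical inversion
print(closed_form, root_found)

# rvs applies _ppf to uniform variates, so sampling now uses the closed form.
samples = fast_rv.rvs(size=10_000, random_state=0)
```

Both approaches should return the same quantile up to the root finder's tolerance; the difference is in cost, since the explicit _ppf is a single vectorized expression.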