How to sample from a distribution given the CDF in Python

Question

I would like to draw samples from a probability distribution with CDF 1 - e^(-x^2).

Is there a way in Python, SciPy, etc. to sample from a probability distribution given only its CDF?

Answer 1

To create a custom random variable class from a CDF, you can subclass scipy.stats.rv_continuous and override rv_continuous._cdf. SciPy then derives the corresponding PDF and other statistical information about your distribution numerically, e.g.

import matplotlib.pyplot as plt
import numpy as np
from scipy import stats

class MyRandomVariableClass(stats.rv_continuous):
    def __init__(self, xtol=1e-14, seed=None):
        super().__init__(a=0, xtol=xtol, seed=seed)

    def _cdf(self, x):
        return 1-np.exp(-x**2)


if __name__ == "__main__":
    my_rv = MyRandomVariableClass()

    # draw samples from the distribution
    samples = my_rv.rvs(size=1000)

    # plot histogram of samples
    fig, ax1 = plt.subplots()
    ax1.hist(samples, bins=50)

    # plot PDF and CDF of distribution
    pts = np.linspace(0, 5)
    ax2 = ax1.twinx()
    ax2.set_ylim(0,1.1)
    ax2.plot(pts, my_rv.pdf(pts), color='red')
    ax2.plot(pts, my_rv.cdf(pts), color='orange')

    fig.tight_layout()
    plt.show()
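
As a quick sanity check on the claim that the PDF is derived from the CDF, the sketch below compares the numerically derived PDF with the analytic derivative of 1 - e^(-x^2), which is 2x e^(-x^2). The class name SquaredExponentialCDF and the grid of test points are illustrative choices, not part of the original answer. The same CDF is that of a Weibull distribution with shape 2, so scipy.stats.weibull_min is used as an extra cross-check.

```python
import numpy as np
from scipy import stats


class SquaredExponentialCDF(stats.rv_continuous):
    """Distribution on [0, inf) defined only by its CDF, 1 - exp(-x**2)."""

    def _cdf(self, x):
        return 1.0 - np.exp(-x**2)


# a=0 sets the lower bound of the support; the name is a hypothetical label.
rv = SquaredExponentialCDF(a=0, name="squared_exponential_cdf")

# Evaluate away from the endpoint x = 0, well inside the support.
pts = np.linspace(0.1, 3.0, 30)

numeric_pdf = rv.pdf(pts)                      # derived by SciPy from _cdf
analytic_pdf = 2.0 * pts * np.exp(-pts**2)     # d/dx [1 - exp(-x**2)]
reference_pdf = stats.weibull_min(c=2).pdf(pts)  # Weibull with shape 2

max_abs_error = np.max(np.abs(numeric_pdf - analytic_pdf))
print(f"max |numeric - analytic| = {max_abs_error:.2e}")
```

If the distribution you need already exists in scipy.stats, as it happens to here, using the built-in class is likely to be faster than a custom subclass; the subclass approach is mainly useful when only the CDF is known.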

Answer 2

Inverse Transform Sampling

To add to Heike's solution, you could also use inverse transform sampling: draw a uniform random number on [0, 1) and map it through the inverse of the CDF:

import math, random
import matplotlib.pyplot as plt

def inverse_cdf(y):
    # Solve y = 1 - exp(-x**2) for x, which gives x = sqrt(-log(1 - y))
    return math.sqrt(-math.log(1 - y))

def sample_distribution():
    uniform_random_sample = random.random()
    return inverse_cdf(uniform_random_sample)

x = [sample_distribution() for i in range(10000)]
plt.hist(x, bins=50)
plt.show()
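
For larger sample sizes, the same idea can be vectorized with NumPy. The following is a minimal sketch, assuming the closed-form inverse x = sqrt(-ln(1 - u)) obtained by solving u = 1 - e^(-x^2) for x; the seed, the sample size, and the comparison points are arbitrary choices for illustration.

```python
import numpy as np


def inverse_cdf(u):
    """Closed-form inverse of F(x) = 1 - exp(-x**2) for 0 <= u < 1."""
    # log1p(-u) computes log(1 - u) accurately when u is close to 0.
    return np.sqrt(-np.log1p(-u))


rng = np.random.default_rng(seed=0)  # seed chosen arbitrarily for reproducibility
u = rng.random(100_000)              # uniform draws on [0, 1)
samples = inverse_cdf(u)

# Compare the empirical CDF of the samples with the target CDF at a few points.
for x in (0.5, 1.0, 1.5):
    empirical = np.mean(samples <= x)
    target = 1.0 - np.exp(-x**2)
    print(f"x={x}: empirical={empirical:.3f}, target={target:.3f}")
```

The empirical and target values should agree to within sampling noise, which is a simple way to confirm that the inverse was derived correctly.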

How SciPy Does It

I was curious to see how this works in SciPy, too. It looks like it does something very similar to the above. From the SciPy docs:

The default method _rvs relies on the inverse of the cdf, _ppf, applied to a uniform random variate. In order to generate random variates efficiently, either the default _ppf needs to be overwritten (e.g. if the inverse cdf can be expressed in an explicit form) or a sampling method needs to be implemented in a custom _rvs method.

And based on the SciPy source code, the _ppf (i.e., the inverse of the CDF) does in fact appear to be approximated numerically if it is not specified explicitly. Very cool!
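
To make the two paths concrete, here is a small sketch that defines the distribution twice: once with only _cdf, so that ppf falls back to numerical inversion, and once with an explicit _ppf override. The class names NumericInverse and ClosedFormInverse are hypothetical, and the closed-form inverse sqrt(-ln(1 - q)) is an assumption derived by hand from the CDF in the question.

```python
import numpy as np
from scipy import stats


class NumericInverse(stats.rv_continuous):
    """Only _cdf is given, so SciPy inverts it numerically inside ppf/rvs."""

    def _cdf(self, x):
        return 1.0 - np.exp(-x**2)


class ClosedFormInverse(NumericInverse):
    """Same distribution, with the inverse CDF supplied explicitly."""

    def _ppf(self, q):
        return np.sqrt(-np.log1p(-q))


numeric_rv = NumericInverse(a=0, name="numeric_inverse")
closed_rv = ClosedFormInverse(a=0, name="closed_form_inverse")

# Both versions should return the same quantiles up to solver tolerance.
q = np.array([0.1, 0.5, 0.9, 0.99])
ppf_numeric = numeric_rv.ppf(q)
ppf_closed = closed_rv.ppf(q)
print(ppf_numeric)
print(ppf_closed)

# Sampling through the explicit _ppf avoids one root-find per variate.
samples = closed_rv.rvs(size=10_000, random_state=0)
```

When the inverse has a closed form, the override is usually the better choice for large sample sizes, since the numerical fallback solves a root-finding problem for every draw.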

2 answers

solution1
4 ACCEPTED 2020-03-06 09:07:54

solution2
2 2021-03-14 22:55:55
