scipy.stats attribute `entropy` for continuous distributions works, but calling `_entropy` manually fails

Each continuous distribution in scipy.stats comes with an attribute that calculates its differential entropy: .entropy. The normal distribution (norm) and some others have a closed-form expression for the entropy; the remaining distributions have to rely on numerical integration.

Trying to find out which function the .entropy attribute calls in those cases, I found a function called _entropy in scipy/stats/_distn_infrastructure.py that does so by numerically integrating entr(pdf(x)) with integrate.quad.

But when I try to compare the two approaches (the attribute .entropy vs. numerical integration with the function _entropy), the function raises an error:

AttributeError: 'rv_frozen' object has no attribute '_pdf'

Why does the distribution's attribute .entropy calculate fine, but the function _entropy gives an error?

import numpy as np
from scipy import integrate 
from scipy.stats import norm, johnsonsu
from scipy.special import entr

def _entropy(self, *args): #from _distn_infrastructure.py
    def integ(x):
        val = self._pdf(x, *args)
        return entr(val)

    # upper limit is often inf, so suppress warnings when integrating
    # _a, _b = self._get_support(*args)
    _a, _b = -np.inf, np.inf   
    with np.errstate(over='ignore'):
        h = integrate.quad(integ, _a, _b)[0]

    if not np.isnan(h):
        return h
    else:
        # try with different limits if integration problems
        low, upp = self.ppf([1e-10, 1. - 1e-10], *args)
        if np.isinf(_b):
            upper = upp
        else:
            upper = _b
        if np.isinf(_a):
            lower = low
        else:
            lower = _a
    return integrate.quad(integ, lower, upper)[0]

Using the attribute works fine:

print(johnsonsu(a=2.55,b=2.55).entropy())

returns 0.9503703091220894

But the function does not:

print(_entropy(johnsonsu(a=2.55,b=2.55)))

raises AttributeError: 'rv_frozen' object has no attribute '_pdf', even though johnsonsu does define this method:

def _pdf(self, x, a, b):
    # johnsonsu.pdf(x, a, b) = b / sqrt(x**2 + 1) *
    #                          phi(a + b * log(x + sqrt(x**2 + 1)))
    x2 = x*x
    trm = _norm_pdf(a + b * np.log(x + np.sqrt(x2+1)))
    return b*1.0/np.sqrt(x2+1.0)*trm

Which function does the attribute .entropy actually call in the case of johnsonsu?

You want either johnsonsu(a=2.55,b=2.55).entropy() if you are using frozen distributions or johnsonsu.entropy(a=2.55,b=2.55) otherwise.

The why part of your question is basically that the leading underscore in _entropy means "implementation detail, don't call directly". A longer answer is that frozen distributions wrap a distribution instance (self.dist) and delegate to its _pdf, _pmf, etc. The frozen object itself has no _pdf, so passing it as self to _entropy fails on the call self._pdf(x, *args).
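To make the delegation concrete, here is a minimal sketch that reproduces the integral in two ways. It assumes the integrand is entr(pdf(x)) over the whole real line, as in the question's copy of _entropy. The private route reaches _pdf through the wrapped instance frozen.dist; that is an implementation detail and may differ between SciPy versions. The names h_public and h_private are only illustrative.

```python
import numpy as np
from scipy import integrate
from scipy.special import entr
from scipy.stats import johnsonsu

a, b = 2.55, 2.55
frozen = johnsonsu(a, b)  # frozen distribution holding the shape parameters

# Suppress floating-point warnings that can occur far out in the tails.
with np.errstate(all="ignore"):
    # Public route: integrate entr(pdf(x)) = -pdf(x) * log(pdf(x)).
    h_public = integrate.quad(
        lambda x: entr(frozen.pdf(x)), -np.inf, np.inf
    )[0]

    # Private route (assumption: implementation detail, may change):
    # the frozen object has no _pdf, but the wrapped instance does.
    h_private = integrate.quad(
        lambda x: entr(frozen.dist._pdf(x, a, b)), -np.inf, np.inf
    )[0]

# All three values should agree up to integration tolerance.
print(h_public, h_private, frozen.entropy())
```

The public route is the safer of the two, since it relies only on documented methods.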

EDIT: executing johnsonsu(a=2.55,b=2.55) creates a frozen distribution, rv_frozen. Don't do it unless you want to reuse the instance multiple times: just pass the a, b shape parameters as arguments to the entropy method.
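As a quick check, the two public call styles can be compared side by side. This is only a sketch; the variable names h_direct and h_frozen are illustrative.

```python
from scipy.stats import johnsonsu

# Unfrozen: pass the shape parameters straight to the method.
h_direct = johnsonsu.entropy(a=2.55, b=2.55)

# Frozen: fix the shape parameters once, then reuse the instance.
dist = johnsonsu(a=2.55, b=2.55)
h_frozen = dist.entropy()

# Both calls should report the same differential entropy.
print(h_direct, h_frozen)
```

Freezing only pays off when the same parameter set is reused for several calls (pdf, cdf, entropy, and so on).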
