简体   繁体   中英

Getting the parameter names of scipy.stats distributions

I am writing a script to find the best-fitting distribution over a dataset using scipy.stats. I first have a list of distribution names, over which I iterate:

dists = ['alpha', 'anglit', 'arcsine', 'beta', 'betaprime', 'bradford', 'norm']
for d in dists:
    dist = getattr(scipy.stats, d)
    ps = dist.fit(selected_data)
    errors.loc[d,['D-Value','P-Value']] = kstest(selected.tolist(), d, args=ps)
    errors.loc[d,'Params'] = ps

Now, after this loop, I select the minimum D-Value in order to get the best fitting distribution. Now, each distribution returns a specific set of parameters in ps, each with their names and so on (for instance, for 'alpha' it would be alpha, whereas for 'norm' they would be mean and std).

Is there a way to get the names of the estimated parameters in scipy.stats?

Thank you in advance

This code demonstrates the information that ev-br gave in his answer in case anyone else lands here.

>>> from scipy import stats
>>> dists = ['alpha', 'anglit', 'arcsine', 'beta', 'betaprime', 'bradford', 'norm']
>>> for d in dists:
...     dist = getattr(scipy.stats, d)
...     dist.name, dist.shapes
... 
('alpha', 'a')
('anglit', None)
('arcsine', None)
('beta', 'a, b')
('betaprime', 'a, b')
('bradford', 'c')
('norm', None)

I would point out that the shapes parameter yields a value of None for distributions such as the normal which are parameterised by location and scale.

Warren Weckesser and I have developed a more robust solution:

import sys
import scipy.stats

def list_parameters(distribution):
    """List parameters for scipy.stats.distribution.
    # Arguments
        distribution: a string or scipy.stats distribution object.
    # Returns
        A list of distribution parameter strings.
    """
    if isinstance(distribution, str):
        distribution = getattr(scipy.stats, distribution)
    if distribution.shapes:
        parameters = [name.strip() for name in distribution.shapes.split(',')]
    else:
        parameters = []
    if distribution.name in scipy.stats._discrete_distns._distn_names:
        parameters += ['loc']
    elif distribution.name in scipy.stats._continuous_distns._distn_names:
        parameters += ['loc', 'scale']
    else:
        sys.exit("Distribution name not found in discrete or continuous lists.")
    return parameters

The discussion can be found here .

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM