Median absolute deviation from numpy ndarray

I am working with a 4D numpy array and compute the statistics mean, median, and std along the third dimension (axis=2) of the array, as follows:

import numpy as np
input_shape = (1, 10, 4)
n_sample = 20
X = np.random.uniform(0, 1, (n_sample,) + input_shape)
X.shape
(20, 1, 10, 4)

Then I compute the mean, median, and standard deviation this way:

sta_fuc = (np.mean, np.median, np.std)
stat = np.concatenate([func(X, axis=2, keepdims=True) for func in sta_fuc], axis=2)

so that:

stat.shape
(20, 1, 3, 4)

which holds the mean, median, and std values along that dimension.

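But later I wanted to add the mean absolute deviation (mad) of that column, so that the statistics are (mean, median, std, mad), but it looks like numpy does not provide a function for this. How can I add mad to my statistics?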

EDIT

Regarding the first answer, I used the function defined there, i.e.:

def mad(arr, axis=None, keepdims=True):
    median = np.median(arr, axis=axis, keepdims=True)
    mad = np.median(np.abs(arr-median, axis=axis, keepdims=keepdims),
                    axis=axis, keepdims=keepdims)
    return mad

Adding mad to the statistics then raises an error, as follows:

sta_fuc = (np.mean, np.median, np.std, mad)
stat = np.concatenate([func(X, axis=2, keepdims=True) for func in sta_fuc], axis=2)

---------------------------------------------------------------------------

TypeError                                 Traceback (most recent call last)

<ipython-input-22-dab51665f952> in <module>()
      1 sta_fuc = (np.mean, np.median, np.std, mad)
----> 2 stat = np.concatenate([func(X, axis=2, keepdims=True) for func in sta_fuc], axis=2)

1 frames

<ipython-input-21-84d735c8c516> in mad(arr, axis, keepdims)
      1 def mad(arr, axis=None, keepdims=True):
      2     median = np.median(arr, axis=axis, keepdims=True)
----> 3     mad = np.median(np.abs(arr-median, axis=axis, keepdims=keepdims),
      4                     axis=axis, keepdims=keepdims)
      5     return mad

TypeError: 'axis' is an invalid keyword to ufunc 'absolute'

EDIT-2

Using the suggested scipy function also raises an error, as follows:

from scipy.stats import median_absolute_deviation as mad

sta_fuc = (np.mean, np.median, np.std, mad)
stat = np.concatenate([func(X, axis=2, keepdims=True) for func in sta_fuc], axis=2)

TypeError: median_absolute_deviation() got an unexpected keyword argument 'keepdims'

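Typically, I see MAD refer to the median absolute deviation. If that is what you want, it is available in the SciPy library as scipy.stats.median_absolute_deviation() (newer SciPy releases provide it as scipy.stats.median_abs_deviation(), whose default scale is 1.0).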
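If the mean absolute deviation from the question is really what is wanted, a minimal sketch could look like the following. The name mean_abs_dev is made up for illustration and is not a NumPy or SciPy function; it simply follows the same axis/keepdims calling convention as np.mean.

```python
import numpy as np

def mean_abs_dev(arr, axis=None, keepdims=False):
    """Mean absolute deviation from the mean (hypothetical helper name)."""
    # keepdims=True here so the mean broadcasts against arr
    center = np.mean(arr, axis=axis, keepdims=True)
    return np.mean(np.abs(arr - center), axis=axis, keepdims=keepdims)

data = np.array([1.0, 2.0, 3.0, 4.0, 10.0])
# mean is 4.0, absolute deviations are [3, 2, 1, 0, 6], their mean is 2.4
print(mean_abs_dev(data))
```

Because it accepts axis and keepdims, it should be usable in the sta_fuc tuple from the question in the same way as np.mean.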

It is also easy to write a suitable function yourself.
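One possible shortcut, sketched here under assumptions, is to wrap a reducer that lacks keepdims and put the reduced axis back with np.expand_dims. The helper with_keepdims and the stand-in median_no_keepdims are hypothetical names used only for this demo; the sketch assumes a single integer axis. The same wrapper could presumably be applied to the SciPy function from EDIT-2, but that is not tested here.

```python
import numpy as np

def with_keepdims(reduce_func):
    """Give a single-axis reducer a keepdims argument (hypothetical helper)."""
    def wrapped(arr, axis, keepdims=False, **kwargs):
        out = reduce_func(arr, axis=axis, **kwargs)
        # Re-insert the reduced axis as a length-1 dimension when requested.
        return np.expand_dims(out, axis) if keepdims else out
    return wrapped

# Stand-in for a reducer without keepdims support (an assumption for the demo).
def median_no_keepdims(arr, axis):
    return np.median(arr, axis=axis)

X = np.random.uniform(0, 1, (20, 1, 10, 4))
median_kd = with_keepdims(median_no_keepdims)
result = median_kd(X, axis=2, keepdims=True)
print(result.shape)  # (20, 1, 1, 4)
```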

EDIT: Here is a MAD function that takes a keepdims argument:

def mad(data, axis=None, scale=1.4826, keepdims=False):
    """Median absolute deviation (MAD).
    
    Defined as the median absolute deviation from the median of the data. A
    robust alternative to stddev. Results should be identical to
    scipy.stats.median_absolute_deviation(), which does not take a keepdims
    argument.

    Parameters
    ----------
    data : array_like
        The data.
    axis : numpy axis spec, optional
        Axis or axes along which to compute MAD.
    scale : float, optional
        Scaling of the result. By default, it is scaled to give a consistent
        estimate of the standard deviation of values from a normal
        distribution.
    keepdims : bool, optional
        If this is set to True, the axes which are reduced are left in the
        result as dimensions with size one.

    Returns
    -------
    ndarray
        The MAD.
    """
    # keep dims here so that broadcasting works
    med = np.median(data, axis=axis, keepdims=True)
    abs_devs = np.abs(data - med)
    return scale * np.median(abs_devs, axis=axis, keepdims=keepdims)
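As a rough check, this is how the function might slot into the pipeline from the question. The body of mad is repeated in compact form only so that the snippet runs on its own.

```python
import numpy as np

def mad(data, axis=None, scale=1.4826, keepdims=False):
    # Same computation as the function above, repeated so the snippet runs alone.
    med = np.median(data, axis=axis, keepdims=True)
    return scale * np.median(np.abs(data - med), axis=axis, keepdims=keepdims)

X = np.random.uniform(0, 1, (20, 1, 10, 4))
sta_fuc = (np.mean, np.median, np.std, mad)
stat = np.concatenate([func(X, axis=2, keepdims=True) for func in sta_fuc], axis=2)
print(stat.shape)  # (20, 1, 4, 4)
```

Note that the default scale=1.4826 rescales the result to be comparable to a standard deviation for normally distributed data. For the raw median absolute deviation, something like functools.partial(mad, scale=1.0) could be put into sta_fuc instead.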

I don't know of a built-in numpy solution, but you can easily implement it on top of numpy functions using mad = median(abs(a - median(a))).

def mad(arr, axis=None, keepdims=True):
    # keepdims=True here so the median broadcasts against arr
    median = np.median(arr, axis=axis, keepdims=True)
    return np.median(np.abs(arr - median), axis=axis, keepdims=keepdims)
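The TypeError quoted in the question's EDIT comes from passing axis and keepdims to np.abs, which is an element-wise ufunc and does not reduce anything. Only np.median should receive those arguments. A small sketch that reproduces the failure and checks the version above on a hand-computable example:

```python
import numpy as np

X = np.random.uniform(0, 1, (20, 1, 10, 4))

# np.abs works element by element, so it has no `axis` argument.
try:
    np.abs(X, axis=2)
    raised = False
except TypeError:
    raised = True
print(raised)  # True

def mad(arr, axis=None, keepdims=True):
    median = np.median(arr, axis=axis, keepdims=True)
    return np.median(np.abs(arr - median), axis=axis, keepdims=keepdims)

# median is 3, absolute deviations are [2, 1, 0, 1, 97], their median is 1
small = np.array([1.0, 2.0, 3.0, 4.0, 100.0])
print(mad(small, keepdims=False))  # 1.0
```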
