Custom function to compute mean absolute deviation
I have a 4D numpy array similar to this:
>>> import numpy as np
>>> from functools import partial
>>> X = np.random.rand(20, 1, 10, 4)
>>> X.shape
(20, 1, 10, 4)
I compute the following statistics along axis 2: mean, std, median, p25, p75
>>> percentiles = tuple(partial(np.percentile, q=q) for q in (25, 75))
>>> stat_functions = (np.mean, np.std, np.median) + percentiles
>>> stats = np.concatenate([func(X, axis=2, keepdims=True) for func in stat_functions], axis=2)
so that:
>>> stats.shape
(20, 1, 5, 4)
>>> stats[0]
array([[[0.55187202, 0.55892688, 0.45816177, 0.6378181 ],
        [0.31028278, 0.32109677, 0.17319351, 0.13341651],
        [0.57112019, 0.60587194, 0.45490572, 0.59787335],
        [0.30857011, 0.30367621, 0.28899686, 0.55742753],
        [0.80678815, 0.82014851, 0.61295181, 0.70529412]]])
I am also interested in the mean absolute deviation (mad) as one of the statistics, so I defined this function, since numpy does not provide it.
def mad(data):
    mean = np.mean(data)
    f = lambda x: abs(x - mean)
    vf = np.vectorize(f)
    return (np.add.reduce(vf(data))) / len(data)
However, I am having trouble getting this function to work. First I tried:
>>> stat_functions = (np.mean, np.std, np.median, mad) + percentiles
>>> stats = np.concatenate([func(X, axis=2, keepdims=True) for func in stat_functions], axis=2)
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-33-fa6d972f0fce> in <module>()
----> 1 stats = np.concatenate([func(X, axis=2, keepdims=True) for func in stat_functions], axis=2)
<ipython-input-33-fa6d972f0fce> in <listcomp>(.0)
----> 1 stats = np.concatenate([func(X, axis=2, keepdims=True) for func in stat_functions], axis=2)
TypeError: mad() got an unexpected keyword argument 'axis'
Then I changed the definition of mad to:
def mad(data, axis=None):
    ...
which led to this problem:
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-35-c74d9e3d057b> in <module>()
----> 1 stats = np.concatenate([f(X, axis=2, keepdims=True) for f in my_func], axis=2)
<ipython-input-35-c74d9e3d057b> in <listcomp>(.0)
----> 1 stats = np.concatenate([f(X, axis=2, keepdims=True) for f in my_func], axis=2)
TypeError: mad() got an unexpected keyword argument 'keepdims'
So I also did this:
def mad(data, axis=None, keepdims=None):
    ...
which got me to:
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-6-c74d9e3d057b> in <module>()
----> 1 stats = np.concatenate([f(X, axis=2, keepdims=True) for f in my_func], axis=2)
<__array_function__ internals> in concatenate(*args, **kwargs)
ValueError: all the input arrays must have same number of dimensions, but the array at index 0 has 4 dimension(s) and the array at index 3 has 3 dimension(s)
I know this is related to the number of dimensions, but I am not sure how to solve it in this case.
Edit:
Following the answer given, after using the mad function from the answer, I got a strange result, as follows:
>>> stat_functions = (np.mean, np.std, np.median, mad) + percentiles
>>> stats = np.concatenate([func(X, axis=2, keepdims=True) for func in stat_functions], axis=2)
>>> stats.shape
(20, 1, 15, 4)
The expected output should have shape (20, 1, 6, 4), since I am adding one statistic along the third dimension: (np.mean, np.std, np.median, mad) + percentiles
Edit 2
Using this function from the answer:
def mad(data, axis=-1, keepdims=True):
    return np.abs(data - data.mean(axis, keepdims=True)).mean(axis)
followed by:
>>> stat_functions = (np.mean, np.std, np.median, mad) + percentiles
>>> stats = np.concatenate([func(X, axis=2, keepdims=True) for func in stat_functions], axis=2)
I then run into this:
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-16-fa6d972f0fce> in <module>()
----> 1 stats = np.concatenate([func(X, axis=2, keepdims=True) for func in stat_functions], axis=2)
<__array_function__ internals> in concatenate(*args, **kwargs)
ValueError: all the input arrays must have same number of dimensions, but the array at index 0 has 4 dimension(s) and the array at index 3 has 3 dimension(s)
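The first thing I notice in your code is that vf is by no means a vectorized function (see the Notes in the NumPy documentation for np.vectorize, which describe it as essentially a for loop). You can simply use np.abs instead of abs and your function will be vectorized.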
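As a quick check of that point, the sketch below compares the np.vectorize approach with plain np.abs on a small 1-D array. The names mad_loop and mad_vec and the sample values are made up for this illustration; both functions should return the same number.

```python
import numpy as np

def mad_loop(data):
    # Approach from the question: np.vectorize wraps a Python-level loop.
    mean = np.mean(data)
    vf = np.vectorize(lambda x: abs(x - mean))
    return np.add.reduce(vf(data)) / len(data)

def mad_vec(data):
    # Same computation with NumPy's elementwise np.abs.
    return np.abs(data - np.mean(data)).sum() / len(data)

sample = np.array([1.0, 2.0, 3.0, 4.0, 10.0])  # hypothetical data
print(mad_loop(sample), mad_vec(sample))
```

For 1-D input the two agree; the np.abs version avoids calling a Python function once per element.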
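That said, for 1-D input your function can be written as:
def mad(data):
    # mean of the absolute deviations from the mean along axis 0
    return np.abs(data - data.mean(0)).sum(0) / len(data)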
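Now, note that this mad function, like your original one, accepts only one positional argument and no optional arguments. The error you got is because you tried to pass axis=2 (and keepdims=True) to mad:
[func(X, axis=2, keepdims=True) for func in stat_functions]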
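The ValueError from np.concatenate reported in the question has a related cause: every array being joined must have the same number of dimensions, and a reduction only keeps the reduced axis when keepdims=True actually reaches it. A minimal sketch of the shapes involved, assuming the same X shape as in the question (the names kept and dropped are illustrative):

```python
import numpy as np

X = np.random.rand(20, 1, 10, 4)

kept = X.mean(axis=2, keepdims=True)  # reduced axis kept with length 1
dropped = X.mean(axis=2)              # reduced axis removed

print(kept.shape)     # (20, 1, 1, 4)
print(dropped.shape)  # (20, 1, 4)

try:
    np.concatenate([kept, dropped], axis=2)
except ValueError as exc:
    # a 4-D array and a 3-D array cannot be joined
    print("concatenate failed:", exc)
```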
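To fix this, build the function with optional parameters. The inner mean must always keep its dimensions so that the subtraction broadcasts, and the final reduction is the one that should receive keepdims:
def mad(data, axis=-1, keepdims=True):
    # divide by the size of the reduced axis, not len(data)
    return np.abs(data - data.mean(axis, keepdims=True)).sum(axis, keepdims=keepdims) / data.shape[axis]
Or use mean(axis), which makes more sense than sum(axis)/len(data), since len(data) is only the size of the first axis:
def mad(data, axis=-1, keepdims=True):
    return np.abs(data - data.mean(axis, keepdims=True)).mean(axis, keepdims=keepdims)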
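To check the shapes end to end, here is a minimal sketch in which the final reduction also receives keepdims, so that the result keeps four dimensions and can be concatenated with the other statistics. The name mad_keepdims is hypothetical, chosen to avoid clashing with the definitions above, and its keepdims default of False is an assumption that mirrors NumPy's own reductions; X and the statistics tuple follow the question.

```python
import numpy as np
from functools import partial

def mad_keepdims(data, axis=-1, keepdims=False):
    # The inner mean always keeps dims so the subtraction broadcasts;
    # the outer mean honours the caller's keepdims.
    center = data.mean(axis=axis, keepdims=True)
    return np.abs(data - center).mean(axis=axis, keepdims=keepdims)

X = np.random.rand(20, 1, 10, 4)
percentiles = tuple(partial(np.percentile, q=q) for q in (25, 75))
stat_functions = (np.mean, np.std, np.median, mad_keepdims) + percentiles
stats = np.concatenate(
    [func(X, axis=2, keepdims=True) for func in stat_functions], axis=2
)
print(stats.shape)  # (20, 1, 6, 4)
```

With this version, index 3 along axis 2 of stats holds the mean absolute deviation for each of the (20, 1, 4) slices.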