
Vectorizing a function in Python using Numpy einsum

I'm trying to use Numpy to vectorize a function for calculating the average marginal effects of the features in a multinomial logit model fitted using scikit-learn. I have managed to do the calculations using for loops, which looks like this:

import numpy as np
from tqdm import tqdm

# Probability of each observation i belonging to class j; shape (N, J)
probas = fitted_model.predict_proba(X)
# Coefficients; shape (J, K)
betas = fitted_model.coef_

J = probas.shape[1]
N = probas.shape[0]
K = betas.shape[1]

avg_margins = np.zeros([K, J])
        
for j in tqdm(range(J)):
    for k in range(K):
        dydw = 0
        for i in range(N):
            dydw += probas[i,j] * (betas[j,k] - np.dot(probas[i,:],betas[:,k]))
        avg_margins[k,j] = 1 / N * dydw

However, this is very slow, which is why I want some way of getting rid of the loops. I'm a beginner with both Numpy and linear algebra, so please bear with me, but I think my best bet is using numpy einsum, and this is where I've gotten so far:

avg_margins = 1 / N * np.einsum('ij, jk -> kj', probas,  betas - np.einsum('im, mk -> k', probas, betas))

This unfortunately returns the wrong results and I'm unsure where I'm going wrong. Any help or hints would be highly appreciated!

No einsum is needed here:

diff = betas[:,None] - np.dot(probas, betas)
avg_margins = np.sum(probas * diff.T, axis=1) / N
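For what it's worth, the same quantity can also be written as a pair of einsum calls. The sketch below uses made-up shapes and random data in place of a fitted model (there is no real `fitted_model` here) and checks the original loop, the broadcasting answer, and an einsum form against each other:

```python
import numpy as np

# Small synthetic example: N observations, J classes, K coefficients
rng = np.random.default_rng(0)
N, J, K = 50, 3, 4
probas = rng.random((N, J))
probas /= probas.sum(axis=1, keepdims=True)  # rows sum to 1, like predict_proba
betas = rng.standard_normal((J, K))

# Triple loop from the question
loop = np.zeros((K, J))
for j in range(J):
    for k in range(K):
        dydw = 0.0
        for i in range(N):
            dydw += probas[i, j] * (betas[j, k] - probas[i] @ betas[:, k])
        loop[k, j] = dydw / N

# Vectorized version from the answer
diff = betas[:, None] - probas @ betas      # shape (J, N, K)
vec = np.sum(probas * diff.T, axis=1) / N   # shape (K, J)

# An equivalent form using two einsum calls: the first term sums
# probas[i,j] * betas[j,k] over i; the second sums
# probas[i,j] * probas[i,m] * betas[m,k] over i and m.
ein = (np.einsum('ij,jk->kj', probas, betas)
       - np.einsum('ij,im,mk->kj', probas, probas, betas)) / N

assert np.allclose(loop, vec)
assert np.allclose(loop, ein)
```

The broadcasting version is likely the clearer of the two; the einsum form is shown mainly to illustrate where the original attempt went astray (the inner `'im, mk -> k'` sums over observations too early, before multiplying by `probas[i,j]`).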
