PyTorch: an efficient way to perform an operation on 2 tensors of different sizes, where one has a one-to-many relationship to the other

Question

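I have 2 tensors. The first tensor is 1D (e.g. a tensor of 3 values). The second tensor is 2D, and its first column holds IDs that index into the first tensor in a one-to-many relationship (e.g. a tensor of shape (6, 2)).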

# e.g. simple example of dot product
import torch

a = torch.tensor([2, 4, 3])
b = torch.tensor([[0, 2], [0, 3], [0, 1], [1, 4], [2, 3], [2, 1]]) # 1st column is the index to tensor a, 2nd column is the value

# desired result: a[i] times the sum of the values in b whose index is i
output = [(2*2)+(2*3)+(2*1), (4*4), (3*3)+(3*1)]
# output == [12, 16, 12]

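What I currently do is find the size of each ID group in b (e.g. [3, 1, 2]), use torch.split to group the rows into a list of tensors, and then run a for loop over those groups. That is fine for small tensors, but it becomes very slow when the tensors have millions of elements and there are tens of thousands of groups of arbitrary size.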
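For reference, the split-and-loop approach described above might look roughly like the sketch below. This is a reconstruction, not the asker's actual code: the names `sizes`, `groups` and `loop_output` are made up for illustration, and it assumes the rows of `b` are already sorted by ID.

```python
import torch

a = torch.tensor([2, 4, 3])
b = torch.tensor([[0, 2], [0, 3], [0, 1], [1, 4], [2, 3], [2, 1]])

# Number of rows per ID (assumes b is sorted by its first column).
sizes = torch.bincount(b[:, 0], minlength=a.shape[0]).tolist()  # [3, 1, 2]

# Split the value column into one chunk per ID, then loop in Python.
groups = torch.split(b[:, 1], sizes)
loop_output = torch.stack([a[i] * g.sum() for i, g in enumerate(groups)])

print(loop_output.tolist())  # [12, 16, 12]
```

The Python-level loop runs once per group, which is why the cost grows with the number of groups rather than being handled in a single vectorized call.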

Is there a better solution?

Answer 1

You can use numpy.bincount or torch.bincount to sum the elements of b by key:

import numpy as np

a = np.array([2,4,3])
b = np.array([[0,2], [0,3], [0,1], [1,4], [2,3], [2,1]])

print( np.bincount(b[:,0], b[:,1]) )
# [6. 4. 4.]

print( a * np.bincount(b[:,0], b[:,1]) )
# [12. 16. 12.]

import torch

a = torch.tensor([2,4,3])
b = torch.tensor([[0,2], [0,3], [0,1], [1,4], [2,3], [2,1]])

torch.bincount(b[:,0], b[:,1])
# tensor([6., 4., 4.], dtype=torch.float64)

a * torch.bincount(b[:,0], b[:,1])
# tensor([12., 16., 12.], dtype=torch.float64)
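One detail worth keeping in mind: `bincount` returns only as many entries as the largest ID present plus one, so if the trailing entries of `a` have no rows in `b`, the result is shorter than `a` and the elementwise product fails. A minimal sketch using the `minlength` argument, assuming a hypothetical fourth entry in `a` that has no matching rows:

```python
import torch

# Hypothetical variation: a has a 4th entry (ID 3) with no rows in b.
a = torch.tensor([2, 4, 3, 5])
b = torch.tensor([[0, 2], [0, 3], [0, 1], [1, 4], [2, 3], [2, 1]])

# minlength pads the result with zeros so it lines up with a.
sums = torch.bincount(b[:, 0], weights=b[:, 1], minlength=a.shape[0])
result = a * sums

print(result.tolist())  # [12.0, 16.0, 12.0, 0.0]
```

Without `minlength`, `sums` would have 3 entries here and `a * sums` would raise a shape mismatch error.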


Answer 2

If you need gradients, here is another option in PyTorch.

import torch

a = torch.tensor([2,4,3])
b = torch.tensor([[0,2], [0,3], [0,1], [1,4], [2,3], [2,1]])

output = torch.zeros(a.shape[0], dtype=torch.long).index_add_(0, b[:, 0], b[:, 1]) * a

Alternatively, torch.Tensor.scatter_add also works.
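A minimal sketch of the `scatter_add` variant with gradients enabled. Because integer tensors cannot require gradients, this sketch assumes the IDs and the values are kept in two separate tensors (`ids` as integers, `values` as floats); those names and that layout are assumptions for illustration, not part of the original answer.

```python
import torch

a = torch.tensor([2.0, 4.0, 3.0], requires_grad=True)
ids = torch.tensor([0, 0, 0, 1, 2, 2])                      # index into a
values = torch.tensor([2.0, 3.0, 1.0, 4.0, 3.0, 1.0], requires_grad=True)

# Sum the values of each group into the slot given by its ID.
sums = torch.zeros(a.shape[0]).scatter_add(0, ids, values)
output = sums * a

# Backpropagate through the grouped sum.
output.sum().backward()

print(output.tolist())       # [12.0, 16.0, 12.0]
print(a.grad.tolist())       # per-group sums: [6.0, 4.0, 4.0]
print(values.grad.tolist())  # a[id] for each row: [2.0, 2.0, 2.0, 4.0, 3.0, 3.0]
```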
