How to build a vector of marginal probabilities from a tensor in PyTorch

Question

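I have a tensor "A" of shape [Dim1: <128>, Dim2: <64>]. Each element along Dim1 is drawn from an unknown distribution, and I need to check whether its Dim2 vector has already appeared among the other 128 samples. If it has, the marginal probability of that element is incremented by 1 and recorded in another tensor "B" of shape [DimB: <128>]. Once the iteration finishes, I divide every element of B by 128 (the number of possibilities) to get weighted increments, so the goal is to approach the true distribution as the size of Dim1 grows.

How can this be implemented directly in PyTorch? I tried using an ordered dictionary, but it is too slow. I assume there is a way to do this directly in PyTorch.

Given a tensor T1 of shape [Dim1: <6>, Dim2: <3>], this is my rough approach with an ordered dictionary: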

import torch
from collections import OrderedDict
od = OrderedDict()
T1 = torch.Tensor([[2.4, 5.5,1],
                   [3.44,5.43,1],
                   [2.4, 5.5,1],
                   [3.44,8.43,1],
                   [3.44,5.43,9],
                   [3.44,5.43,1], ])
print ('T1 shape',T1.shape) # -> T1 shape torch.Size([6, 3]) 
for i in range(T1.shape[0]):
    key = ''.join([ str(int(j)) for j in T1[i].tolist()]) # creates a unique identifier (is there a better way to do this?)
    if key in od:
        od[key] +=1
        key_place_holder = key + str(od[key]) # unique identifier if we found duplicate to keep a 0 in the final tensor
        od[key_place_holder] = 0
    else:
        od[key] = 1
print ('len od',len(od)) # -> len od 6
list_values = [j/len(od) for i,j in od.items()] 
T1_marginal_probabilities = torch.Tensor(list_values)
print ('Marginal Probs',T1_marginal_probabilities) # -> Marginal Probs tensor([0.3333, 0.3333, 0.0000, 0.1667, 0.1667, 0.0000])

The final output is as expected: [2.4, 5.5, 1] and [3.44, 5.43, 1] each have probability 2/6, because [2.4, 5.5, 1] appears twice, at positions 0 and 2, and [3.44, 5.43, 1] appears twice, at positions 1 and 5.
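On the side question in the code comment about a better unique identifier: one possibility, sketched here as an assumption and not taken from the original post, is to use the row's values as a tuple key. `int(j)` truncates each float, so two different rows such as [2.4, 5.5, 1] and [2.9, 5.1, 1] would both map to the key '251'. The helper name `marginal_probs_dict` is hypothetical.

```python
import torch

def marginal_probs_dict(t: torch.Tensor) -> torch.Tensor:
    """Dictionary-based count that keys each row by its exact values."""
    first_index = {}            # row tuple -> index of its first occurrence
    counts = [0] * t.shape[0]   # count is stored at the first occurrence only
    for i, row in enumerate(t.tolist()):
        key = tuple(row)        # hashable, no float truncation
        j = first_index.setdefault(key, i)
        counts[j] += 1
    return torch.tensor(counts, dtype=torch.float32) / t.shape[0]

T1 = torch.tensor([[2.4, 5.5, 1],
                   [3.44, 5.43, 1],
                   [2.4, 5.5, 1],
                   [3.44, 8.43, 1],
                   [3.44, 5.43, 9],
                   [3.44, 5.43, 1]])
probs = marginal_probs_dict(T1)
```

This is still a Python loop over rows, so it only addresses the key problem, not the speed problem.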

Answer 1

You can use torch.unique and torch.nonzero:

import torch

T1 = ...
# dim=0 deduplicates rows; inverse[i] is the unique-row id of T1[i]
values, inverse, counts = T1.unique(dim=0, return_inverse=True, return_counts=True)

ps = torch.zeros(inverse.numel())
for i, c in enumerate(counts):
    # store each row's count at the position of its first occurrence
    first_occurrence = torch.nonzero(inverse == i)[0].item()
    ps[first_occurrence] = c
ps /= ps.sum()
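If the remaining Python loop over unique rows becomes a bottleneck, the first-occurrence lookup can probably be vectorized as well. The following is a minimal sketch, not part of the original answer: it assumes a stable sort of `inverse` (via `torch.sort(..., stable=True)`) so that the first entry of each group is the smallest original index. The function name `marginal_probs` is hypothetical.

```python
import torch

def marginal_probs(t: torch.Tensor) -> torch.Tensor:
    """Count of each distinct row, placed at its first occurrence, divided by N."""
    n = t.shape[0]
    # inverse[i] is the id of the unique row equal to t[i]; counts[k] is its frequency.
    _, inverse, counts = t.unique(dim=0, return_inverse=True, return_counts=True)
    # A stable sort groups equal ids together while keeping original indices ascending.
    _, order = torch.sort(inverse, stable=True)
    # Start offset of each group in the sorted order.
    starts = torch.cumsum(counts, dim=0) - counts
    first = order[starts]  # first occurrence of each unique row
    ps = torch.zeros(n, device=t.device)
    ps[first] = counts.to(ps.dtype)
    return ps / n

T1 = torch.tensor([[2.4, 5.5, 1],
                   [3.44, 5.43, 1],
                   [2.4, 5.5, 1],
                   [3.44, 8.43, 1],
                   [3.44, 5.43, 9],
                   [3.44, 5.43, 1]])
probs = marginal_probs(T1)
```

For the example tensor this should give the same values as the loop version, with zeros at the positions of repeated rows.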
