
Counting the number of occurrences in a PyTorch Tensor (Tensor is too big for NumPy)

Is there a smart way to count the number of occurrences of each value in a very large PyTorch tensor? The tensor size is 11701*300 = 3510300, and it may grow or shrink. torch.bincount, torch.unique and torch.unique_consecutive have not been useful so far.

torch.bincount returns a different number of elements every time. torch.unique is also not useful, as it only returns the unique values.

[image]

This is what I meant when I said it returns a different number of elements every time. If a tensor of 5 elements returns a tensor of 8 elements, how am I supposed to know which element occurs how many times? This is confusing to me. The official documentation has limited content, and no other website explains it.

In the picture above, 5 occurs 2 times. And 0 is what, 0 times? How should I read this output? It doesn't make any sense to me.

[image]

Actually, the problem is how you read the output. The output of torch.bincount is a tensor of size max(input)+1, which means it covers every bin of width 1 from 0 to max(input). The index is the value and the entry at that index is its count. Therefore, reading the output tensor from its first element, you see how many 0s, 1s, 2s, ..., max(input)s there are in your tensor of non-negative integers.

For example:

import torch
t1 = torch.randint(0,10, (20,))
print(t1)

tensor([2, 5, 7, 3, 1, 2, 7, 8, 8, 0, 5, 6, 4, 4, 4, 6, 3, 0, 6, 6])

In this tensor the max value is 8 (by chance, 9 did not appear), so it gives:

print(torch.bincount(t1).size())
print(torch.bincount(t1))

torch.Size([9])
tensor([2, 1, 2, 2, 3, 2, 4, 2, 2])

That means that in the tensor t1 there are two 0s, one 1, two 2s, two 3s, ..., and two 8s.
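If what you want is an explicit value-to-count pairing, here is a minimal sketch of two ways to get it. It reuses the values of t1 above, reshaped to 2-D as a small stand-in for a matrix-shaped tensor like the one in the question. The names t2d, flat, present, pairs_bincount and pairs_unique are only illustrative, and the sketch assumes the tensor holds non-negative integers, which torch.bincount requires.

```python
import torch

# Same values as t1 above, reshaped to 2-D to mimic a matrix-shaped input.
t2d = torch.tensor([2, 5, 7, 3, 1, 2, 7, 8, 8, 0,
                    5, 6, 4, 4, 4, 6, 3, 0, 6, 6]).reshape(4, 5)

# torch.bincount only accepts 1-D input, so flatten first.
flat = t2d.flatten()

# Option 1: bincount, where index i holds the count of value i.
counts_all = torch.bincount(flat)
present = torch.nonzero(counts_all).squeeze(1)  # values that occur at least once
pairs_bincount = {int(v): int(counts_all[v]) for v in present}

# Option 2: torch.unique with return_counts=True returns the sorted
# distinct values together with how often each one occurs.
values, counts = torch.unique(flat, return_counts=True)
pairs_unique = dict(zip(values.tolist(), counts.tolist()))

print(pairs_unique)
# {0: 2, 1: 1, 2: 2, 3: 2, 4: 3, 5: 2, 6: 4, 7: 2, 8: 2}
```

Both options give the same pairing here. Option 2 is probably the more convenient one when the values are sparse or very large, since the bincount output always has max(input)+1 entries, including zeros for values that never occur.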

