PyTorch equivalent of TensorFlow Keras StringLookup?
I am currently using PyTorch, but it is missing a layer I need: tf.keras.layers.StringLookup, which helps with handling string ids. Is there a workaround to do something similar in PyTorch?
An example of the functionality I'm looking for:
vocab = ["a", "b", "c", "d"]
data = tf.constant([["a", "c", "d"], ["d", "a", "b"]])
layer = tf.keras.layers.StringLookup(vocabulary=vocab)
layer(data)
Outputs:
<tf.Tensor: shape=(2, 3), dtype=int64, numpy=
array([[1, 3, 4],
       [4, 1, 2]])>
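In the absence of a dedicated PyTorch layer, the same behavior can be reproduced with a plain Python dictionary. This is a minimal sketch (the helper name string_lookup is my own) that mirrors StringLookup's default indexing, where index 0 is reserved for out-of-vocabulary tokens:

```python
vocab = ["a", "b", "c", "d"]
data = [["a", "c", "d"], ["d", "a", "b"]]

# Index 0 is reserved for OOV tokens, mirroring StringLookup's default,
# so known tokens start at index 1.
table = {token: i + 1 for i, token in enumerate(vocab)}

def string_lookup(batch):
    # dict.get with default 0 maps unseen strings to the OOV index
    return [[table.get(token, 0) for token in row] for row in batch]

ids = string_lookup(data)
print(ids)  # [[1, 3, 4], [4, 1, 2]]
```

The nested list can then be handed to torch.tensor(ids) to obtain the same int64 tensor that the Keras layer returns.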
You can build a lookup from your vocabulary using collections.Counter and torchtext's vocab object. You can then pass sequences to it and get their encodings back as a tensor:
import torch
from collections import Counter
from torchtext.vocab import vocab
tokens = ["a", "b", "c", "d"]
samples = [["a", "c", "d"], ["d", "a", "b"]]
# Build string lookup
lookup = vocab(Counter(tokens))
>>> torch.tensor([lookup(s) for s in samples])
tensor([[0, 2, 3],
        [3, 0, 1]])
You can use the torchtext library; install it with python3 -m pip install torchtext. Then you can do something like this:
from torchtext.vocab import vocab
from collections import OrderedDict
tokens = ['a', 'b', 'c', 'd']
v1 = vocab(OrderedDict([(token, 1) for token in tokens]))
v1.lookup_indices(["a","b","c"])
This is the result:
[0, 1, 2]
You can also use the torchnlp package, installed via:
pip install pytorch-nlp
from torchnlp.encoders import LabelEncoder
data = ["a", "c", "d", "e", "d"]
encoder = LabelEncoder(data, reserved_labels=['unknown'], unknown_index=0)
enl = [encoder.encode(x) for x in data]
print(enl)
[tensor(1), tensor(2), tensor(3), tensor(4), tensor(3)]
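Going the other way (indices back to strings, which StringLookup(invert=True) does in Keras) only needs the reverse mapping. A minimal sketch, with the names index_to_token and inverse_lookup being my own:

```python
vocab = ["a", "b", "c", "d"]
# Index 0 is the OOV slot, so real tokens start at 1
index_to_token = {i + 1: token for i, token in enumerate(vocab)}

def inverse_lookup(ids):
    # Unknown indices come back as "[UNK]", like Keras's default OOV token
    return [[index_to_token.get(i, "[UNK]") for i in row] for row in ids]

print(inverse_lookup([[1, 3, 4], [4, 1, 2]]))  # [['a', 'c', 'd'], ['d', 'a', 'b']]
```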