How to use a pretrained Word2Vec model in Tensorflow

Question

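I have a Word2Vec model trained with Gensim. How can I use it in Tensorflow for word embeddings? I don't want to train the embeddings from scratch in Tensorflow. Could someone show me how, with some example code?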

Answer 1

Suppose you have a dictionary and an inv_dict list, where the indices in the list correspond to the most common words:

vocab = {'hello': 0, 'world': 2, 'neural':1, 'networks':3}
inv_dict = ['hello', 'neural', 'world', 'networks']
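
If these two structures are not available yet, one way to build them is from word counts. The sketch below assumes a tokenized corpus (the corpus list is made up for illustration) and gives index 0 to the most frequent word:

```python
from collections import Counter

# Hypothetical tokenized corpus, used only for illustration.
corpus = [
    ['hello', 'world'],
    ['hello', 'neural', 'networks'],
    ['hello', 'neural', 'world'],
]

counts = Counter(word for sentence in corpus for word in sentence)

# Most frequent words first; ties are broken alphabetically so the order is stable.
inv_dict = sorted(counts, key=lambda w: (-counts[w], w))
vocab = {word: index for index, word in enumerate(inv_dict)}

# Both structures must agree: inv_dict[vocab[w]] == w for every word.
assert all(inv_dict[i] == w for w, i in vocab.items())
```

Deriving vocab from inv_dict with enumerate keeps the two in sync by construction, which avoids mismatched indices when the vocabulary is edited by hand.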

Note how the inv_dict indices correspond to the dictionary values. Now declare your embedding matrix and fill in the values:

import numpy as np

vocab_size = len(inv_dict)
emb_size = 300 # or whatever the size of your embeddings
# float32 matches TensorFlow's default float type
embeddings = np.zeros((vocab_size, emb_size), dtype=np.float32)

from gensim.models.keyedvectors import KeyedVectors                         
model = KeyedVectors.load_word2vec_format('embeddings_file', binary=True)

for k, v in vocab.items():
  embeddings[v] = model[k]
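
The loop above raises a KeyError if a vocabulary word is missing from the pretrained model. Below is a minimal sketch of one way to handle that, assuming a small random vector is an acceptable fallback. Here pretrained is a plain dict standing in for the loaded KeyedVectors object (which supports the same `word in model` and `model[word]` access), and build_embedding_matrix is a hypothetical helper, not part of Gensim:

```python
import numpy as np

def build_embedding_matrix(vocab, pretrained, emb_size, seed=0):
    """Return a (len(vocab), emb_size) float32 matrix and the words not found."""
    rng = np.random.RandomState(seed)
    embeddings = np.zeros((len(vocab), emb_size), dtype=np.float32)
    missing = []
    for word, index in vocab.items():
        if word in pretrained:
            embeddings[index] = pretrained[word]
        else:
            # Assumption: small random values for words without a pretrained vector.
            embeddings[index] = rng.uniform(-0.05, 0.05, emb_size)
            missing.append(word)
    return embeddings, missing

# Stand-in for the Gensim model, with 3-dimensional vectors for brevity.
pretrained = {'hello': np.ones(3), 'world': np.full(3, 2.0)}
vocab = {'hello': 0, 'world': 2, 'neural': 1, 'networks': 3}
embeddings, missing = build_embedding_matrix(vocab, pretrained, emb_size=3)
```

Returning the list of missing words makes it easy to check how much of the vocabulary the pretrained vectors actually cover.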

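You now have the embedding matrix. Good. Now suppose you want to train on the sample x = ['hello', 'world']. A list of strings will not work as input to our neural network, so we need to convert the words to integers: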

x_train = []
for word in x:  
  x_train.append(vocab[word]) # integerize
x_train = np.array(x_train) # make into numpy array
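
A placeholder with a fixed second dimension expects every sample to have the same length. Below is a sketch of integerizing with padding and an unknown-word fallback. The reserved `<pad>` and `<unk>` entries and the integerize helper are assumptions added for illustration; they are not part of the original answer:

```python
def integerize(words, vocab, input_size, pad_id, unk_id):
    """Map words to ids, then truncate or pad to exactly input_size entries."""
    ids = [vocab.get(word, unk_id) for word in words][:input_size]
    return ids + [pad_id] * (input_size - len(ids))

# Hypothetical vocabulary with two reserved entries in front.
vocab = {'<pad>': 0, '<unk>': 1, 'hello': 2, 'neural': 3, 'world': 4, 'networks': 5}

batch = [['hello', 'world'], ['deep', 'neural', 'networks', 'hello', 'world']]
x_train = [integerize(sample, vocab, input_size=4, pad_id=0, unk_id=1)
           for sample in batch]
# x_train is a list of equal-length id lists; wrap it in np.array before feeding it.
```

If reserved ids like these are used, the embedding matrix needs a row for each of them as well.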

Now we are ready to embed our samples on the fly:

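import tensorflow as tf  # TensorFlow 1.x API (tf.placeholder)

input_size = len(x)  # number of words per sample
# feed batches of shape (batch_size, input_size), e.g. x_train.reshape(1, -1)
x_model = tf.placeholder(tf.int32, shape=[None, input_size])
with tf.device("/cpu:0"):
  embedded_x = tf.nn.embedding_lookup(embeddings, x_model)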

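embedded_x now goes into your convolution or whatever comes next. I am also assuming that you are not retraining the embeddings, just using them as they are. Hope that helps.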
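
For intuition, tf.nn.embedding_lookup with a single embedding matrix selects rows by id. The NumPy sketch below reproduces that behaviour with integer-array indexing on a toy matrix; the values are made up:

```python
import numpy as np

# Toy embedding matrix: 4 words, 3 dimensions (row i is the vector of word id i).
embeddings = np.arange(12, dtype=np.float32).reshape(4, 3)

# One batch of two samples, each holding two word ids.
x_batch = np.array([[0, 2], [3, 1]])

# Row selection by id: the result has shape (batch, input_size, emb_size).
embedded_x = embeddings[x_batch]
```

Each id in x_batch is replaced by the matching row of embeddings, so the output gains one trailing dimension of size emb_size.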

Answer 1
10 2017-03-28 19:45:55
