TensorFlow: Word2vec CBOW model

Question

I am new to TensorFlow and to word2vec. I just studied word2vec_basic.py, which trains the model using the Skip-gram algorithm. Now I want to train using the CBOW algorithm. Can this be achieved by simply swapping train_inputs and train_labels?

Answer 1

I don't think the CBOW model can be obtained simply by swapping train_inputs and train_labels in Skip-gram, because the CBOW architecture feeds the sum of the surrounding words' vectors to the classifier as one single instance. E.g., you should use [the, brown] together to predict quick, rather than using the alone to predict quick.

To implement CBOW, you'll have to write a new generate_batch function and sum the vectors of the surrounding words before applying logistic regression. I wrote an example you can refer to: https://github.com/wangz10/tensorflow-playground/blob/master/word2vec.py#L105
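To make the difference concrete, here is a minimal sketch in plain Python (no TensorFlow) that builds both kinds of training examples from the same tokens. The helper names flipped_skipgram_pairs and cbow_instances are hypothetical and are not taken from word2vec_basic.py or the linked repository.

```python
# Illustrative helpers only; the names are assumptions, not part of word2vec_basic.py.

def flipped_skipgram_pairs(tokens, window=1):
    """Skip-gram pairs with input and label swapped: ONE context word -> center word."""
    pairs = []
    for i, center in enumerate(tokens):
        for j in range(max(0, i - window), min(len(tokens), i + window + 1)):
            if j != i:
                pairs.append((tokens[j], center))
    return pairs


def cbow_instances(tokens, window=1):
    """CBOW instances: the WHOLE context window -> center word."""
    instances = []
    for i in range(window, len(tokens) - window):
        context = tokens[i - window:i] + tokens[i + 1:i + window + 1]
        instances.append((context, tokens[i]))
    return instances


tokens = "the quick brown fox".split()
flipped = flipped_skipgram_pairs(tokens)
cbow = cbow_instances(tokens)

print(flipped)  # separate examples such as ('the', 'quick') and ('brown', 'quick')
print(cbow)     # grouped examples such as (['the', 'brown'], 'quick')
```

With swapped pairs the classifier still sees one word vector per example, whereas a CBOW instance carries the full context, whose vectors are summed before classification.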

Answer 2

For CBOW, you only need to change a few parts of word2vec_basic.py. Overall, the training structure and method stay the same.

Which parts should I change in word2vec_basic.py?

1) The way it generates training data pairs, because in CBOW you are predicting the center word, not the context words.

The new version of generate_batch will be

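import collections
import numpy as np

# Relies on the globals `data` (list of word ids) and `data_index`
# that word2vec_basic.py defines before this function is called.
def generate_batch(batch_size, bag_window):
  global data_index
  span = 2 * bag_window + 1  # [ bag_window target bag_window ]
  # each row holds the 2 * bag_window context word ids for one target
  batch = np.ndarray(shape=(batch_size, span - 1), dtype=np.int32)
  labels = np.ndarray(shape=(batch_size, 1), dtype=np.int32)
  buffer = collections.deque(maxlen=span)
  # fill the sliding window with the first `span` word ids
  for _ in range(span):
    buffer.append(data[data_index])
    data_index = (data_index + 1) % len(data)
  for i in range(batch_size):
    buffer_list = list(buffer)
    # the center word becomes the label; the remaining words are the context
    labels[i, 0] = buffer_list.pop(bag_window)
    batch[i] = buffer_list
    # slide the window forward by one word
    buffer.append(data[data_index])
    data_index = (data_index + 1) % len(data)
  return batch, labels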

The new training data for CBOW would then be

data: ['anarchism', 'originated', 'as', 'a', 'term', 'of', 'abuse', 'first', 'used', 'against', 'early', 'working', 'class', 'radicals', 'including', 'the']

#with bag_window = 1:
    batch: [['anarchism', 'as'], ['originated', 'a'], ['as', 'term'], ['a', 'of']]
    labels: ['originated', 'as', 'a', 'term']

compared to Skip-gram's data

#with num_skips = 2 and skip_window = 1:
    batch: ['originated', 'originated', 'as', 'as', 'a', 'a', 'term', 'term', 'of', 'of', 'abuse', 'abuse', 'first', 'first', 'used', 'used']
    labels: ['as', 'anarchism', 'originated', 'a', 'term', 'as', 'a', 'of', 'term', 'abuse', 'of', 'first', 'used', 'abuse', 'against', 'first']
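The CBOW batch listed above can be reproduced with a small self-contained variant of the sliding-window logic. This is only a sketch under a few assumptions: the function name generate_cbow_batch is hypothetical, it takes the data as an argument instead of using the global data_index, it returns plain lists rather than NumPy arrays, and it is fed words directly instead of word ids so the output is readable.

```python
import collections


def generate_cbow_batch(data, batch_size, bag_window, start=0):
    """Return (batch, labels); each batch row holds the 2 * bag_window
    context items surrounding the corresponding label."""
    span = 2 * bag_window + 1  # [ bag_window target bag_window ]
    buffer = collections.deque(maxlen=span)
    index = start
    # fill the sliding window
    for _ in range(span):
        buffer.append(data[index])
        index = (index + 1) % len(data)
    batch, labels = [], []
    for _ in range(batch_size):
        window = list(buffer)
        labels.append(window.pop(bag_window))  # center item is the label
        batch.append(window)                   # the rest is the context
        buffer.append(data[index])             # slide forward by one item
        index = (index + 1) % len(data)
    return batch, labels


words = ['anarchism', 'originated', 'as', 'a', 'term', 'of', 'abuse', 'first']
batch, labels = generate_cbow_batch(words, batch_size=4, bag_window=1)
print(batch)   # [['anarchism', 'as'], ['originated', 'a'], ['as', 'term'], ['a', 'of']]
print(labels)  # ['originated', 'as', 'a', 'term']
```

Each label now appears once, paired with all of its context words, instead of once per context word as in the Skip-gram batch.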

2) Therefore you also need to change the shape of the input placeholder (the question's train_inputs, named train_dataset here) from

train_dataset = tf.placeholder(tf.int32, shape=[batch_size])

to

train_dataset = tf.placeholder(tf.int32, shape=[batch_size, bag_window * 2])

3) The loss function

loss = tf.reduce_mean(tf.nn.sampled_softmax_loss(
    weights=softmax_weights, biases=softmax_biases,
    inputs=tf.reduce_sum(embed, 1), labels=train_labels,
    num_sampled=num_sampled, num_classes=vocabulary_size))

Notice the inputs argument, tf.reduce_sum(embed, 1), which sums the context word embeddings of each example into a single vector, as Zichen Wang mentioned.
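The shape changes in steps 2 and 3 can be checked without TensorFlow. The sketch below is a plain-Python stand-in, not the TensorFlow code itself: nested lists play the role of tensors, list indexing mimics the embedding lookup, and a column-wise sum mimics tf.reduce_sum(embed, 1). The toy sizes and word ids are made up for illustration.

```python
import random

# Toy sizes, chosen only for illustration.
vocabulary_size, embedding_size = 10, 4
batch_size, bag_window = 3, 1

random.seed(0)
# Embedding matrix: [vocabulary_size, embedding_size]
embeddings = [[random.uniform(-1.0, 1.0) for _ in range(embedding_size)]
              for _ in range(vocabulary_size)]

# Context word ids: [batch_size, 2 * bag_window]
train_dataset = [[0, 2], [1, 3], [2, 4]]

# Lookup adds an embedding axis: [batch_size, 2 * bag_window, embedding_size]
embed = [[embeddings[word_id] for word_id in row] for row in train_dataset]

# Summing over axis 1 collapses the context axis: [batch_size, embedding_size]
summed = [[sum(component) for component in zip(*row)] for row in embed]

print(len(embed), len(embed[0]), len(embed[0][0]))  # 3 2 4
print(len(summed), len(summed[0]))                  # 3 4
```

The summed result has one vector per example, which is the same shape the classifier received in the Skip-gram version, so the rest of the graph can stay as it was.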

That's it!

Answer 3

Basically, yes:

For the given text "the quick brown fox jumped over the lazy dog", the CBOW instances for window size 1 would be

([the, brown], quick), ([quick, fox], brown), ([brown, jumped], fox), ...

3 answers

Answer 1
13 2016-06-21 15:07:57

Answer 2
9 2017-07-09 04:41:44

Answer 3
6 2016-05-24 02:16:28
