I would like to replace a specific document vector created by a Doc2Vec model with another one that has different weights.
These are the weights of the existing vector (just some of the 800 real weights):
array([ 1.72976881e-01, 2.44364753e-01, -9.90936995e-01, -1.03020036e+00,
-1.41046381e+00, 1.00970473e-02, -1.84546992e-01, 3.77230316e-01,
9.20825064e-01, -2.61079431e-01, 7.51454890e-01, -1.15353882e+00,
-9.96422302e-03, 1.65010715e+00, 5.63869551e-02, -4.25169647e-01],
dtype=float32)
I'd like to replace them with these ones:
array([ 1.54585496e-01, 2.22857013e-01, -8.88102770e-01, -9.27794874e-01,
-1.27402091e+00, -5.38651831e-04, -1.63646400e-01, 3.38727772e-01,
8.28402698e-01, -2.29774594e-01, 6.77914560e-01, -1.04013634e+00,
-1.37407500e-02, 1.48667252e+00, 5.83136305e-02, -3.88587236e-01],
dtype=float32)
I tried to add a new vector to my model with this code:
model = gensim.models.Word2Vec.load('mymodel.doc2vec')
model.docvecs.add(entities=["88763"], weights=[new_vector])
I'm not getting any error, but when I look up the "88763" vector again, I see that it hasn't been updated:
model.docvecs["88763"]
array([ 1.72976881e-01, 2.44364753e-01, -9.90936995e-01, -1.03020036e+00,
-1.41046381e+00, 1.00970473e-02, -1.84546992e-01, 3.77230316e-01,
9.20825064e-01, -2.61079431e-01, 7.51454890e-01, -1.15353882e+00,
-9.96422302e-03, 1.65010715e+00, 5.63869551e-02, -4.25169647e-01],
dtype=float32)
Could someone please help me in some way?
Thanks.
Don't load a Doc2Vec model with Word2Vec. Load it instead with:
model = gensim.models.Doc2Vec.load('mymodel.doc2vec')
Once loaded, you should be able to modify any existing entry via direct assignment to a bracket-accessed entry, e.g.:
model.docvecs['88763'] = new_vector
(You would chiefly use add() to add vectors for keys that aren't already there. But it might also work to replace existing vectors in a batch if you supply the non-default replace=True parameter in addition to the list of entities and the list of vectors.)
Update: The above is supposed to work, but there's a pending bug at the moment (November 2019, gensim-3.8.1) where it won't.
In the meantime, to modify one specific existing vector, you can act on the raw vectors_docs property and look up the index position to change yourself. For example:
slot = model.docvecs.int_index('88763',
model.docvecs.doctags,
model.docvecs.max_rawint)
model.docvecs.vectors_docs[slot] = new_vector
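The idea behind that workaround (look up the row for a tag, then overwrite the row in place) can be illustrated without gensim. The following is only a sketch with NumPy: tag_to_slot, vectors and replace_vector are hypothetical stand-ins for the model's tag index and raw array, not gensim names, and the vectors are shortened to 4 dimensions.

```python
import numpy as np

# Hypothetical stand-ins (not gensim API): a tag -> row index map and the
# raw 2-D array that holds one vector per document tag.
tag_to_slot = {"88762": 0, "88763": 1, "88764": 2}
vectors = np.zeros((3, 4), dtype=np.float32)  # 4 dims here instead of 800
vectors[1] = [0.17, 0.24, -0.99, -1.03]

new_vector = np.array([0.15, 0.22, -0.89, -0.93], dtype=np.float32)


def replace_vector(tag, vec):
    """Overwrite the stored row for `tag` in place and return its slot."""
    slot = tag_to_slot[tag]
    if vec.shape != vectors[slot].shape:
        raise ValueError("replacement vector has the wrong dimensionality")
    vectors[slot] = vec  # row assignment copies the values into the array
    return slot


slot = replace_vector("88763", new_vector)
# Read the row back to confirm the update actually took effect.
updated = np.allclose(vectors[slot], new_vector)
```

Reading the row back after the assignment, as in the last line, is the same check the question performs with model.docvecs["88763"], and it is worth repeating on the real model after applying the workaround.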