I am working on a project using gensim's Word2Vec, and I am completely new to this field.
I already have a trained model. Is there a way to get the similarity rank of one word relative to another? For example, if the two most similar words to 'girl' are 'lady' and then 'woman', is there a function that returns 1 when I enter 'lady' and 2 when I enter 'woman'?
Thanks!
There's no gensim API for this, but you can use basic Python code to find which position (if any) a word appears in a longer sequence – such as the list of results given by gensim's most_similar().
For example:
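origin_word = 'apple'
query_word = 'orange'
# Ask for every other word in the vocabulary, sorted by similarity.
# Assumes gensim 4.x with vectors on w2v_model.wv; topn=0 returns an empty list there.
all_sims = w2v_model.wv.most_similar(origin_word, topn=len(w2v_model.wv))
query_index = -1  # stays -1 if query_word is never found
for i, sim_tuple in enumerate(all_sims):
    if sim_tuple[0] == query_word:
        query_index = i
        break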
At the end of this code, query_index will either be the (0-based) position of 'orange' in the list-of-all-similars, or -1 if not found.
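Since the question asks for ranks that start at 1, the same scan can be wrapped in a small helper. This is only a sketch: rank_of is a hypothetical name, and example_sims is a hand-written stand-in for most_similar() output with invented scores, so it runs without a model.

```python
def rank_of(all_sims, query_word):
    """Return the 1-based rank of query_word in all_sims, or None if absent."""
    # all_sims is a list of (word, similarity) tuples, most similar first.
    for rank, (word, _similarity) in enumerate(all_sims, start=1):
        if word == query_word:
            return rank
    return None


# Hypothetical stand-in for the results for 'girl'; the scores are invented.
example_sims = [('lady', 0.81), ('woman', 0.78), ('boy', 0.74)]

print(rank_of(example_sims, 'lady'))   # 1
print(rank_of(example_sims, 'woman'))  # 2
print(rank_of(example_sims, 'tree'))   # None
```

Returning None rather than -1 for a missing word is a design choice here; it avoids confusing "not found" with a real position.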
Note that the most expensive step is the creation of the all_sims ordered-list of all similar words; if you are going to be checking the ranks of multiple query words against one origin word, you'd definitely want to keep the all_sims around rather than re-compute it each time.
In fact, if you were sure you were going to do lots of such lookups, potentially down through the very-deepest words, you might do a single pass to change the results into a dict:
word_to_sims_index = {}
for i, sim_tuple in enumerate(all_sims):
    word_to_sims_index[sim_tuple[0]] = i  # key is the word, value is its position
After that, finding the index of a word would be a (quick constant-time) dict lookup...
query_index = word_to_sims_index[query_word]
...that will throw a KeyError if the query word isn't in the dict. (You could use word_to_sims_index.get(query_word, -1) if you instead wanted a default -1 response when the key is not present.)
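For many lookups against one origin word, the dict can also be built in a single expression and store 1-based ranks directly. A minimal sketch, again using an invented example_sims list in place of real most_similar() output; word_to_rank is a hypothetical name.

```python
# Hypothetical stand-in for the results for 'girl'; the scores are invented.
example_sims = [('lady', 0.81), ('woman', 0.78), ('boy', 0.74)]

# Build the word -> rank mapping once (ranks start at 1).
word_to_rank = {
    word: rank
    for rank, (word, _similarity) in enumerate(example_sims, start=1)
}

# Each later lookup is a dict access; -1 marks a word that is not present.
for query in ('woman', 'boy', 'tree'):
    print(query, word_to_rank.get(query, -1))
```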
I think this is a duplicate. As they say in another answer, you can use model.rank('girl', 'lady') == 1.
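One way to think about such a rank is as one plus the number of other words that are closer to the first word than the second word is. The sketch below illustrates that idea without gensim, using made-up 2-D vectors; cosine, rank_by_counting and toy_vectors are all hypothetical names, and this is not presented as gensim's own implementation.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def rank_by_counting(vectors, origin_word, query_word):
    """1-based rank of query_word among the words most similar to origin_word."""
    origin = vectors[origin_word]
    target = cosine(origin, vectors[query_word])
    # Count the other words that are strictly more similar to the origin.
    closer = sum(
        1
        for word, vec in vectors.items()
        if word not in (origin_word, query_word) and cosine(origin, vec) > target
    )
    return closer + 1


# Invented toy vectors, chosen so 'lady' is closest to 'girl', then 'woman'.
toy_vectors = {
    'girl': [1.0, 0.0],
    'lady': [0.9, 0.1],
    'woman': [0.8, 0.3],
    'tree': [0.0, 1.0],
}

print(rank_by_counting(toy_vectors, 'girl', 'lady'))   # 1
print(rank_by_counting(toy_vectors, 'girl', 'woman'))  # 2
```

Unlike the list-based approach, this needs no sorted list of all words, only one similarity comparison per vocabulary entry.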