I am working on a tf-idf model and am a little confused about how it is implemented. I have built the model, but when I print its output I get different values for the same term. The following two documents give these results:
doc_bow = [(0, 1), (1, 1)]
val1 = tf_idf_corpus[doc_bow]
doc_bow = [(0, 1)]
val2 = tf_idf_corpus[doc_bow]
Following is the result:
val1 = [(0, 0.56486634414605663), (1, 0.82518241210720711)]
val2 = [(0, 1.0)]
I am just curious why the tf-idf value of term 0 is 0.5648 in val1 but 1.0 in val2.
The documentation may help clear your confusion: http://radimrehurek.com/gensim/models/tfidfmodel.html
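The output vectors are normalized to unit (Euclidean) length, so the squared weights in each vector sum to 1. In val1 the two weights share that length (0.5649^2 + 0.8252^2 is approximately 1), while val2 contains only one term, so its single weight is scaled to 1.0 whatever its raw tf-idf value was. You can turn this off using the normalize
constructor parameter.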
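To make the normalization concrete, here is a minimal sketch in plain Python. It does not call gensim; unit_normalize is a hypothetical helper written for illustration, and the raw weight 0.37 is an invented value, not something taken from the model above. It only shows that the vectors in the question have Euclidean length 1 and that any one-term vector normalizes to about 1.0.

```python
import math


def unit_normalize(vec):
    """Scale a sparse vector [(term_id, weight), ...] to Euclidean length 1.

    Hypothetical helper for illustration; not part of gensim.
    """
    length = math.sqrt(sum(w * w for _, w in vec))
    if length == 0.0:
        # Nothing to scale in an all-zero vector.
        return list(vec)
    return [(term_id, w / length) for term_id, w in vec]


# Values reported in the question.
val1 = [(0, 0.56486634414605663), (1, 0.82518241210720711)]

# Euclidean length of val1: very close to 1, because it is already normalized.
norm_val1 = math.sqrt(sum(w * w for _, w in val1))

# A one-term document: whatever the raw weight (0.37 is made up here),
# dividing it by its own length gives approximately 1.0.
single = unit_normalize([(0, 0.37)])

print(norm_val1)
print(single)
```

In gensim itself, passing normalize=False when constructing TfidfModel should give the unnormalized weights, in which case term 0 would have the same value in both documents.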