A little confusion about how the tf-idf model is implemented in gensim

I am working with a tf-idf model in gensim and am a little confused about how it is implemented. I have built the model, but when I apply it to a document, it gives different values for the same term. For example, take the following two documents:

doc_bow = [(0, 1), (1, 1)]
val1 = tf_idf_corpus[doc_bow]

doc_bow = [(0, 1)]
val2 = tf_idf_corpus[doc_bow]

This is the result:

val1 = [(0, 0.56486634414605663), (1, 0.82518241210720711)]
val2 = [(0, 1.0)]

I am just curious to know why the tf-idf value of term 0 is 0.5648 in val1 and 1.0 in val2.

The documentation may help clear your confusion: http://radimrehurek.com/gensim/models/tfidfmodel.html

"I am just curious to know, why tf-idf value of term 0 is 0.5648 in val1 and 1.0 in val2."

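The output vectors are normalized to unit (Euclidean) length, so the weight of a term depends on the other terms in the same document. In val2 the document contains only term 0, so normalization scales its single weight to exactly 1.0. In val1 the unit length is shared between terms 0 and 1 (0.5649² + 0.8252² ≈ 1), so term 0 gets a smaller value. You can turn this off with the normalize constructor parameter, e.g. TfidfModel(corpus, normalize=False).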
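To make the effect of the normalization concrete, here is a minimal sketch in plain Python. It does not call gensim; the helper l2_normalize and the raw weights 3.0 and 4.0 are made up for illustration and are not the idf weights of the corpus in the question. It only shows that the same raw weight for term 0 comes out differently depending on what else is in the document.

```python
import math


def l2_normalize(vec):
    """Scale a sparse (term_id, weight) vector to unit Euclidean length."""
    norm = math.sqrt(sum(w * w for _, w in vec))
    if norm == 0.0:
        return list(vec)  # nothing to scale in an all-zero vector
    return [(term_id, w / norm) for term_id, w in vec]


# Hypothetical raw (unnormalized) tf-idf weights, chosen for easy arithmetic.
raw_two_terms = [(0, 3.0), (1, 4.0)]
raw_one_term = [(0, 3.0)]

val1 = l2_normalize(raw_two_terms)  # → [(0, 0.6), (1, 0.8)]
val2 = l2_normalize(raw_one_term)   # → [(0, 1.0)]

# The values reported in the question also form a unit-length vector.
observed = [(0, 0.56486634414605663), (1, 0.82518241210720711)]
observed_length = math.sqrt(sum(w * w for _, w in observed))  # approximately 1.0

print(val1, val2, observed_length)
```

Term 0 has the same raw weight in both documents, but a single-term document always normalizes to 1.0, while in the two-term document the length is split between both terms.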
