
Updating TF-IDF using Gensim

Hi, I'm using Gensim to find the similarity between documents. To do so, I build a TF-IDF model of the documents and calculate cosine similarity. When I have a new document, I can calculate its similarity to the previous documents using index[tfidf[vec]], but this way the TF-IDF model isn't updated and the new document's words are not considered in the similarity calculation. Is there any way to update TF-IDF quickly without recalculating the whole matrix, or what is the best solution for my problem?

I don't think it's possible. When you add a new document to the corpus, the TF-IDF vocabulary can change, and the document count and per-term document frequencies change as well. Since every IDF value depends on those counts, all of the TF-IDF values change too, and the whole matrix has to be recalculated. But this link may be helpful for you.
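To see why every weight moves, here is a small pure-Python sketch. It assumes a plain log2(N / df) IDF formula for illustration (not necessarily the exact weighting your model is configured with), and idf_weights and the toy documents are hypothetical names and data.

```python
import math
from collections import Counter

def idf_weights(docs):
    """Return {term: log2(N / df)} for a list of tokenized documents."""
    n_docs = len(docs)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for doc in docs for term in set(doc))
    return {term: math.log2(n_docs / count) for term, count in df.items()}

docs = [
    ["human", "computer"],
    ["graph", "trees"],
    ["computer", "graph"],
]

before = idf_weights(docs)
after = idf_weights(docs + [["computer", "quantum"]])

# Terms from the old vocabulary whose IDF changed after adding one document.
changed = sorted(t for t in before if not math.isclose(before[t], after[t]))
```

In this toy case changed contains every term of the old vocabulary, including "trees" and "human", which do not occur in the new document at all: the document count N alone is enough to shift their weights.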


 