What does the use_idf parameter do in TfidfVectorizer? The documentation doesn't give much explanation about it. Can someone explain it?
If use_idf is set to True (which is the default), inverse document frequency is taken into account during transformation. As a result, tokens that appear in many documents are weighted as less informative than tokens that appear in fewer documents.
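If you set it to False, only term frequency (the count of each word in a document) is used. Note that with the default norm='l2', each document's vector is still scaled to unit length, so the values are normalized counts rather than raw counts.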
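To make the difference concrete, here is a small sketch comparing the two settings. The three-document corpus is made up for illustration, and all other TfidfVectorizer parameters are left at their defaults (so this assumes smooth_idf=True and norm='l2'). The word "the" appears in every document, while "cat" appears in only one.

```python
from sklearn.feature_extraction.text import TfidfVectorizer

# Toy corpus (made up for illustration): "the" is in every document,
# "cat" is in only the first one.
docs = [
    "the cat sat",
    "the dog ran",
    "the bird flew",
]

# Same defaults for both vectorizers except use_idf.
with_idf = TfidfVectorizer(use_idf=True)
without_idf = TfidfVectorizer(use_idf=False)

X_idf = with_idf.fit_transform(docs).toarray()
X_tf = without_idf.fit_transform(docs).toarray()

# Both vectorizers see the same tokens, so the column indices match.
vocab = with_idf.vocabulary_
the_col, cat_col = vocab["the"], vocab["cat"]

# Weights for the first document, "the cat sat".
print("use_idf=True : the =", X_idf[0, the_col], " cat =", X_idf[0, cat_col])
print("use_idf=False: the =", X_tf[0, the_col], " cat =", X_tf[0, cat_col])
```

With use_idf=True, "the" should get a lower weight than "cat" in the first document because it occurs in all three documents. With use_idf=False, both words occur once in that document, so they get the same weight.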
Check this good explanation on Wikipedia.