
tf-idf vectorizer's use_idf parameter explanation

What does the use_idf parameter do in TfidfVectorizer? The documentation doesn't give much explanation about it. Can someone explain it?

If use_idf is set to True (which is the default), inverse document frequency is taken into account during the transformation. As a result, tokens that appear in many documents are weighted as less informative than tokens that appear in only a few.

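If you set it to False, only term frequency (how often each term occurs in a document) is used. Note that the default L2 normalization (norm='l2') is still applied to each document vector in that case.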
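To make the difference concrete, here is a minimal sketch using scikit-learn's TfidfVectorizer. The three-document corpus and the helper name first_doc_weights are made up for illustration; all other settings are left at their defaults. In this corpus "the" occurs in every document, while "cat" occurs in only one.

```python
from sklearn.feature_extraction.text import TfidfVectorizer

# Toy corpus (an assumption for this example): "the" is in every document,
# every other word is in exactly one document.
docs = [
    "the cat sat",
    "the dog barked",
    "the bird sang",
]

def first_doc_weights(use_idf):
    """Hypothetical helper: return {term: weight} for the first document."""
    vec = TfidfVectorizer(use_idf=use_idf)
    matrix = vec.fit_transform(docs)
    row = matrix[0].toarray().ravel()
    # vocabulary_ maps each term to its column index in the matrix.
    return {
        term: float(row[col])
        for term, col in vec.vocabulary_.items()
        if row[col] > 0
    }

with_idf = first_doc_weights(use_idf=True)
without_idf = first_doc_weights(use_idf=False)

print("use_idf=True: ", with_idf)
print("use_idf=False:", without_idf)
```

With use_idf=False, "the", "cat" and "sat" each occur once in the first document, so they end up with the same weight. With use_idf=True, "the" gets a lower weight than "cat" and "sat" because it appears in all three documents.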

See the Wikipedia article on tf-idf for a good explanation.
