
Need to count the frequency of each term in a document

I need to calculate the frequency of every term in a document. How can I do that? I am not asking for code, just for guidance. I am computing the similarity between a document and a query. I have already calculated the term frequencies for the query, but I do not know how to calculate the term frequency for each word in the document. Can anyone guide me? Thank you for your attention.

You can use a HashMap, where the key is the term and the value is its frequency. Each time you see a term, you increment its value. Once the whole file has been processed, you have your counts.

Yes, use a HashMap to hold the counts and read through the file; you can use a Scanner for that.

In Java, you should definitely stick with a HashMap<String, Integer>. The terms are the HashMap keys and the term frequencies are the values.
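Although the question asks for guidance rather than code, the approach in the answers above can be sketched roughly as follows. This is only a minimal illustration, not a definitive implementation: the class name TermFrequency, the method name countTerms, the lower-casing of terms, and the rule that any run of non-letter, non-digit characters separates terms are all assumptions made for the example.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Scanner;

// Hypothetical helper class, named only for this example.
class TermFrequency {

    // Returns a map from each term in the text to the number of times it occurs.
    static Map<String, Integer> countTerms(String text) {
        Map<String, Integer> frequencies = new HashMap<>();
        try (Scanner scanner = new Scanner(text)) {
            // Assumption: any run of characters that are neither letters
            // nor digits separates two terms.
            scanner.useDelimiter("[^\\p{L}\\p{N}]+");
            while (scanner.hasNext()) {
                // Assumption: terms are compared case-insensitively.
                String term = scanner.next().toLowerCase(Locale.ROOT);
                if (term.isEmpty()) {
                    continue; // defensive guard against empty tokens
                }
                // Insert 1 for a new term, otherwise add 1 to the stored count.
                frequencies.merge(term, 1, Integer::sum);
            }
        }
        return frequencies;
    }

    public static void main(String[] args) {
        Map<String, Integer> tf = countTerms("The cat saw the dog. The dog ran.");
        System.out.println(tf.get("the")); // prints 3
        System.out.println(tf.get("dog")); // prints 2
    }
}
```

To count terms in a file instead of a string, the Scanner could be constructed from a java.nio.file.Path, with the resulting IOException handled. If the query's frequencies were computed with a different tokenization, the same splitting and case rules should be applied to the document so that the two maps are comparable.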
