
Count the number of unique words in a text file

I have a text file of roughly 40,000 words. All the words from the text file are saved in an ArrayList.

I want to find how many unique words there are in that file and return that value to the main class. So each unique word should increase the counter by one.

I would like the output to be:

   Amount of unique words: 7000

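I tried:

    public int antallOrd() {
        Set<Ord> unik = new HashSet<Ord>(ordListe);
        for (Ord unikt : unik) {
            // Prints how often each unique word occurs in the full list
            System.out.println(unikt + ": " + Collections.frequency(ordListe, unikt));
        }
        // the counter / return value is still missing here
    }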

but I didn't quite understand how to add a counter to this.

Thanks in advance.

You don't have to iterate through unik. It is a set, and putting all the words into it already removed the duplicates. The size of unik is the answer to your question.
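As a rough sketch of that idea: the example below uses plain String words instead of the asker's Ord class, and the class and method names (UniqueWordCounter, countUnique) are made up for illustration. If the list holds a custom class such as Ord, that class would need to override equals() and hashCode() for the set to treat equal words as duplicates.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper class; the name and the use of String instead of Ord are assumptions.
class UniqueWordCounter {

    // Returns the number of distinct words in the list.
    static int countUnique(List<String> words) {
        // Copying the list into a HashSet drops duplicates (based on equals/hashCode).
        Set<String> unique = new HashSet<>(words);
        return unique.size();
    }

    public static void main(String[] args) {
        // Small stand-in for the real word list read from the file.
        List<String> ordListe = new ArrayList<>(Arrays.asList("en", "to", "en", "tre", "to"));
        System.out.println("Amount of unique words: " + countUnique(ordListe));
    }
}
```

No explicit counter or loop is needed here, because the set's size() already is the count.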

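Put the words into a bag (multiset) and print the number of distinct entries. Note that the JDK has no java.util.Bag; Apache Commons Collections provides org.apache.commons.collections4.Bag, where size() counts every occurrence and uniqueSet().size() gives the number of unique words.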

If you also want to keep a count of each word, you could use a Hashtable (or HashMap) keyed on the word.
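A minimal sketch of the per-word count, assuming String words and using HashMap rather than Hashtable (the class and method names WordFrequencies and countEach are hypothetical):

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper class; names are chosen for illustration only.
class WordFrequencies {

    // Builds a map from each word to the number of times it occurs.
    static Map<String, Integer> countEach(List<String> words) {
        Map<String, Integer> counts = new HashMap<>();
        for (String word : words) {
            // merge() stores 1 for a new key, otherwise adds 1 to the existing count.
            counts.merge(word, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        List<String> ordListe = Arrays.asList("en", "to", "en", "tre", "to");
        Map<String, Integer> counts = countEach(ordListe);
        // The number of keys equals the number of unique words.
        System.out.println("Amount of unique words: " + counts.size());
        for (Map.Entry<String, Integer> entry : counts.entrySet()) {
            System.out.println(entry.getKey() + ": " + entry.getValue());
        }
    }
}
```

This makes a single pass over the list, whereas calling Collections.frequency for every unique word rescans the whole list each time.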

