How to count different elements in a Vector using Java?
I have a lot of words at hand. What I need to do is save them and count how many times each different word occurs. The original data may contain duplicate words. At first I wanted to use a Set, which guarantees that I only get the distinct words. But how can I count their occurrences? Does anyone have a "clever" idea?
You can use Multiset from the Guava library. It keeps track of how many times each element has been added, and count(element) returns that number.
http://docs.guava-libraries.googlecode.com/git/javadoc/com/google/common/collect/Multiset.html
You can use a Map to solve this problem.
import java.util.HashMap;
import java.util.Map;

String sample = " I have a problem here. I have a lot of words at hand. What I need to do is to save them and count every different word. The original data may contains duplicate words.Firstly, I want to use Set, then I can guarantee that I only get the different wrods. But how can I count their times? Is there someone having any clever idea?";
// Split on whitespace and on the punctuation marks '.', ',' and '?'.
String[] array = sample.split("[\\s\\.,\\?]");
// Maps each distinct word to the number of times it occurs.
Map<String,Integer> statistic = new HashMap<String,Integer>();
for (String elem : array) {
    String trimElem = elem.trim();
    Integer count = 0;
    // Adjacent delimiters produce empty strings; skip them.
    if (!"".equals(trimElem)) {
        // Start from the stored count if the word was seen before.
        if (statistic.containsKey(trimElem)) {
            count = statistic.get(trimElem);
        }
        count++;
        statistic.put(trimElem, count);
    }
}
Maybe you could use hashing. In Java that could be a HashMap (or HashSet?): hash each word, and if the word has already been hashed, add 1 to the value associated with it.
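The hashing idea can be sketched with a HashMap and Map.merge, assuming Java 8 or later. This is only an illustration, not code from any of the answers: the class name WordCounter, the helper countWords, and the sample words are made up for the example.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;
import java.util.Vector;

final class WordCounter {
    // Hypothetical helper: returns a map from each distinct word to its number of occurrences.
    static Map<String, Integer> countWords(Iterable<String> words) {
        Map<String, Integer> counts = new HashMap<>();
        for (String word : words) {
            // merge() stores 1 for a word not seen yet, otherwise adds 1 to the stored count.
            counts.merge(word, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // Illustrative data only; a Vector works because it is Iterable.
        Vector<String> words = new Vector<>(Arrays.asList("save", "count", "save", "word", "save"));
        Map<String, Integer> counts = countWords(words);
        System.out.println(counts.get("save")); // prints 3
        System.out.println(counts.size());      // prints 3 (three distinct words)
    }
}
```

The key set of the resulting map is the set of distinct words, so a separate Set is not needed.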