I'm trying to get the five most-used words from a chunk of text. I have built a map of words whose values record how many times each word has been used.
Map<String,Integer> wordHits = new HashMap<String,Integer>();
for(Status status3 : statuses){
    String mdry = status3.getText();
    String[] statusSplitOnSpace = mdry.split(" ");
    // One pass over the words; a second nested loop over the same array would count each word several times
    for(String str : statusSplitOnSpace){
        if(doesListContainWord(str)){
            incrementKeyofWordInList(str);
        }else{
            if(doesWordCountAsAWord(str)){
                addNewWordToList(str);
            }
        }
    }
}
// Assuming the helper methods above update wordHits; the values are Integer counts, not Strings
for (Map.Entry<String,Integer> entry : wordHits.entrySet()){
    System.out.println("Word (" + entry.getKey() + ") was found " + entry.getValue() + " times.");
}
Assuming you have your words stored in an array, I would first transfer them to a Map. I believe you were trying to do that, but it is hard to tell from your variable names. After that, you can write a custom Comparator and use it to sort your Map. You can do something like this:
import java.util.Comparator;
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

public class Solution {
    public static void main(String[] args){
        String[] words = {"word1", "word1", "word2", "word3", "word4", "word5", "word5"};
        Map<String, Integer> wordCounts = new HashMap<>();
        for (String word : words){ //Transfer your words to a map
            if (wordCounts.containsKey(word)){ //If word is already in map, increase value
                wordCounts.put(word, wordCounts.get(word)+1);
            }else{ //If word is not in map, add it to the map
                wordCounts.put(word, 1);
            }
        }
        TreeMap<String, Integer> sortedWordCounts = new TreeMap<>(new ValueComparator(wordCounts)); //Sorts by count, highest first
        sortedWordCounts.putAll(wordCounts); //Add to new map
        int printed = 0;
        for (String key : sortedWordCounts.keySet()){ //Keys come back highest count first
            if (printed == 5) break; //Stop after the top 5, or earlier if there are fewer words
            System.out.println(key);
            printed++;
        }
    }
}
class ValueComparator implements Comparator<String>{
    private final Map<String,Integer> map;
    public ValueComparator(Map<String,Integer> map){
        this.map = map;
    }
    @Override
    public int compare(String o1, String o2) {
        int byCount = map.get(o2).compareTo(map.get(o1)); //Higher counts sort first
        //Break ties by key, so only identical words compare as equal (return 0)
        return byCount != 0 ? byCount : o1.compareTo(o2);
    }
}
Output
word1
word5
word2
word3
word4
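A TreeMap is a type of Map that keeps its entries sorted according to the Comparator you initialize it with. If you do not give it a Comparator, it sorts by the keys' natural order, which is not what we want here. We want to sort by the values, so you have to write your own Comparator. Keep in mind that a TreeMap treats two keys as the same entry whenever the Comparator returns 0, so a value-based Comparator needs a tie-breaker for words with equal counts.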
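If you would rather not tie a TreeMap's ordering to the values, another option is to copy the entries into a List and sort that instead. This is only a minimal sketch, assuming Java 8 or later; the class name TopWords and the method topN are made up for illustration and are not part of the code above.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class TopWords {
    // Hypothetical helper: returns up to n words, highest count first, ties in alphabetical order.
    static List<String> topN(Map<String, Integer> counts, int n) {
        List<Map.Entry<String, Integer>> entries = new ArrayList<>(counts.entrySet());
        entries.sort(Map.Entry.<String, Integer>comparingByValue(Comparator.reverseOrder())
                .thenComparing(Map.Entry.<String, Integer>comparingByKey()));
        List<String> top = new ArrayList<>();
        // subList is capped so a map with fewer than n words does not throw
        for (Map.Entry<String, Integer> e : entries.subList(0, Math.min(n, entries.size()))) {
            top.add(e.getKey());
        }
        return top;
    }

    public static void main(String[] args) {
        Map<String, Integer> counts = new HashMap<>();
        for (String w : new String[]{"word1", "word1", "word2", "word3", "word4", "word5", "word5"}) {
            counts.merge(w, 1, Integer::sum); // add 1, or start at 1 if the word is new
        }
        System.out.println(topN(counts, 5));
    }
}
```

With the sample words this prints [word1, word5, word2, word3, word4]. Sorting a copy of the entries leaves the original map untouched and avoids a Comparator that looks values up in another map.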
Here's a more novice-level "manual" approach. I didn't test it, but it should be close. It assumes wordCountMap is the Map<String, Integer> of words to counts.
// Get sorted Lists of words and counts from the source Map
List<String> sortedWordsList = new ArrayList<String>();
List<Integer> sortedCountsList = new ArrayList<Integer>();
for( String word : wordCountMap.keySet() )
{
Integer wordCount = wordCountMap.get(word);
int insertIndex=0;
for( int i=0; i != sortedCountsList.size(); ++i )
{
if( wordCount > sortedCountsList.get(i) ) break;
++insertIndex;
}
sortedWordsList.add( insertIndex, word );
sortedCountsList.add( insertIndex, wordCount );
}
// Move top 5 words into a new List
final int TOP_WORDS_TO_FIND_COUNT = 5;
List<String> topWordsList = new ArrayList<String>();
for( int i=0; i != sortedWordsList.size(); ++i )
{
topWordsList.add( i, sortedWordsList.get(i) );
if( i == TOP_WORDS_TO_FIND_COUNT-1 ) break;
}
// Move top 5 counts into a new List
List<Integer> topCountsList = new ArrayList<Integer>();
for( int i=0; i != sortedCountsList.size(); ++i )
{
topCountsList.add( i, sortedCountsList.get(i) );
if( i == TOP_WORDS_TO_FIND_COUNT-1 ) break;
}
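The manual loops above keep every word in sorted order even though only five are needed. As a rough sketch of the same idea with less bookkeeping, a small min-heap can hold just the current top n while you walk the map once. The names TopWordsHeap and topN are hypothetical, and the order of words with equal counts is not defined here.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.PriorityQueue;

public class TopWordsHeap {
    // Hypothetical helper: returns up to n words, highest count first.
    static List<String> topN(Map<String, Integer> wordCountMap, int n) {
        List<String> top = new ArrayList<>();
        if (n <= 0) return top;
        // Min-heap on count: the head is always the weakest of the words kept so far.
        PriorityQueue<Map.Entry<String, Integer>> heap =
                new PriorityQueue<>((a, b) -> a.getValue().compareTo(b.getValue()));
        for (Map.Entry<String, Integer> entry : wordCountMap.entrySet()) {
            heap.offer(entry);
            if (heap.size() > n) heap.poll(); // drop the current lowest count
        }
        while (!heap.isEmpty()) top.add(heap.poll().getKey()); // comes out lowest count first
        Collections.reverse(top); // flip to highest count first
        return top;
    }

    public static void main(String[] args) {
        Map<String, Integer> wordCountMap = new HashMap<>();
        wordCountMap.put("the", 9);
        wordCountMap.put("cat", 5);
        wordCountMap.put("sat", 3);
        wordCountMap.put("mat", 1);
        System.out.println(topN(wordCountMap, 2)); // the two words with the highest counts
    }
}
```

The heap never grows beyond n + 1 entries, so this stays cheap even when the map holds many distinct words.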