
I'm making a word frequency counter and it's returning random words

I'm working on creating a program that will take an input text file and will print out the 10 most commonly used words and how many times they are used. However, it currently prints 10 random words, not ordered. Is there anything that I am missing?

    public void insert(E word) {
    if (word.equals("")) {
        return;
    }
    //Adds 2 temporary nodes, and sets first to the first one if first is empty
    Node temp = new Node(word);
    Node temp2;
    if (first == null) {
        first = temp;
    } else{
    for (Node temp6 = first; temp6 != null; temp6 = temp6.next) {
        if (temp6.key.equals(temp.key)) {
            temp6.count++;
            temp2 = temp6;
            Node parent = first;
            Node parent2 = first;
            while (parent != null) {
                if (parent.key.equals(word)) {
                    if (parent == first) {
                        first = first.next;
                    } else {
                        parent2.next = parent.next;
                    }

                }
                parent2 = parent;
                parent = parent.next;
            }
            //replaces first with temp2 if temp2's count is higher than first's
            if (temp2.count > first.count) {
                Node temp3 = first;
                first = temp2;
                first.next = temp3;
            } 
            //Adds 1 to the counter if the word is already in the linkedlist. Moves the node to the correct place and deletes the original node.
            for (Node temp4 = first.next; temp4 != null; temp4 = temp4.next){
                if(temp4.next.count < first.count){
                    Node temp5 = temp4.next;
                    temp4.next = temp2;
                    temp2.next = temp5;
                    break;
                }
            }
            return;
            }
        }
        current.next = temp;
    }
    current = temp;
}

At first glance, your approach seems more complex than it needs to be. That may be because your Node class does something that requires it. However, I would recommend using a Set. You can create a POJO called Word that contains a String word and an Integer count. If this POJO implements Comparable, you can @Override compareTo(Word w) and sort by count. Since a Set won't allow duplicates, you can either create a new Word for each new word you read in or simply increment the count of an existing Word. Once you finish reading the entire file, you just print out the first 10 objects in the sorted collection. The following example illustrates the idea.

class Word implements Comparable<Word>{
    String word;
    Integer count;

    Word(String w, Integer c) {
        this.word = w;
        this.count = c;
    }

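    @Override
    public String toString() {
        return word + " appeared " + count + " times.";
    }

    @Override
    public int compareTo(Word w) {
        // Sort by descending count so the most frequent words come first.
        // Integer.compare avoids the overflow risk of subtracting the counts.
        int byCount = Integer.compare(w.count, this.count);
        // Break ties on the word so that two different words with the same
        // count are not treated as equal by a sorted collection.
        return byCount != 0 ? byCount : this.word.compareTo(w.word);
    }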
}

public class TestTreeMap {
    public static void main(String[] args) {
        //Add logic here for reading in from file and ...
    }
}
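
For what it's worth, here is a minimal sketch of the logic the main method above leaves out. It is only one way to do it, not a definitive implementation: it counts in a HashMap first and sorts a list afterwards rather than keeping a Set sorted while reading. The names TopWordsSketch, WordEntry and topWords are hypothetical, and reading the file is left out, so the text is passed in as a String.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class TopWordsSketch {

    // Hypothetical stand-in for the Word POJO: a word plus how often it occurred.
    static class WordEntry implements Comparable<WordEntry> {
        final String word;
        final int count;

        WordEntry(String word, int count) {
            this.word = word;
            this.count = count;
        }

        @Override
        public int compareTo(WordEntry other) {
            // Highest count first; ties are ordered alphabetically by word.
            int byCount = Integer.compare(other.count, this.count);
            return byCount != 0 ? byCount : this.word.compareTo(other.word);
        }

        @Override
        public String toString() {
            return word + " appeared " + count + " times.";
        }
    }

    // Returns at most `limit` entries, most frequent first.
    static List<WordEntry> topWords(String text, int limit) {
        Map<String, Integer> counts = new HashMap<>();
        for (String token : text.trim().split("\\s+")) {
            if (!token.isEmpty()) {
                // Insert 1 for a new word, otherwise add 1 to the stored count.
                counts.merge(token, 1, Integer::sum);
            }
        }

        List<WordEntry> entries = new ArrayList<>();
        for (Map.Entry<String, Integer> e : counts.entrySet()) {
            entries.add(new WordEntry(e.getKey(), e.getValue()));
        }
        Collections.sort(entries); // uses compareTo above

        return entries.subList(0, Math.min(limit, entries.size()));
    }

    public static void main(String[] args) {
        // Assumption: in the real program this text would come from the input file.
        String text = "the cat and the hat and the bat";
        for (WordEntry entry : topWords(text, 10)) {
            System.out.println(entry);
        }
    }
}
```

Counting first and sorting once at the end avoids having to move an element every time its count changes, which is where the linked-list version in the question gets complicated.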

Anyway, I hope this answer helps to point you in the right direction. As a side note, I tend to try and find the simplest solution, as the more clever we get the more unmaintainable our code becomes. Good luck!

Here is how we can do it using collections:

import java.util.Collections;
import java.util.Map;
import java.util.SortedMap;
import java.util.TreeMap;

class WordCount {

    public static void main(String[] args) {
        // Hard-coded to keep the example simple; replace with the text read from the file.
        String sentence = "Returns a key value mapping associated with the least key greater than or   equal to the given key";
        // Split on runs of whitespace so repeated spaces do not produce empty words.
        String[] array = sentence.split("\\s+");

        // Stores each word and its count as we read them.
        SortedMap<String, Integer> ht = new TreeMap<String, Integer>();

        for (String s : array) {
            if (ht.containsKey(s)) {
                ht.put(s, ht.get(s) + 1);
            } else {
                ht.put(s, 1);
            }
        }

        // Impose the reverse of the natural ordering on this map (highest count first).
        SortedMap<Integer, String> ht1 = new TreeMap<Integer, String>(Collections.reverseOrder());

        for (Map.Entry<String, Integer> entrySet : ht.entrySet()) {
            // Use the count as the key in this map.
            ht1.put(entrySet.getValue(), entrySet.getKey());
        }

        int firstTen = 0;
        for (Map.Entry<Integer, String> entrySet : ht1.entrySet()) {
            if (firstTen == 10)
                break;
            System.out.println("Word-" + entrySet.getValue() + " number of times-" + entrySet.getKey());
            firstTen++;
        }
    }
}

There is one problem here: if two words have the same frequency, only one of them appears in the output, because the count is used as the map key.
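
A small sketch of why that happens, assuming the same kind of count-keyed TreeMap as above (the class name DuplicateCountDemo and the sample words are made up for illustration): putting a second word under a count that is already present replaces the first one.

```java
import java.util.Collections;
import java.util.SortedMap;
import java.util.TreeMap;

class DuplicateCountDemo {

    // Builds a count -> word map the same way the inverted map above is built.
    static SortedMap<Integer, String> invert() {
        SortedMap<Integer, String> byCount =
                new TreeMap<Integer, String>(Collections.reverseOrder());
        byCount.put(2, "key");
        byCount.put(2, "the"); // same key (2), so this replaces "key"
        byCount.put(1, "or");
        return byCount;
    }

    public static void main(String[] args) {
        // Only one word survives for the count 2.
        System.out.println(invert());
    }
}
```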

So I ended up modifying it again, as shown below:

import java.util.HashMap;
import java.util.Map;
import java.util.SortedMap;
import java.util.TreeMap;

class WordCount1 {
    public static void main(String... args) {
        String sentence = "Returns a key value mapping mapping the mapping key the than or equal to the or key";
        // Split on runs of whitespace so repeated spaces do not produce empty words.
        String[] array = sentence.split("\\s+");

        // Plain map used for counting; the comparator looks counts up in it.
        Map<String, Integer> hm = new HashMap<String, Integer>();
        ValueComparator vc = new ValueComparator(hm);
        // Sorted view, ordered by the counts stored in hm.
        SortedMap<String, Integer> ht = new TreeMap<String, Integer>(vc);

        for (String s : array) {
            if (hm.containsKey(s)) {
                hm.put(s, hm.get(s) + 1);
            } else {
                hm.put(s, 1);
            }
        }

        // Copy only after counting is finished, so the ordering never changes
        // while the entries are inside the TreeMap.
        ht.putAll(hm);

        int firstTen = 0;
        for (Map.Entry<String, Integer> entrySet : ht.entrySet()) {
            if (firstTen == 10)
                break;
            System.out.println("Word-" + entrySet.getKey() + " number of times-" + entrySet.getValue());
            firstTen++;
        }
    }
}

The ValueComparator is taken from here and tweaked a little, as shown below:

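import java.util.Comparator;
import java.util.Map;

public class ValueComparator implements Comparator<String> {
    Map<String, Integer> entry;

    public ValueComparator(Map<String, Integer> entry) {
        this.entry = entry;
    }

    @Override
    public int compare(String a, String b) {
        // Higher counts sort first. Integer.compare works on the unboxed values,
        // unlike ==, which compares Integer object references.
        int byCount = Integer.compare(entry.get(b), entry.get(a));
        // Break ties on the word itself, so words with the same count stay
        // distinct keys and compare() returns 0 only for the same word.
        return byCount != 0 ? byCount : a.compareTo(b);
    }
}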

This program is case sensitive. If you need case-insensitive behavior, just convert the strings to lowercase before putting them into the Map.
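
A minimal sketch of that normalization step, assuming plain whitespace-separated input (the class name CaseInsensitiveCount and the method count are hypothetical, and Locale.ROOT is my own choice to keep the lowercasing independent of the machine's default locale):

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

class CaseInsensitiveCount {

    // Counts words while treating "Key", "key" and "KEY" as the same word.
    static Map<String, Integer> count(String sentence) {
        Map<String, Integer> counts = new HashMap<>();
        for (String s : sentence.trim().split("\\s+")) {
            if (s.isEmpty()) {
                continue; // happens only when the input is blank
            }
            // Normalize before the word is used as a map key.
            String key = s.toLowerCase(Locale.ROOT);
            counts.merge(key, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        System.out.println(count("Key key KEY the The"));
    }
}
```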
