
How to print out the nth most frequent words in a binary search tree?

This concerns "a software algorithm" https://stackoverflow.com/help/on-topic

I am currently writing a word counter dictionary program. To store the word counts, I am using a Binary Search Tree with the word as the key and the frequency as the value.

Here is my Binary Search Tree class

public class BinarySearchTree<AnyKey extends Comparable<? super AnyKey>, AnyValue>
    implements MyTreeMap<AnyKey, AnyValue> {

    protected BinaryNode<AnyKey, AnyValue> root;

    // Inserts (x, y) into the subtree rooted at t and returns the subtree's root.
    protected BinaryNode<AnyKey, AnyValue> insert(AnyKey x, AnyValue y,
            BinaryNode<AnyKey, AnyValue> t) {
        if (t == null)
            t = new BinaryNode<AnyKey, AnyValue>(x, y);
        else if (x.compareTo(t.element) < 0)
            t.left = insert(x, y, t.left);
        else if (x.compareTo(t.element) > 0)
            t.right = insert(x, y, t.right);
        else
            throw new IllegalArgumentException(x.toString()); // key already present
        return t;
    }

    // other methods omitted
}

And here's my node class

class BinaryNode<AnyKey, AnyValue> {
    BinaryNode(AnyKey theElement, AnyValue theValue) {
        element = theElement;
        value = theValue;
        left = right = null;
    }

    AnyKey element;
    AnyValue value;
    BinaryNode<AnyKey, AnyValue> left;
    BinaryNode<AnyKey, AnyValue> right;
}

I am trying to write this method inside my Binary Search Tree

@Override
public void PrintMostFrequent(int n) {

}

It should print out the nth most frequent words, based on frequency. I have an idea for how to do this, in pseudocode:
1. Create a collection to hold nodes
2. Add all the nodes from the tree to this collection
3. Sort the collection based on counts
4. Iterate sorted collection and print out the nth most frequent.
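
The four steps above can be sketched roughly as follows. This is only an illustration under some assumptions: the `Node` class is a hypothetical, non-generic stand-in for `BinaryNode<String, Integer>`, the names `SortAllSketch`, `collect` and `mostFrequent` are made up for the example, "nth most frequent" is read as "the top n words", and ties are broken alphabetically. It returns the lines instead of printing them so the result is easy to check.

```java
import java.util.ArrayList;
import java.util.List;

class SortAllSketch {
    // Hypothetical stand-in for BinaryNode<String, Integer>.
    static class Node {
        final String element; // the word
        final int value;      // its frequency
        Node left, right;
        Node(String element, int value) { this.element = element; this.value = value; }
    }

    // Plain BST insert keyed by word; a repeated word is ignored here for brevity.
    static Node insert(Node t, String word, int count) {
        if (t == null) return new Node(word, count);
        int cmp = word.compareTo(t.element);
        if (cmp < 0) t.left = insert(t.left, word, count);
        else if (cmp > 0) t.right = insert(t.right, word, count);
        return t;
    }

    // Steps 1 and 2: in-order traversal that copies every node into a list.
    static void collect(Node t, List<Node> out) {
        if (t == null) return;
        collect(t.left, out);
        out.add(t);
        collect(t.right, out);
    }

    // Steps 3 and 4: sort by count (highest first) and take the first n entries.
    static List<String> mostFrequent(Node root, int n) {
        List<Node> nodes = new ArrayList<>();
        collect(root, nodes);
        nodes.sort((a, b) -> {
            int byCount = Integer.compare(b.value, a.value);
            return byCount != 0 ? byCount : a.element.compareTo(b.element);
        });
        List<String> result = new ArrayList<>();
        for (int i = 0; i < Math.min(n, nodes.size()); i++) {
            result.add(nodes.get(i).element + " " + nodes.get(i).value);
        }
        return result;
    }
}
```

For a tree with m nodes this uses O(m) extra space and O(m log m) time for the sort, which is the cost the question is worried about.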

Is this the best way to solve this problem and write this method? I am afraid that creating a separate collection might be too expensive in terms of space, and that the sorting would be computationally expensive as well.

The method you describe is reasonable. It becomes costly once you also need to insert new words: inserting into the tree takes O(log n), inserting into the sorted list takes O(n) in the worst case, and searching the list again takes O(n).

For better performance when both inserting and searching for the nth most frequent node, one approach is to create a second BST keyed by frequency. Inserting a new node into both trees then takes O(log n), and searching takes O(log n) as well (assuming the trees stay balanced).

The above method stores redundant data, because the second tree holds both the word and the frequency. To avoid that, store only the frequency in the second BST, together with a reference to the word's node in the first BST. With this you can jump from one tree to the other at any point.
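
A rough sketch of the two-index idea, under stated assumptions: instead of two handwritten BSTs it uses `java.util.TreeMap` (a balanced red-black tree) for both indexes, and the second index stores the word itself rather than a reference to a node. The class and method names (`FrequencyIndexSketch`, `addWord`, `mostFrequent`) are hypothetical, and "nth most frequent" is read as "the top n words".

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.TreeMap;
import java.util.TreeSet;

class FrequencyIndexSketch {
    // First index: word -> frequency.
    private final TreeMap<String, Integer> counts = new TreeMap<>();
    // Second index: frequency -> words with that frequency, highest frequency first.
    private final TreeMap<Integer, TreeSet<String>> byCount =
            new TreeMap<>(Collections.reverseOrder());

    // Counts one more occurrence of word and keeps both indexes in sync.
    void addWord(String word) {
        Integer old = counts.get(word);
        int updated = (old == null) ? 1 : old + 1;
        counts.put(word, updated);
        if (old != null) {
            // Move the word out of its previous frequency bucket.
            TreeSet<String> bucket = byCount.get(old);
            bucket.remove(word);
            if (bucket.isEmpty()) byCount.remove(old);
        }
        byCount.computeIfAbsent(updated, k -> new TreeSet<>()).add(word);
    }

    // Walks the frequency index from the top and returns up to n words.
    List<String> mostFrequent(int n) {
        List<String> result = new ArrayList<>();
        for (TreeSet<String> bucket : byCount.values()) {
            for (String word : bucket) {
                if (result.size() == n) return result;
                result.add(word);
            }
        }
        return result;
    }
}
```

Each `addWord` call costs a few O(log n) tree operations, and reading the top n words only touches the front of the frequency index, so no full sort is needed at query time.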

A solution would be:

  1. Initialize a TreeSet<Node> result sorted by node word frequency.
  2. Add the first n elements from your tree to the set.
  3. Iterate through the rest of the elements, replacing the lowest value in the set whenever a higher one is found: if current > result.first(), then result.pollFirst(); result.add(current).

This has a limited space cost and should be faster, as most elements can be skipped directly.
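
A minimal sketch of the bounded `TreeSet` approach, with assumptions labeled: a `Map<String, Integer>` stands in for a traversal of the tree, and the names `TopNSketch` and `mostFrequent` are hypothetical. One detail matters in practice: a `TreeSet` treats elements that compare as equal as duplicates, so a comparator that looks only at frequency would silently drop words with the same count. The sketch therefore breaks ties by word.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.TreeSet;

class TopNSketch {
    // Returns up to n words, most frequent first; ties are ordered alphabetically.
    static List<String> mostFrequent(Map<String, Integer> counts, int n) {
        List<String> ordered = new ArrayList<>();
        if (n <= 0) return ordered;

        // Lower count sorts first; among equal counts the alphabetically later
        // word sorts first, so it is the first one to be evicted.
        Comparator<Map.Entry<String, Integer>> byCountThenWord = (a, b) -> {
            int byCount = Integer.compare(a.getValue(), b.getValue());
            return byCount != 0 ? byCount : b.getKey().compareTo(a.getKey());
        };

        TreeSet<Map.Entry<String, Integer>> result = new TreeSet<>(byCountThenWord);
        for (Map.Entry<String, Integer> current : counts.entrySet()) {
            if (result.size() < n) {
                result.add(current);               // step 2: fill up to n elements
            } else if (byCountThenWord.compare(current, result.first()) > 0) {
                result.pollFirst();                // step 3: evict the current lowest
                result.add(current);
            }
        }

        for (Map.Entry<String, Integer> e : result.descendingSet()) {
            ordered.add(e.getKey());
        }
        return ordered;
    }
}
```

The set never holds more than n entries, so for m words the extra space is O(n) and the running time is O(m log n).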

Note however, that unless you are dealing with huge arrays and have traced slowdowns to this function, your solution's simplicity makes it the better choice.
