What can be a faster implementation of a trie for the following use case?

I am trying to solve this problem: essentially, we need to find all words in a dictionary that start with a given prefix, returned in lexicographic order.

I am using a trie for the task, but my solution times out on the judge. What would be a more efficient or faster way to solve this problem?

My current implementation is:

import java.util.ArrayList;

class trie{
    node root=new node();
    class node{
        node child[]=new node[26];
        boolean is_leaf=false;
    }

    public void add(char c[])
    {
        node root=this.root;
        int pos=0,c1=0;
        while(pos<c.length)
        {
            c1=c[pos]-'a';
            if(root.child[c1]==null)
            {
                root.child[c1]=new node();
            }
            root=root.child[c1];
            pos++;
        }
        root.is_leaf=true;
    }
    public ArrayList<String> search(String s)
    {
        char c[]=s.toCharArray();
        node root=this.root;
        int pos=0,c1=0;
        while(pos<c.length)
        {
            c1=c[pos]-'a';
            if(root.child[c1]==null)
            {
                // No word has this prefix: return early instead of allocating new nodes.
                return new ArrayList<String>();
            }
            root=root.child[c1];
            pos++;
        }
        ArrayList<String> ans=new ArrayList<>();
        build_recursive(root,s,new StringBuilder(),ans);
        return ans;

    }
    public void build_recursive(node root,String prefix,StringBuilder cur, ArrayList<String> ans)
    {
        if(root.is_leaf&&cur.length()!=0)
        {
            String s=prefix+cur.toString();
            ans.add(s);
        }

        for(int i=0;i<26;i++)
        {
            if(root.child[i]!=null)
            {
                char c=(char) (i+'a');
                cur.append(c);
                build_recursive(root.child[i], prefix, cur, ans);
                cur.deleteCharAt(cur.length()-1);

            }
        }
    }

}

The search function returns the sorted list of all words that share the given prefix.

Also, is there a better data structure I could use for this?

Tries are great at finding a substring of another string. However, you are searching for words in a dictionary, so substring matching is not really necessary. Also, if the words are kept in sorted order, then once you find the first word with the prefix, the next match, if it exists, will be right next to it. No complex search required!

Tries also carry a lot of overhead from being built out of nodes, which then need to be referenced with pointers (= extra space requirements). Following pointers is slow, because the nodes they lead to can be scattered across memory. In C++, iterating linked lists can be 20x slower than iterating arrays, unless the nodes are all nicely ordered.

This problem can very probably be solved by:

  • reading all words into an ArrayList of String: O(n), with n = number of words
  • sorting the ArrayList: O(n log n)
  • and for each prefix query,
    • using binary search to find the first match for the prefix: O(log n), and it is already implemented in the standard library
    • returning consecutive elements that match until matches are exhausted: O(m), with m = number of matches
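
The steps above could be sketched roughly as follows. This is a minimal illustration, not the judge-ready solution: the class name PrefixSearch and its methods are hypothetical, and it assumes the dictionary fits in memory. Unlike the trie code in the question, this sketch also returns the prefix itself when it is a dictionary word.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical helper class (not from the original post).
class PrefixSearch {
    private final List<String> words; // kept in lexicographic order

    PrefixSearch(List<String> dictionary) {
        words = new ArrayList<>(dictionary); // copy into an array-backed list: O(n)
        Collections.sort(words);             // O(n log n) comparisons
    }

    // Returns every word that starts with the prefix, in lexicographic order.
    List<String> search(String prefix) {
        // binarySearch returns the index of a match, or -(insertionPoint) - 1
        // when the prefix itself is not in the list.
        int i = Collections.binarySearch(words, prefix);
        if (i < 0) {
            i = -i - 1; // first word greater than the prefix
        } else {
            // With duplicates, any equal element may be found; step back to the first.
            while (i > 0 && words.get(i - 1).equals(prefix)) {
                i--;
            }
        }
        List<String> result = new ArrayList<>();
        // Matches are contiguous in a sorted list, so scan until the first non-match.
        while (i < words.size() && words.get(i).startsWith(prefix)) {
            result.add(words.get(i));
            i++;
        }
        return result;
    }

    public static void main(String[] args) {
        PrefixSearch ps = new PrefixSearch(
                Arrays.asList("banana", "app", "apple", "apply", "bat", "ape"));
        System.out.println(ps.search("app")); // prints [app, apple, apply]
    }
}
```

The scan stops at the first word that does not start with the prefix, so each query touches only the matches plus one extra element.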

This is at least competitive with a trie on theoretical complexity, and a lot faster in practice due to memory layout: messing with pointers when you don't need to is expensive.
