
Find a prefix based on trie search | python

Problem: Given a list of business_names (strings) and a searchTerm (string), return the list of business_names in which searchTerm is a prefix of the name or of any word in the name.

Example 1.
Input:

business_names[] = { "burger king", "McDonald's", "super duper burger's", "subway", "pizza hut"}
searchTerm = "bur"

Output:
["burger king", "super duper burger's"]

I have tried solving it the way shown below.

But I want to implement a trie approach to solve this problem. Could someone please help? https://www.geeksforgeeks.org/trie-insert-and-search/

Is there a linear-time solution?

def prefix(business_names, searchTerm):
    split = [i.split() for i in business_names]
    ans = []
    for name in split:
        for i in range(len(name)):
            query = ' '.join(name[i:])
            if query.startswith(searchTerm):
                ans.append(name)
                break
    return [' '.join(i) for i in ans]

I don't know what you mean by a trie approach, or whether this is what you need, but I would write it more simply, without join, range, or len.

To be safe, I also use lower() so that the comparison is case-insensitive.

business_names = ["burger king", "McDonald's", "super duper burger's", "subway", "pizza hut"]
searchTerm = "bur"


def prefix(business_names, searchTerm):
    searchTerm = searchTerm.lower()

    results  = []
    for name in business_names:
        for word in name.split(' '):
            word = word.lower()
            if word.startswith(searchTerm):
                results.append(name)
                break

    return results
    
print(prefix(business_names, searchTerm))
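One behavioural difference worth noting: the code in the question joins the remaining words before calling startswith, so a searchTerm that contains a space (for example "duper bur") can still match, while the per-word loop above cannot match it. If that matters, a minimal sketch that keeps it could look like this. The function name prefix_at_word_start is made up for illustration, and it assumes words are separated by single spaces.

```python
def prefix_at_word_start(business_names, search_term):
    """Return names in which search_term starts at the beginning of a word.

    Hypothetical helper: unlike the per-word loop, the term may span
    several words. Assumes words are separated by single spaces.
    """
    term = search_term.lower()
    results = []
    for name in business_names:
        lowered = name.lower()
        # Candidate start positions: index 0 and every index right after a space.
        starts = [0] + [i + 1 for i, ch in enumerate(lowered) if ch == " "]
        # str.startswith accepts an optional start offset, so no slicing is needed.
        if any(lowered.startswith(term, s) for s in starts):
            results.append(name)
    return results


business_names = ["burger king", "McDonald's", "super duper burger's", "subway", "pizza hut"]
print(prefix_at_word_start(business_names, "bur"))
print(prefix_at_word_start(business_names, "duper bur"))
```

This is still a scan over every name for every query, so it does not change the overall cost; it only changes what counts as a match.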

EDIT:

I took the code from the link and created the code below.

But I had to change two things.

  • it works only with the letters a-z, so I have to remove `'` and convert to lower()

  • it searches only for full words, but after removing `and pCrawl.isEndOfWord` from `return pCrawl != None and pCrawl.isEndOfWord` it seems to find words which start with searchTerm

But I have one doubt: maybe it can search faster than O(n^2), but first the Trie has to be built, and that also takes time. So it can be useful when you always search the same text and have to build the Trie only once. But here a separate Trie has to be built for every business name, so it is not necessarily faster.


class TrieNode: 
      
    # Trie node class 
    def __init__(self): 
        self.children = [None]*26
  
        # isEndOfWord is True if node represent the end of the word 
        self.isEndOfWord = False
  
class Trie: 
      
    # Trie data structure class 
    def __init__(self): 
        self.root = self.getNode() 
  
    def getNode(self): 
      
        # Returns new trie node (initialized to NULLs) 
        return TrieNode() 
  
    def _charToIndex(self,ch): 
          
        # private helper function 
        # Converts key current character into index 
        # use only 'a' through 'z' and lower case 
          
        return ord(ch)-ord('a') 
  
  
    def insert(self, key): 
          
        # If not present, inserts key into trie 
        # If the key is prefix of trie node,  
        # just marks leaf node 
        pCrawl = self.root 
        length = len(key) 
        for level in range(length): 
            index = self._charToIndex(key[level]) 
  
            # if current character is not present 
            if not pCrawl.children[index]: 
                pCrawl.children[index] = self.getNode() 
            pCrawl = pCrawl.children[index] 
  
        # mark last node as leaf 
        pCrawl.isEndOfWord = True
  
    def search(self, key): 
          
        # Search key in the trie 
        # Returns True if key is a prefix of any inserted word, 
        # else False 
        pCrawl = self.root 
        length = len(key) 
        for level in range(length): 
            index = self._charToIndex(key[level]) 
            if not pCrawl.children[index]: 
                return False
            pCrawl = pCrawl.children[index] 
  
        return pCrawl is not None  # add `and pCrawl.isEndOfWord` to match full words only
  
# driver function 

def prefix(business_names, searchTerm):
  
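    # normalize the same way as the keys below (lower case, no `'`),
    # because the trie only has slots for 'a' through 'z'
    searchTerm = searchTerm.lower().replace("'", "")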

    results  = []
    
    for name in business_names:
    
        # Input keys (use only 'a' through 'z' and lower case) 
        # remove `'`  and convert to list with lower case words
        keys = name.lower().replace("'", "").split(" ")
        #print('keys:', keys)
        
        # Trie object 
        t = Trie() 
  
        # Construct trie 
        for key in keys: 
            #print('key:', key)
            t.insert(key) 

        # Search in trie
        if t.search(searchTerm):
            results.append(name)        
        
    return results
  
if __name__ == '__main__': 

    business_names = ["burger king", "McDonald's", "super duper burger's", "subway", "pizza hut"]
    searchTerm = "bur"
    
    results = prefix(business_names, searchTerm)

    print( results )
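Regarding the doubt above about building a separate Trie for every business name: a rough sketch of the alternative is to build one trie over the words of all names, once, and remember at every node which names pass through it. The class and attribute names (NameTrie, name_ids) are my own and not from the linked article, and children is a dict instead of a 26-slot list so that `'` and other characters need no special handling.

```python
class TrieNode:
    def __init__(self):
        self.children = {}     # char -> TrieNode, so any character is allowed
        self.name_ids = set()  # indices of names that have a word passing through here


class NameTrie:
    """Hypothetical shared trie: built once, then queried many times."""

    def __init__(self, business_names):
        self.names = list(business_names)
        self.root = TrieNode()
        for idx, name in enumerate(self.names):
            for word in name.lower().split():
                self._insert(word, idx)

    def _insert(self, word, idx):
        node = self.root
        node.name_ids.add(idx)  # lets an empty search term match every name
        for ch in word:
            if ch not in node.children:
                node.children[ch] = TrieNode()
            node = node.children[ch]
            node.name_ids.add(idx)

    def prefix(self, search_term):
        node = self.root
        for ch in search_term.lower():
            node = node.children.get(ch)
            if node is None:
                return []  # no word starts with search_term
        # sorted() restores the original input order of the names
        return [self.names[i] for i in sorted(node.name_ids)]


business_names = ["burger king", "McDonald's", "super duper burger's", "subway", "pizza hut"]
trie = NameTrie(business_names)
print(trie.prefix("bur"))  # ['burger king', "super duper burger's"]
```

After the one-time build, a query walks at most len(searchTerm) nodes and then collects the stored indices, so it no longer scans every name. The price is extra memory for the per-node sets, and it only pays off when many searches run against the same list.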

