
Implementation of a Trie in Python

I implemented a Trie as a class in Python. The search and insert functions are clear, but now I am trying to write the __str__ method so that I can print the trie on the screen. My function doesn't work, though!

class Trie(object):
    def __init__(self):
        self.children = {}
        self.val = None

    def __str__(self):
        s = ''
        if self.children == {}: return ' | '
        for i in self.children:
            s = s + i + self.children[i].__str__()
        return s

    def insert(self, key, val):
        if not key:
            self.val = val
            return
        elif key[0] not in self.children:
            self.children[key[0]] = Trie()
        self.children[key[0]].insert(key[1:], val)

Now I create a Trie object:

tr = Trie()
tr.insert('hallo', 54)
tr.insert('hello', 69)
tr.insert('hellas', 99)

When I print the Trie, the problem is that the entries hello and hellas are incomplete:

print tr
hallo | ellas | o 

How can I solve this problem?

Why not have __str__ actually dump out the data in the format in which it is stored:

def __str__(self):
    if self.children == {}:
        s = str(self.val)
    else:
        s = '{'
        comma = False
        for i in self.children:
            if comma:
                s = s + ','
            else:
                comma = True
            s = s + "'" + i + "':" + self.children[i].__str__()
        s = s + '}'
    return s

Which results in:

{'h':{'a':{'l':{'l':{'o':54}}},'e':{'l':{'l':{'a':{'s':99},'o':69}}}}}

There are several issues you're running into. The first is that if you have several children at the same level, you'll only be prefixing one of them with the initial part of the string, and just showing the suffix of the others. Another issue is that you're only showing leaf nodes, even though you can have terminal values that are not at a leaf (consider what happens when you use both "foo" and "foobar" as keys into a Trie). Finally, you're not outputting the values at all.
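The second issue is easy to reproduce. Here is a minimal sketch that reuses the Trie class from the question (only the indentation is tidied up) with the keys foo and foobar:

```python
class Trie(object):
    def __init__(self):
        self.children = {}
        self.val = None

    def __str__(self):
        # Same logic as in the question: only leaf nodes emit anything extra.
        s = ''
        if self.children == {}:
            return ' | '
        for i in self.children:
            s = s + i + str(self.children[i])
        return s

    def insert(self, key, val):
        if not key:
            self.val = val
            return
        elif key[0] not in self.children:
            self.children[key[0]] = Trie()
        self.children[key[0]].insert(key[1:], val)


tr = Trie()
tr.insert('foo', 5)
tr.insert('foobar', 100)
print(repr(str(tr)))  # 'foobar | '
```

The key foo never shows up as an entry of its own, because the node that holds its value is not a leaf.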

To solve the first issue, I suggest using a recursive generator that does the traversal of the Trie. Separating the traversal from __str__ makes things easier since the generator can simply yield each value we come across, rather than needing to build up a string as we go. The __str__ method can assemble the final result easily using str.join.

For the second issue, you should yield the current node's key and value whenever self.val is not None, rather than only at leaf nodes. As long as you don't have any way to remove values, all leaf nodes will have a value, but we don't actually need any special casing to detect that.

And for the final issue, I suggest using string formatting to make a key:value pair. (I suppose you can skip this if you really don't need the values.)

Here's some code:

def traverse(self, prefix=""):
    if self.val is not None:
        yield "{}:{}".format(prefix, self.val)
    for letter, child in self.children.items():
        yield from child.traverse(prefix + letter)

def __str__(self):
    return " | ".join(self.traverse())

If you're using a version of Python before 3.3, you'll need to replace the yield from statement with an explicit loop to yield the items from the recursive calls:

for item in child.traverse(prefix + letter):
    yield item

Example output (the entries appear in dict iteration order, so the order can differ between Python versions):

>>> t = Trie()
>>> t.insert("foo", 5)
>>> t.insert("bar", 10)
>>> t.insert("foobar", 100)
>>> str(t)
'bar:10 | foo:5 | foobar:100'
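If you want the entries to come out in the same order on every run and every Python version, one option is to visit the children in sorted order. The following is a self-contained sketch under that assumption, not the exact code from above; the iterative insert is only there so the example runs on its own:

```python
class Trie:
    def __init__(self):
        self.children = {}
        self.val = None

    def insert(self, key, val):
        # Walk down one letter at a time, creating nodes as needed.
        node = self
        for letter in key:
            if letter not in node.children:
                node.children[letter] = Trie()
            node = node.children[letter]
        node.val = val

    def traverse(self, prefix=""):
        # Yield this node's entry first, then the entries of its children.
        if self.val is not None:
            yield "{}:{}".format(prefix, self.val)
        # Assumption for this sketch: sorted() fixes the order of siblings.
        for letter in sorted(self.children):
            yield from self.children[letter].traverse(prefix + letter)

    def __str__(self):
        return " | ".join(self.traverse())


t = Trie()
t.insert("foo", 5)
t.insert("bar", 10)
t.insert("foobar", 100)
print(t)  # bar:10 | foo:5 | foobar:100
```

Sorting costs a little extra at every node, so it is only worth it if a stable order matters to you.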

You could go with a simpler representation that just provides a summary of what the structure contains:

class Trie:

    def __init__(self):
        self.__final = False
        self.__nodes = {}

    def __repr__(self):
        return 'Trie<len={}, final={}>'.format(len(self), self.__final)

    def __getstate__(self):
        return self.__final, self.__nodes

    def __setstate__(self, state):
        self.__final, self.__nodes = state

    def __len__(self):
        return len(self.__nodes)

    def __bool__(self):
        return self.__final

    def __contains__(self, array):
        try:
            return self[array]
        except KeyError:
            return False

    def __iter__(self):
        yield self
        for node in self.__nodes.values():
            yield from node

    def __getitem__(self, array):
        return self.__get(array, False)

    def create(self, array):
        self.__get(array, True).__final = True

    def read(self):
        yield from self.__read([])

    def update(self, array):
        self[array].__final = True

    def delete(self, array):
        self[array].__final = False

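    def prune(self):
        # Remove every child subtree that no longer contains a final node.
        for key, value in tuple(self.__nodes.items()):
            if not value.prune():
                del self.__nodes[key]
        # Tell the caller whether this node is still worth keeping.
        return self.__final or bool(self.__nodes)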

    def __get(self, array, create):
        if array:
            head, *tail = array
            if create and head not in self.__nodes:
                self.__nodes[head] = Trie()
            return self.__nodes[head].__get(tail, create)
        return self

    def __read(self, name):
        if self.__final:
            yield name
        for key, value in self.__nodes.items():
            yield from value.__read(name + [key])

Instead of your current strategy for printing, I suggest the following:

Keep a list of all the characters you have traversed so far, in order. When descending to one of your children, push its character onto the end of the list. When returning, pop the last character off the list. When you are at a leaf node, print the contents of the list as a string.

Say you have a trie built out of hello and hellas. As you descend to hello, you build the list h, e, l, l, o, and at the leaf node you print hello. You then return once, which leaves h, e, l, l on the list, push a and s, and at the next leaf you print hellas. This way you re-print the letters from earlier in the tree rather than losing track of them and leaving them out.
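The push/pop strategy could look roughly like the sketch below. The helper name collect_words is made up for this example, and there is one small deviation from the description: it records a word whenever a node holds a value rather than only at leaves, so that a key which is a prefix of another key is not lost.

```python
class Trie:
    def __init__(self):
        self.children = {}
        self.val = None

    def insert(self, key, val):
        if not key:
            self.val = val
            return
        if key[0] not in self.children:
            self.children[key[0]] = Trie()
        self.children[key[0]].insert(key[1:], val)


def collect_words(node, path=None, words=None):
    """Hypothetical helper: return all stored keys using one shared path list."""
    if path is None:
        path = []
    if words is None:
        words = []
    if node.val is not None:
        # The path list currently spells the key that ends at this node.
        words.append(''.join(path))
    for letter, child in node.children.items():
        path.append(letter)           # push when descending
        collect_words(child, path, words)
        path.pop()                    # pop when returning
    return words


tr = Trie()
tr.insert('hallo', 54)
tr.insert('hello', 69)
tr.insert('hellas', 99)
print(' | '.join(collect_words(tr)))  # hallo | hello | hellas on Python 3.7+
```

Since the same list is shared by all recursive calls, no new prefix string is built at each level; the word is only joined when it is actually recorded.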

(Another possibility is to just descend the tree and, whenever you reach a leaf node, walk back up through your parent, your parent's parent, and so on, keeping track of the letters you encounter, then reverse that list and print it. But it may be less efficient.)
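For completeness, here is a rough sketch of that second idea. It assumes each node stores its own letter and a reference to its parent, which the Trie in the question does not do, so Node, insert, word_at and leaf_words are all hypothetical names for this example:

```python
class Node:
    def __init__(self, letter=None, parent=None):
        self.letter = letter    # letter on the edge leading to this node
        self.parent = parent    # None for the root
        self.children = {}
        self.val = None


def insert(root, key, val):
    node = root
    for letter in key:
        if letter not in node.children:
            node.children[letter] = Node(letter, node)
        node = node.children[letter]
    node.val = val


def word_at(node):
    # Walk up to the root collecting letters, then reverse them.
    letters = []
    while node.parent is not None:
        letters.append(node.letter)
        node = node.parent
    return ''.join(reversed(letters))


def leaf_words(node):
    # Yield the word for every leaf below node (the bare root is skipped).
    if not node.children and node.parent is not None:
        yield word_at(node)
    for child in node.children.values():
        yield from leaf_words(child)


root = Node()
insert(root, 'hello', 69)
insert(root, 'hellas', 99)
print(' | '.join(leaf_words(root)))  # hello | hellas on Python 3.7+
```

Every leaf walks the full path back to the root, so shared prefixes are traversed once per word, which is where the extra cost comes from.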
