
Unable to print nodes in a Trie in Python

I have two doubts regarding the below implementation of a trie data structure.

Doubt 1:

I am having a hard time understanding the insert function in a trie. This is the insert word function:

def add(self, word):
    cur = self.head
    for ch in word:
        if ch not in cur:
            cur[ch] = {}
        cur = cur[ch]
    # * denotes the Trie has this word as item
    # if * doesn't exist, Trie doesn't have this word but as a path to longer word
    cur['*'] = True

Why is an empty dictionary created inside the if statement?

Also, what is the significance of cur = cur[ch] ?

Please help me understand those lines in the if statement in the code.

Doubt 2:

I am trying to print all the nodes in the trie, but print only shows the object itself, like <__main__.Trie object at 0x7f655de1c9e8>. Can someone please help me print the nodes of the trie?

Below is the code.

class Trie:
    head = {}

    def add(self, word):
        cur = self.head
        for ch in word:
            if ch not in cur:
                cur[ch] = {}
            cur = cur[ch]
        # * denotes the Trie has this word as item
        # if * doesn't exist, Trie doesn't have this word but as a path to longer word
        cur['*'] = True

    def search(self, word):
        cur = self.head
        for ch in word:
            if ch not in cur:
                return False
            cur = cur[ch]

        if '*' in cur:
            return True
        else:
            return False

dictionary = Trie()
dictionary.add("hi")
dictionary.add("hello")
print(dictionary)  # <__main__.Trie object at 0x7f655de1c9e8>

1) The if statement checks whether the current node already has a child dictionary for the given character; if it does not, an empty one is created. cur = cur[ch] then moves cur one level deeper, into that character's dictionary, so that the next character of the word is inserted there.

2) To display the contents of the Trie, add a __str__ method to Trie.

For example:

def __str__(self):
    # One simple option: return the string form of the nested dict
    return str(self.head)

Doubt 1

Initially, you have a plain old empty dictionary, head = {}. Your first word is "cat". You create a reference to head called cur. This is necessary since we'll be traversing the structure and would lose our reference to the outermost head if we didn't use a temporary variable. Because cur and head refer to the same dictionary, modifications made through cur show up in head. We need to step into the "c" key of the dict for the first letter of "cat" (ch = "c"), but this key doesn't exist yet, so we create it. head/cur now looks like:

head = {"c": {}}
cur = head

Then, the line cur = cur[ch] executes. ch is "c", so this is the same as cur = cur["c"]. cur has just moved down a level of the trie, and the for loop steps to the next character, which is "a". We're back to the same scenario: cur is {} and we need to add the "a" key, so we do:

head = {"c": {"a": {}}}
cur = head["c"]

cur = cur["a"] runs and the same thing repeats for the next iteration:

head = {"c": {"a": {"t": {}}}}
cur = head["c"]["a"]

Finally the loop ends, we set the flag key "*", and we're done adding "cat". Our result is:

head = {"c": {"a": {"t": {"*": True}}}}
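The walkthrough above depends on cur and head being two names for the same dict objects. A minimal sketch to check that directly, using plain dicts rather than the Trie class (the variable names simply follow the walkthrough):

```python
# Plain dicts stand in for the trie; no Trie class is needed to see the aliasing.
head = {}
cur = head                # cur and head are two names for the same dict

cur["c"] = {}             # adding a key through cur...
print(head)               # ...is visible through head: {'c': {}}
print(cur is head)        # True

cur = cur["c"]            # rebinding cur moves it one level down
print(cur is head["c"])   # True: cur now names the inner dict
cur["a"] = {}             # so this grows the nested structure
print(head)               # {'c': {'a': {}}}
```

Note that cur = cur["c"] only rebinds the name cur; it never modifies or replaces head, which is why head still points at the outermost dict at the end.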

Now, let's call trie.add("cart"). I'll just show the updates:

head = {"c": {"a": {"t": {"*": True}}}}
cur = head

head = {"c": {"a": {"t": {"*": True}}}}
cur = head["c"]

head = {"c": {"a": {"t": {"*": True}}}}
cur = head["c"]["a"]

head = {
    "c": {
        "a": {
            "r": {},
            "t": {"*": True}
        }
    }
}
cur = head["c"]["a"]["r"]

head = {
    "c": {
        "a": {
            "r": {
                "t": {}
            },
            "t": {"*": True}
        }
    }
}
cur = head["c"]["a"]["r"]["t"]

Finally:

head = {
    "c": {
        "a": {
            "r": {
                "t": {"*": True}
            },
            "t": {"*": True}                 
        }
    }
}

We've created an n-ary tree-like data structure (since there are multiple "roots", it's not exactly a tree, but by adding a dummy root node with head's contents as its children it'd be a legitimate tree).

Hopefully this makes sense. Try adding "car" next and see what happens.
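For reference, here is a minimal sketch of that experiment. It assumes a standalone add function over a plain dict (a hypothetical helper that mirrors the add method from the question) so that the snippet runs on its own:

```python
def add(head, word):
    """Insert word into the nested-dict trie rooted at head (mirrors Trie.add)."""
    cur = head
    for ch in word:
        if ch not in cur:
            cur[ch] = {}
        cur = cur[ch]
    cur["*"] = True

head = {}
add(head, "cat")
add(head, "cart")
add(head, "car")   # every letter already exists, so no new node is created

print(head["c"]["a"]["r"])   # {'t': {'*': True}, '*': True}
```

Since "car" is a prefix of "cart", the if branch never fires; the only change is the "*" flag set on the existing "r" node, which now marks both the end of a word and a path to a longer one.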


Doubt 2

When you print an object, print calls the object's magic __str__ method. If the class doesn't define one, it inherits the default from object, which falls back to __repr__ and produces just the class name and the memory address of the object. This is useful for comparing object references quickly, but if you want to show the object's data, you need to implement __str__ yourself. Probably the easiest way for your purposes is to dump the head dict to a string:

import json

class Trie:
    def __str__(self):
        return json.dumps(self.head, sort_keys=True, indent=4)

A bit ironically, had you been able to pretty-print the trie, doubt 1 would be easier to resolve by dumping the structure inside the loop.
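As a rough sketch of that idea, the function below repeats the add logic from the question and dumps the whole structure after each character. The name add_traced and the use of a plain dict instead of the Trie class are assumptions made to keep the example self-contained:

```python
import json

def add_traced(head, word):
    """Same logic as add in the question, printing the trie after each character."""
    cur = head
    for ch in word:
        if ch not in cur:
            cur[ch] = {}
        cur = cur[ch]
        # Dump the outermost dict, not cur, to see the whole structure grow.
        print(f"after {ch!r}: {json.dumps(head)}")
    cur["*"] = True
    print(f"done {word!r}: {json.dumps(head)}")

head = {}
add_traced(head, "cat")
```

Each printed line should show one more level of nesting than the previous one, which matches the step-by-step states described under Doubt 1.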


Other remarks

  • Give Trie an initializer so that head is an instance attribute rather than a class attribute shared by all instances.

  • Code like

    if '*' in cur:
        return True
    else:
        return False

    is poor style. Simply return '*' in cur.

  • cur['*'] = True is a brittle design that will lead to bugs for words with "*" characters in them. Prefer a key like None that cannot possibly be a single character in a string.
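The first and last remarks can be reproduced with a short sketch. The class below is a copy of the question's Trie under the hypothetical name SharedTrie, left unfixed on purpose so that both problems show up:

```python
class SharedTrie:
    """Copy of the question's Trie, kept as-is to show both problems."""
    head = {}  # class attribute: a single dict shared by every instance

    def add(self, word):
        cur = self.head
        for ch in word:
            if ch not in cur:
                cur[ch] = {}
            cur = cur[ch]
        cur['*'] = True

    def search(self, word):
        cur = self.head
        for ch in word:
            if ch not in cur:
                return False
            cur = cur[ch]
        return '*' in cur

first = SharedTrie()
second = SharedTrie()
first.add("hi")
print(second.search("hi"))   # True, although nothing was added to second

first.add("a*")              # the literal '*' becomes a child key of "a"
print(first.search("a"))     # True, although "a" was never added
```

In the second case the child node for the character "*" is stored under the same key as the end-of-word flag, so the trie cannot tell the two apart.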

Cleanup

import json

class Trie:
    end_mark = None

    def __init__(self):
        self.head = {}

    def add(self, word):
        cur = self.head

        for ch in word:
            if ch not in cur:
                cur[ch] = {}

            cur = cur[ch]

        cur[Trie.end_mark] = True

    def __contains__(self, word):
        cur = self.head

        for ch in word:
            if ch not in cur:
                return False

            cur = cur[ch]

        return Trie.end_mark in cur

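    def __str__(self):
        # No sort_keys=True here: sorting raises TypeError once a node holds
        # both the None end mark and letter keys (e.g. after adding "car").
        return json.dumps(self.head, indent=4)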

if __name__ == "__main__":
    trie = Trie()
    trie.add("cat")
    trie.add("cart")
    print(trie)
    print("cat" in trie)
    print("car" in trie)
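If the goal is to list the stored words rather than dump the raw nested dicts, a depth-first walk is one possible approach. The sketch below is not part of the cleanup above; iter_words and END are hypothetical names, and the function works on the plain nested dict (what self.head would hold) so that it runs on its own:

```python
END = None  # plays the same role as Trie.end_mark in the cleanup above

def iter_words(node, prefix=""):
    """Yield every word stored below node, depth first."""
    for key, child in node.items():
        if key is END:
            yield prefix          # the path walked so far is a complete word
        else:
            yield from iter_words(child, prefix + key)

# Nested dict equivalent to adding "cat", "cart" and "car" to the cleaned-up Trie.
head = {"c": {"a": {"t": {END: True}, "r": {"t": {END: True}, END: True}}}}
print(sorted(iter_words(head)))   # ['car', 'cart', 'cat']
```

Inside the class, the same generator could be called as iter_words(self.head), for example from an __iter__ method.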


 