Binary Search Tree Frequency Counter

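I need to read a text file, strip out the unnecessary punctuation, lowercase the words, and then use a binary search tree to build a tree of the words in the file.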
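For reference, the preprocessing step described above could be sketched roughly as follows. The helper names `tokenize` and `read_words` are hypothetical, and the sketch assumes that "unnecessary punctuation" means the ASCII characters in `string.punctuation` and that words are separated by whitespace.

```python
import string


def tokenize(text):
    """Lowercase the text, strip ASCII punctuation, and return the words."""
    # Translation table that deletes every character in string.punctuation.
    strip_punct = str.maketrans("", "", string.punctuation)
    words = []
    for raw in text.lower().split():
        word = raw.translate(strip_punct)
        if word:  # skip tokens that consisted of punctuation only
            words.append(word)
    return words


def read_words(path):
    """Read the whole file at `path` (hypothetical argument) and tokenize it."""
    with open(path, encoding="utf-8") as f:
        return tokenize(f.read())
```

Note that this also removes apostrophes and hyphens inside words, so "don't" becomes "dont"; whether that is acceptable depends on the assignment.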

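We are asked to count the frequency of repeated words, and to report the total number of words and the total number of unique words.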

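So far I have handled the punctuation, the file reading, and the lowercasing, and the binary search tree itself is mostly done. I just need to figure out how to implement the "frequency" counter in the code.

My code is as follows: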

class BSearchTree :
    class _Node :
        def __init__(self, word, left = None, right = None) :
            self._word = word
            self._freq = 0
            self._left = left
            self._right = right

    def __init__(self) :
        self._root = None
        self._wordc = 0
        self._each = 0

    def isEmpty(self) :
        return self._root is None

    def search(self, word) :
        probe = self._root
        while (probe != None) :
            if word == probe._word :
                return probe
            if word < probe._word :
                probe = probe._left
            else :
                probe = probe._right
        return None

    def insert(self, word) :
        if self.isEmpty() :
            self._root = self._Node(word)
            self._root._freq += 1       # <- is this correct?
            return

        parent = None               #to keep track of parent
                                    #we need above information to adjust
                                    #link of parent of new node later

        probe = self._root
        while (probe != None) :
            if word < probe._word :     # go to left tree
                parent = probe          # before we go to child, save parent
                probe = probe._left
            elif word > probe._word :   # go to right tree
                parent = probe          # before we go to child, save parent
                probe = probe._right

        if (word < parent._word) :      #new value will be new left child
            parent._left = self._Node(word)
        else :    #new value will be new right child
            parent._right = self._Node(word)

Because the formatting was giving me trouble, here is the second half of it.

class NotPresent(Exception) :
    pass

def main():
    t = BSearchTree()

    file = open("sample.txt")
    line = file.readline()
    file.close()

    #for word in line:
    #   t.insert(word)
    # Line above crashes program because there are too many
    # words to add. Lines on bottom tests BST class
    t.insert('all')
    t.insert('high')
    t.insert('fly')
    t.insert('can')
    t.insert('boars')
    #t.insert('all')   <- how do I handle duplicates by making
    #                     extras add to the node's frequency?
    t.inOrder()

Thanks for helping, or for trying to help!

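First, it is better to initialize the node's _freq to 1 in its constructor than to increment it in the tree's insert(). With that change, the self._root._freq += 1 line in insert() should be removed, otherwise the root word would start with a count of 2.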

(One more thing: Python coding conventions (PEP 8) recommend not putting spaces around the = when writing default argument values.)

    def __init__(self, word, left=None, right=None) :
        self._word = word
        self._freq = 1
        self._left = left
        self._right = right

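Then add the last three lines (the else branch) to the loop in insert(), so that a word already in the tree increments its count and the loop stops instead of running forever: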

    probe = self._root
    while (probe != None) :
        if word < probe._word :     # go to left tree
            parent = probe          # before we go to child, save parent
            probe = probe._left
        elif word > probe._word :   # go to right tree
            parent = probe          # before we go to child, save parent
            probe = probe._right
        else:
            probe._freq += 1
            return
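Putting the two changes together, a complete version might look like the sketch below. It is one possible arrangement, not the asker's final code: the counters `total_words` and `unique_words` and the generator `in_order` are hypothetical names introduced here, since the question does not show how `_wordc`, `_each`, or `inOrder()` are meant to work.

```python
class BSearchTree:
    class _Node:
        def __init__(self, word, left=None, right=None):
            self._word = word
            self._freq = 1      # a node is created on the first occurrence
            self._left = left
            self._right = right

    def __init__(self):
        self._root = None
        self.total_words = 0    # hypothetical: every insert, duplicates included
        self.unique_words = 0   # hypothetical: number of nodes in the tree

    def insert(self, word):
        self.total_words += 1
        parent = None
        probe = self._root
        while probe is not None:
            if word < probe._word:
                parent, probe = probe, probe._left
            elif word > probe._word:
                parent, probe = probe, probe._right
            else:
                probe._freq += 1    # duplicate: count it and stop
                return

        # Fell off the tree, so the word is new and gets its own node.
        node = self._Node(word)
        self.unique_words += 1
        if parent is None:
            self._root = node       # tree was empty
        elif word < parent._word:
            parent._left = node
        else:
            parent._right = node

    def in_order(self):
        """Yield (word, frequency) pairs in alphabetical order."""
        stack = []
        probe = self._root
        while stack or probe is not None:
            while probe is not None:    # walk as far left as possible
                stack.append(probe)
                probe = probe._left
            probe = stack.pop()
            yield probe._word, probe._freq
            probe = probe._right


t = BSearchTree()
for w in ["all", "high", "fly", "can", "boars", "all"]:
    t.insert(w)
print(list(t.in_order()))
# [('all', 2), ('boars', 1), ('can', 1), ('fly', 1), ('high', 1)]
print(t.total_words, t.unique_words)    # 6 5
```

Starting `probe` at the root and `parent` at None lets the same loop handle the empty tree, so no separate isEmpty() branch is needed. On the commented-out loop in main(): file.readline() returns a single line as a string, and iterating over a string yields characters rather than words, so the first repeated character would have hit the infinite loop on duplicates.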
