My minimax algorithm for Tic Tac Toe in Python is showing maximum recursion error

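I am trying to write the minimax algorithm for Tic Tac Toe in Python on my own. I have written the code, but whenever the function is called it raises a "maximum recursion depth exceeded in comparison" error. I am stuck on this part, and trying to debug it did not help either.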

import sys

marked=['','','','','','','','','']
markingSignal=[False,False,False,False,False,False,False,False,False]


def printTable():
    print("\t%s|\t%s|\t%s\n------------------------\n\t%s|\t%s|\t%s\n------------------------\n\t%s|\t%s|\t%s\n"%(marked[0],marked[1],marked[2],marked[3],marked[4],marked[5],marked[6],marked[7],marked[8]))

def winning(m,player):
    i=0
    x=0
    while x<3:
        if m[i]==player and m[i+1]==player and m[i+2]==player:
            return True
        x=x+1
        i=i+3    
    x=0
    i=0
    while x<3:
        if m[2]==player and m[4]==player and m[6]==player:
            return True
        x=x+1
        i=i+3  
    x=0
    i=0
    if m[0]==player and m[4]==player and m[8]==player:
        return True
    if m[2]==player and m[4]==player and m[6]==player:
        return True
    return False         


def minimax(table,marktab,points,pos=0):
    copyTab=table
    copymark=marktab
    remaining=0
    for x in table:
        if x==False:
            remaining=remaining+1
    if remaining==0:
        return points,pos
    scores=[None]*remaining
    positions=[None]*remaining
    z=0
    maximum=0
    bestpos=0
    previous=88
    x=0
    while x<9:
        if table[x]==False:
            if points%2==0:
                copyTab[x]==True
                copymark[x]=='O'
                result=winning(copymark,'O')
                previous=x
                if result:
                    return points ,x
            else:
                copyTab[x]==True
                copymark[x]=='X'    
            scores[z],positions[z]=minimax(copyTab,copymark,points+1,previous)
            z=z+1
            copyTab[x]==False
            copymark[x]==''
        x=x+1
    for x in range(0,len(scores)):
        if x==0:
            maximum=scores[x]
            bestpos=positions[x]
        if scores[x]<maximum:
            maximum=scores[x]
            bestpos=positions[x]
    return maximum, bestpos        



def takeInput(player):
    filled=False
    while filled==False:
        print("Enter Your Choice 1-9")
        x=int(input())
        if x>9:
            print("Invalid Choice")
            continue
        if markingSignal[x-1]:
            print("This slot is already filled")
            continue
        filled=True    
    marked[x-1]=player
    markingSignal[x-1]=True


def main():

    sys.setrecursionlimit(5000)
    print(sys.getrecursionlimit())
    printTable()
    count=0
    player='X'
    while count<9:

        if count%2==0:
            player='X'
            takeInput(player)
        else:
            player='O'  
            p,choice=minimax(markingSignal,marked,0)  
            marked[choice]=player
            markingSignal[choice]=True         
        printTable()
        result=winning(marked,player)
        if result:
            print("\n%s WON !!!\n"%(player))
            break
        count=count+1


main()  

In this code the user-input part works fine, but the computer's move, that is the minimax part, does not work and shows the recursion error.

So, in your code:

scores[z],positions[z]=minimax(copyTab,copymark,points+1,previous)

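This call enters a never-ending recursion. It keeps descending over and over, and the value of previous just stays between 88 and 0. A recursive function must return at some point, but the only return you have before the recursive call is the one for a winning position. After the first move there cannot be a winning position, so the recursion never ends.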

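With that in mind: in the minimax function you are not copying the values, you are only passing them by reference, so copyTab and table are the same list. To get real copies, use: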

copyTab=table.copy()
copymark=marktab.copy()
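The difference between binding a second name and making a copy can be seen in isolation. This is a minimal sketch with made-up variable names (`alias`, `clone`), not part of the original program:

```python
table = [False, False, False]

alias = table          # a second name for the SAME list object
clone = table.copy()   # a new list holding the same items

alias[0] = True        # visible through `table` as well
clone[1] = True        # only changes the copy

print(table)           # [True, False, False]
print(clone)           # [False, True, False]
print(alias is table)  # True
print(clone is table)  # False
```

Because the board holds only booleans and strings, a shallow `list.copy()` is enough here.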

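Also, the search never makes progress: inside the recursive function the board is not updated, so every call sees the same position and the end-of-game test never becomes true.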

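So you need to assign the values with a single =, as in copyTab[x]=True and copymark[x]='O', and not use the double equals ==, which only evaluates to a Boolean and changes nothing.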
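The effect of the two operators on the recursion can be reproduced with a much smaller function. The sketch below uses hypothetical helper names (`depth_with_comparison`, `depth_with_assignment`, `hits_recursion_limit`) and a plain list of booleans standing in for the board:

```python
def depth_with_comparison(cells):
    """Buggy: `==` compares and throws the result away, so `cells` never changes."""
    if False not in cells:            # base case: no free cell left
        return 0
    i = cells.index(False)
    cells[i] == True                  # comparison only, the cell stays False
    return 1 + depth_with_comparison(cells)


def depth_with_assignment(cells):
    """Fixed: `=` marks the cell, so every call is one step closer to the base case."""
    if False not in cells:
        return 0
    i = cells.index(False)
    cells[i] = True                   # assignment, the cell is now taken
    depth = 1 + depth_with_assignment(cells)
    cells[i] = False                  # restore the cell for the caller
    return depth


def hits_recursion_limit(func, cells):
    """Return True if calling func(cells) raises RecursionError."""
    try:
        func(cells)
    except RecursionError:
        return True
    return False


print(hits_recursion_limit(depth_with_comparison, [False] * 9))  # True
print(depth_with_assignment([False] * 9))                        # 9
```

Raising the limit with `sys.setrecursionlimit` does not help in the buggy case, because the number of free cells never shrinks.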

With these changes the function returns instead of recursing forever:

def minimax(table,marktab,points,pos=0):
    copyTab=table.copy()
    copymark=marktab.copy()
    remaining=0
    for x in table:
        if x==False:
            remaining=remaining+1
    if remaining==0:
        return points,pos
    scores=[None]*remaining
    positions=[None]*remaining
    z=0
    maximum=0
    bestpos=0
    previous=88
    x=0
    while x<9:
        if table[x]==False:
            if points%2==0:
                copyTab[x]=True
                copymark[x]='O'
                result=winning(copymark,'O')
                previous=x
                if result:
                    return points ,x
            else:
                copyTab[x]=True
                copymark[x]='X' 
            scores[z],positions[z]=minimax(copyTab,copymark,points+1,previous)
            z=z+1
            copyTab[x]=False
            copymark[x]=''
        x=x+1
    for x in range(0,len(scores)):
        if x==0:
            maximum=scores[x]
            bestpos=positions[x]
        if scores[x]<maximum:
            maximum=scores[x]
            bestpos=positions[x]
    return maximum, bestpos

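The other answer means to help, but you do not actually need those copies. What you are applying is a do-undo pattern: you make a step, check the result, and undo the step. This can be done without copying the tables, but the undo also has to happen before returning from inside the loop. And of course the ==/= mix-up needs to be fixed as well.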

def minimax(table,marktab,points,pos=0):
    #copyTab=table                             # copyTab eliminated
    #copymark=marktab                          # copymark eliminated
    remaining=0
    for x in table:                            # note that this...
        if x==False:
            remaining=remaining+1
    if remaining==0:
        return points,pos
    scores=[None]*remaining
    positions=[None]*remaining
    z=0
    maximum=0
    bestpos=0
    previous=88
    x=0
    while x<9:
        if table[x]==False:                    # ... and this line were referring to table anyway
            if points%2==0:
                table[x]=True                  # now it is table and =
                marktab[x]='O'                 # marktab and =
                result=winning(marktab,'O')
                previous=x
                if result:
                    table[x]=False             # this ...
                    marktab[x]=''              # ... and this undo steps were missing
                    return points ,x
            else:
                table[x]=True                  # table and =
                marktab[x]='X'                 # marktab and =
            scores[z],positions[z]=minimax(table,marktab,points+1,previous) # table and marktab
            z=z+1
            table[x]=False                     # table and =
            marktab[x]=''                      # marktab and =
        x=x+1
    for x in range(0,len(scores)):
        if x==0:
            maximum=scores[x]
            bestpos=positions[x]
        if scores[x]<maximum:
            maximum=scores[x]
            bestpos=positions[x]
    return maximum, bestpos        

Then the opponent happily loses, just like with the other solution.
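The do-undo pattern used above can also be looked at on its own. The following sketch is an illustration only: `count_fillings` is a hypothetical helper that counts the orders in which the free cells can be filled, without any win checking, and the point is that the board is unchanged once the search returns.

```python
def count_fillings(board, player='O'):
    """Count the ways the free cells can be filled by alternating players."""
    if '' not in board:                 # base case: board is full
        return 1
    other = 'X' if player == 'O' else 'O'
    total = 0
    for pos in range(len(board)):
        if board[pos] == '':
            board[pos] = player                    # do
            total += count_fillings(board, other)  # explore
            board[pos] = ''                        # undo
    return total


board = ['X', 'O', 'X',
         '',  'O', '',
         '',  'X', '']
before = board.copy()            # kept only to show that nothing changed

print(count_fillings(board))     # 24, i.e. 4! orders for the 4 free cells
print(board == before)           # True
```

If a function returns from inside the loop, the undo line has to run before that return as well, which is the step that was missing in the original listing.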

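Side remarks

  • marked and markingSignal could use replication, so marked = ['']*9 and markingSignal = [False]*9
  • %-formatting expects a tuple on the right-hand side, so instead of the long % (marked[0],...) you could simply write % tuple(marked)
  • after getting rid of copyTab and copymark, table and marktab do not really need to be passed as arguments
  • markingSignal is not needed either; checking marked[x]=='' can tell whether a field is free or occupied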
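The first two remarks can be tried out directly. In this sketch the names `ROW`, `SEP`, `TEMPLATE` and `render` are made up for illustration; the template is the same string the original `printTable` uses, only assembled from pieces:

```python
marked = [''] * 9               # nine empty strings via list replication
markingSignal = [False] * 9     # nine False values the same way

ROW = "\t%s|\t%s|\t%s\n"
SEP = "------------------------\n"
TEMPLATE = ROW + SEP + ROW + SEP + ROW


def render(cells):
    """Format a 9-element board; tuple(cells) supplies all nine values at once."""
    return TEMPLATE % tuple(cells)


print(render(list("XOXOXOXOX")))
```

Replication with `*` is safe here because strings and booleans are immutable; for a list of lists it would repeat references to one inner list.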

This fixes the recursion problem, but it does not help with the algorithm. Have a look at the pseudocode on Wikipedia:

function minimax(node, depth, maximizingPlayer) is
    if depth = 0 or node is a terminal node then
        return the heuristic value of node
    if maximizingPlayer then
        value := −∞
        for each child of node do
            value := max(value, minimax(child, depth − 1, FALSE))
        return value
    else (* minimizing player *)
        value := +∞
        for each child of node do
            value := min(value, minimax(child, depth − 1, TRUE))
        return value
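As a rough illustration, the pseudocode above can be transcribed into Python for a toy game tree. The representation is an assumption made for this sketch: a node is either a number (a terminal node holding its value) or a list of child nodes, and the depth limit is left out because the whole tree is searched.

```python
import math


def minimax(node, maximizing_player):
    """Return the minimax value of a tree of nested lists with numeric leaves."""
    if not isinstance(node, list):       # terminal node: its value is stored directly
        return node
    if maximizing_player:
        value = -math.inf
        for child in node:
            value = max(value, minimax(child, False))
        return value
    value = math.inf                      # minimizing player
    for child in node:
        value = min(value, minimax(child, True))
    return value


tree = [[3, 5], [2, 9]]
print(minimax(tree, True))    # 3: the maximizer picks the branch whose minimum is largest
print(minimax(tree, False))   # 5: the minimizer picks the branch whose maximum is smallest
```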

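In your code there is only a single extremum search, let's call it max(scores). You would also need a min(scores) somewhere, depending on which player is being considered at the moment, or you could apply the often-used "trick" that min(scores) can be found as max(-scores), but such a "flip" is not present in the code either.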
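The "flip" mentioned above is usually called negamax. The sketch below shows it on a toy tree rather than on the Tic Tac Toe board; the tree format (nested lists with numeric leaves, scored from the first player's point of view) and the `color` parameter are assumptions of this example:

```python
def negamax(node, color=1):
    """Value of `node` for the player to move; color is +1 for the first player, -1 otherwise."""
    if not isinstance(node, list):       # leaf: convert to the mover's point of view
        return color * node
    # Each side maximizes the negated value of the opponent's best reply,
    # which replaces the separate min branch.
    return max(-negamax(child, -color) for child in node)


tree = [[3, 5], [2, 9]]
print(negamax(tree))        # 3, the same value a max/min search gives for the first player
```

With this formulation a single `max` serves both players, because the sign is flipped on every level.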

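As you said you want to fix it yourself, I only provide the shortened version containing the suggested simplifications, but otherwise intact (so it loses without hesitation):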

import sys

marked=[''] * 9

def printTable():
    print("\t%s|\t%s|\t%s\n------------------------\n\t%s|\t%s|\t%s\n------------------------\n\t%s|\t%s|\t%s\n"%tuple(marked))

def winning(player):
    i=0
    x=0
    while x<3:
        if marked[i]==player and marked[i+1]==player and marked[i+2]==player:
            return True
        x=x+1
        i=i+3    
    x=0
    i=0
    while x<3:
        if marked[2]==player and marked[4]==player and marked[6]==player:
            return True
        x=x+1
        i=i+3  
    x=0
    i=0
    if marked[0]==player and marked[4]==player and marked[8]==player:
        return True
    if marked[2]==player and marked[4]==player and marked[6]==player:
        return True
    return False         

def minimax(points,pos=0):
    remaining=0
    for x in marked:
        if x=='':
            remaining=remaining+1
    if remaining==0:
        return points,pos
    scores=[None]*remaining
    positions=[None]*remaining
    z=0
    maximum=0
    bestpos=0
    previous=88
    x=0
    while x<9:
        if marked[x]=='':
            if points%2==0:
                marked[x]='O'
                result=winning('O')
                previous=x
                if result:
                    marked[x]=''
                    return points ,x
            else:
                marked[x]='X'    
            scores[z],positions[z]=minimax(points+1,previous)
            z=z+1
            marked[x]=''
        x=x+1
    for x in range(0,len(scores)):
        if x==0:
            maximum=scores[x]
            bestpos=positions[x]
        if scores[x]<maximum:
            maximum=scores[x]
            bestpos=positions[x]
    return maximum, bestpos        

def takeInput(player):
    filled=False
    while filled==False:
        print("Enter Your Choice 1-9")
        x=int(input())
        if x>9:
            print("Invalid Choice")
            continue
        if marked[x-1]!='':
            print("This slot is already filled")
            continue
        filled=True    
    marked[x-1]=player

def main():
    printTable()
    count=0
    player='X'
    while count<9:
        if count%2==0:
            player='X'
            takeInput(player)
        else:
            player='O'  
            p,choice=minimax(0)  
            marked[choice]=player
        printTable()
        result=winning(player)
        if result:
            print("\n%s WON !!!\n"%(player))
            break
        count=count+1

main()
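One detail the listings leave untouched: the second loop in winning tests the 2-4-6 diagonal three times and never looks at the columns, so a vertical line is not recognised as a win. A possible correction is sketched below. The name `winning_with_columns` and the explicit board parameter are choices made for this example, not part of the original code.

```python
def winning_with_columns(marked, player):
    """Return True if `player` owns a full row, column or diagonal of the 3x3 board."""
    # Rows: cells 0-1-2, 3-4-5, 6-7-8
    for start in (0, 3, 6):
        if marked[start] == marked[start + 1] == marked[start + 2] == player:
            return True
    # Columns: cells 0-3-6, 1-4-7, 2-5-8
    for col in range(3):
        if marked[col] == marked[col + 3] == marked[col + 6] == player:
            return True
    # Diagonals: cells 0-4-8 and 2-4-6
    if marked[0] == marked[4] == marked[8] == player:
        return True
    if marked[2] == marked[4] == marked[6] == player:
        return True
    return False


column_win = ['X', 'O', '',
              'X', 'O', '',
              'X', '',  '']
print(winning_with_columns(column_win, 'X'))   # True
print(winning_with_columns(column_win, 'O'))   # False
```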
