简体   繁体   English

如何记忆游戏树(一棵潜在的无限玫瑰树)的重复子树?

[英]How to memoize the repeated subtrees of a game tree (a potentially infinite rose tree)?

I am attempting to implement the Negamax algorithm in Haskell. 我试图在Haskell中实现Negamax算法。

For this, I am representing the future possibilities a game might take in a rose tree ( Data.Tree.Forest (depth, move, position) ). 为此,我代表了游戏在玫瑰树中可能采用的未来可能性( Data.Tree.Forest (depth, move, position) )。 However, often there are positions that can be reached with two different sequences of moves. 然而,通常有两个不同的移动顺序可以达到的位置。 It is a waste (and quickly becomes very slow) to re-evaluate (the subtrees of) repeated positions. 重新评估(子树)重复的位置是浪费(并且很快变得很慢)。

Here is what I tried so far: 这是我到目前为止尝试的内容:

  • Implement a variant of Tying the Knot to share common sub-results. 实现绑定结的变体以共享公共子结果。 However, I have only been able to find explanations of tying the knot for (potentially infinite) lists, and nothing about re-using subtrees. 但是,我只能找到为(可能是无限的)列表打结的解释,而不是重新使用子树。

  • Another approach I have considered was to build a tree inside the State monad, where the state to keep would be a Map (depth, position) (Forest (depth, move, position)) to perform explicit memoization but I have so far not been able to set this up properly either. 我考虑的另一种方法是在State monad中构建一个树,其中要保留的状态是Map (depth, position) (Forest (depth, move, position))以执行显式的memoization但我到目前为止还没有能够正确设置它。

I think that both approaches might have the problem that a game tree can only be built in a corecursive way: We do not build the tree up to the root from the leaves, but build a (potentially infinite) tree lazily from the root down. 我认为这两种方法都可能存在这样的问题:游戏树只能以一种核心的方式构建:我们不会从树上建立树到根,而是从根向下懒惰地构建一个(可能是无限的)树。


EDIT: To give you an example of the code I am currently using (that is too slow): 编辑:给你一个我目前使用的代码的例子(这太慢了):

{-# LANGUAGE TypeFamilies #-}
{-# LANGUAGE FlexibleContexts #-}
module ZeroSumGame where

import qualified Control.Arrow
import Data.Tree

import Numeric.Natural (Natural)

(|>) :: a -> (a -> b) -> b
x |> f = f x
infixl 0 |>
{-# INLINE (|>) #-}

class Ord s => Game s where
  data Move s
  initial :: s -- | Beginning of the game
  applyMove :: Natural -> s -> Move s -> s -- | Moving from one game state to the next
  possibleMoves :: Natural -> s -> [Move s] -- | Lists moves the current player is able to do.
  isGameOver :: s -> Bool -- | True if the game has ended. TODO: Maybe write default implementation using `possibleMoves state == []`?
  scorePosition :: Natural -> Move s -> s -> Int -- | Turns a position in an integer, for the Negamax algorithm to decide which position is the best.

type Trimove state = (Natural, Move state, state) -- | Depth since start of game, move to next position, new position

gameforest :: Game s => Natural -> s -> Forest (Trimove s)
gameforest start_depth start_state = unfoldForest buildNode (nextpositions start_depth start_state)
  where
    buildNode (depth, move, current_state) =
      if
        isGameOver current_state
      then
        ((depth, move, current_state), [])
      else
        ((depth, move, current_state), nextpositions depth current_state)
    nextpositions depth current_state =
      current_state
      |> possibleMoves depth
      |> fmap (\move -> (succ depth, move, applyMove depth current_state move))

scoreTree :: Game s => Ord (Move s) => Natural -> Tree (Trimove s) -> (Move s, Int)
scoreTree depth node =
  case (depth, subForest node) of
    (0, _) ->
      node |> rootLabel |> uncurry3dropFirst scorePosition
    (_, []) ->
      node |> rootLabel |> uncurry3dropFirst scorePosition
    (_, children) ->
      children
      |> scoreForest (pred depth)
      |> map (Control.Arrow.second negate)
      |> maximum

uncurry3dropFirst :: (a -> b -> c -> d) -> (a, b, c) -> (b, d)
uncurry3dropFirst fun (a, b, c) = (b, fun a b c)

scoreForest :: Game s => Ord (Move s) => Natural -> Forest (Trimove s) -> [(Move s, Int)]
scoreForest depth forest =
  forest
  |> fmap (scoreTree depth)

{-# LANGUAGE TypeFamilies #-}
{-# LANGUAGE FlexibleContexts #-}
module TicTacToe where

import qualified Control.Monad.State
import Control.Monad.State (State)
import qualified Data.Map
import Data.Map (Map)
import qualified Control.Arrow
import Data.Tree

import Data.Array (Array)
import qualified Data.Array
import qualified Data.Maybe
import qualified Data.Foldable

import Numeric.Natural (Natural)


import ZeroSumGame

data CurrentPlayer = First | Second
  deriving (Eq, Ord, Show)


instance Enum CurrentPlayer where
  fromEnum First = 1
  fromEnum Second = -1
  toEnum 1 = First
  toEnum (-1) = Second
  toEnum _ = error "Improper player"

newtype TicTacToe = TicTacToe (Array (Int, Int) (Maybe CurrentPlayer))
  deriving (Eq, Ord)

instance Game TicTacToe where
  data Move TicTacToe = TicTacToeMove (Int, Int)
    deriving (Eq, Ord, Show, Bounded)

  initial = TicTacToe initialTicTacToeBoard

  possibleMoves _depth = possibleTicTacToeMoves

  applyMove depth (TicTacToe board) (TicTacToeMove (x, y)) =
    TicTacToe newboard
    where
      newboard = board Data.Array.// [((x, y), Just player)]
      player = case depth `mod` 2 of
        0 -> First
        _ -> Second

  isGameOver state = Data.Maybe.isJust (findFilledLines state)

  scorePosition _ _ state =
          state
          |> findFilledLines
          |> fmap fromEnum
          |> Data.Maybe.fromMaybe 0
          |> (* (-10000))



findFilledLines :: TicTacToe -> Maybe CurrentPlayer
findFilledLines (TicTacToe board) =
  (rows ++ columns ++ diagonals)
  |> map winner
  |> Data.Foldable.asum
  where
    rows = vals rows_indexes
    columns = vals columns_indexes
    diagonals = vals diagonals_indexes
    rows_indexes = [[(i, j) | i <- [0..2]]| j <- [0..2]]
    columns_indexes = [[(i, j) | j <- [0..2]]| i <- [0..2]]
    diagonals_indexes = [[(i, i) ]| i <- [0..2]] ++ [[(i, 2 - i) ]| i <- [0..2]]
    vals = map (map (\index -> board Data.Array.! index))

winner :: Eq a => [Maybe a] -> Maybe a
winner [x,y,z] =
  if x == y && x == z then x else Nothing
winner _ = Nothing


initialTicTacToeBoard :: (Array (Int, Int) (Maybe CurrentPlayer))
initialTicTacToeBoard =
  Data.Array.array ((0, 0), (2, 2)) [((i, j), Nothing) | i <- [0..2], j <- [0..2]]

possibleTicTacToeMoves :: TicTacToe -> [Move TicTacToe]
possibleTicTacToeMoves (TicTacToe board) = foldr checkSquareForMove [] (Data.Array.assocs board)
    where
      checkSquareForMove (index, val) acc = case val of
        Nothing -> TicTacToeMove index : acc
        Just _ -> acc

printBoard :: TicTacToe -> String
printBoard (TicTacToe board) =
  unlines [unwords [showTile (board Data.Array.! (y, x)) | x <- [0..2]] |  y <- [0..2]]
  where
    showTile loc =
      case loc of
        Nothing -> " "
        Just Second -> "X"
        Just First -> "O"

(TypeFamilies is used to allow each Game implementation to have their own notion of a Move , and FlexibleContexts is then required to enforce Move s to implement Ord . (TypeFamilies用于允许每个Game实现拥有自己的Move概念,然后需要FlexibleContexts来强制执行Move s以实现Ord

Problem reformulation 问题重新制定

If I understand the question correctly, you have a function that returns the possible next moves in a game, and one to take that move: 如果我正确地理解了这个问题,那么你有一个函数可以返回游戏中可能的下一个动作,还有一个动作:

start :: Position
moves :: Position -> [Move]
act :: Position -> Move -> Position

and how you want to build the infinite tree of states (please allow me to ignore the Depth field, for simplicity. If you consider the depth counter as part of the Position type, you see that no generality is lost here): 以及如何构建无限状态树(请允许我忽略Depth字段,为简单起见。如果您将深度计数器视为Position类型的一部分,您会发现此处没有丢失一般性):

states :: Forest (Position, Move)
states = forest start

forest :: Position -> Forest (Position, Move)
forest p = [ Node (m, p') (states p') | m <- moves p, let p' = act p m ]

but you want to achieve that in a way that identical subtrees of forest are shared. 但是你想以一种相同的forest树形共享方式实现这一目标。

Towards Memoization 走向记忆

The general technique is here is that we want to memoize forest : This way, for identical Positions , we get shared subtrees. 一般的技术是我们想要记忆forest :这样,对于相同的Positions ,我们得到共享的子树。 So the recipe is: 所以食谱是:

forest :: Position -> Forest (Position, Move)
forest = memo forest'

forest' :: Position -> Forest (Position, Move)
forest' p = [ Node (m, p') (states p') | m <- moves p, let p' = act p m ]

And we need a suitable memo-function: 我们需要一个合适的备忘录功能:

memo :: (Position -> a) -> (Position -> a)

At this point, we need to know more about Position in order to know how to implement that using an equivalent of the “lazy list” trick… But you see that you do not need to memoize functions that involve Rose trees. 在这一点上,我们需要更多地了解Position ,以便知道如何实现,在基于“懒列表”伎俩的等效......但是你看,你不需要 memoize的涉及玫瑰树功能。

I would try to do this by normalizing board positions based on some "canonical" sequence of moves to reach that position. 我会尝试通过基于一些“规范”动作序列来规范棋盘位置以达到该位置。 Then each child is assigned the value of traversing its individual normalized sequence through the tree. 然后为每个子项分配通过树遍历其各个标准化序列的值。 (no code because I'm on my phone and this is a big task.) (没有代码,因为我在手机上,这是一项艰巨的任务。)

How well this works depends on the ease of calculating normalized move sequences in the game you're playing. 它的工作原理取决于您正在玩游戏中计算标准化移动序列的难易程度。 But it's a way to introduce sharing by tying the knot, making use of a shared reference to the root of the game tree. 但它是一种通过打结来引入共享的方法,利用对游戏树根的共享引用。 Maybe it will serve as inspiration for other ideas that fit your specific case. 也许它将成为适合您特定案例的其他想法的灵感来源。

声明:本站的技术帖子网页,遵循CC BY-SA 4.0协议,如果您需要转载,请注明本站网址或者原文地址。任何问题请咨询:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM