Minimax / Alpha Beta for Android Reversi Game

I have to implement a Reversi game for Android. I have managed to implement the whole game and it is functional, but the problem is that I don't have an AI. In fact, at every move the computer simply plays the position that gives it the highest number of pieces.

I decided to implement an alpha-beta pruning algorithm. I did a lot of research on the internet about it, but I couldn't come to a final conclusion on how to do it. I tried to implement a few functions, but I couldn't achieve the desired behaviour.

My board is stored in class Board (inside this class, the pieces occupied by each player are stored in a two-dimensional int array). I have attached a small diagram (sorry about the way it looks).

DIAGRAM: https://docs.google.com/file/d/0Bzv8B0L32Z8lSUhKNjdXaWsza0E/edit

I need help to figure out how to use the minimax algorithm with my implementation.

What I have understood so far is that I have to write an evaluation function that scores the board.

To calculate the value of the board I have to account for the following elements:

  • Free corners (my question: do I have to care only about the free corners, or about the ones I can take at the current move? That is my dilemma).
  • Mobility of the board: checking the number of moves that will be available after the current move.
  • Stability of the board: I know it means the number of pieces on the board that can't be flipped.
  • The number of pieces the move will give me.

I plan to implement a new class BoardAI that will take my Board object and the depth as arguments.

Can you please give me a logical flow of ideas for how I should implement this AI? I need some help with the recursion while searching in depth, and I don't understand how it calculates the best choice.

Thank you!

First, you can check this piece of code for a checkers AI that I wrote years ago. The interesting part is the last function (alphabeta). (It's in Python, but I think you can read it like pseudocode.)

Obviously I cannot teach you all the alpha/beta theory because it can be a little tricky, but maybe I can give you some practical tips.

Evaluation Function

This is one of the key points for a good min/max alpha/beta algorithm (and for any other informed search algorithm). Writing a good heuristic function is the artistic part of AI development. You have to know the game well and talk with expert players to understand which board features are important to answer the question: How good is this position for player X?

You have already indicated some good features like mobility, stability and free corners. However, note that the evaluation function has to be fast because it will be called many times.

A basic evaluation function is

H = f1 * w1 + f2 * w2 + ... + fn * wn

where f is a feature score (for example the number of free corners) and w is the corresponding weight that says how important the feature f is in the total score.
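As a minimal sketch of that formula, assuming the feature scores have already been computed for a position: the function name board_score is borrowed from my snippet further down, but the feature list and the weight values here are made up purely for illustration.

```python
def board_score(features, weights):
    """Return H = f1*w1 + f2*w2 + ... + fn*wn."""
    if len(features) != len(weights):
        raise ValueError("features and weights must have the same length")
    return sum(f * w for f, w in zip(features, weights))


# Hypothetical feature scores for one position, from the AI's point of view:
# (corners owned, mobility, stable pieces, piece difference)
features = [1, 4, 3, -2]
# Hypothetical weights; real values have to come from experiments.
weights = [25, 5, 10, 1]

score = board_score(features, weights)
print(score)
```

Keeping the features and the weights as two parallel lists makes it easy to try different weight sets without touching the feature code.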

There is only one way to find the weight values: experience and experiments. ;)

The Basic Algorithm

Now you can start with the algorithm. The first step is to understand game tree navigation. In my AI I've just used the main board like a blackboard where the AI can try out moves.

For example, we start with the board in a certain configuration B1.

Step 1: get all the available moves. You have to find all the moves applicable to B1 for a given player. In my code this is done by self.board.all_move(player). It returns a list of moves.
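My all_move is written for checkers, so for Reversi you need your own version. Here is a rough sketch of what it could look like, assuming the board is a square two-dimensional list of ints as in the question. The constants EMPTY, BLACK and WHITE and the helper names flips_for, all_moves and start_position are my own choices for this example, not part of your Board class.

```python
EMPTY, BLACK, WHITE = 0, 1, 2
DIRECTIONS = [(-1, -1), (-1, 0), (-1, 1), (0, -1),
              (0, 1), (1, -1), (1, 0), (1, 1)]


def flips_for(grid, row, col, player):
    """Return the opponent pieces that playing (row, col) would flip."""
    if grid[row][col] != EMPTY:
        return []
    opponent = WHITE if player == BLACK else BLACK
    size = len(grid)
    flipped = []
    for dr, dc in DIRECTIONS:
        line = []
        r, c = row + dr, col + dc
        # Walk over a run of opponent pieces...
        while 0 <= r < size and 0 <= c < size and grid[r][c] == opponent:
            line.append((r, c))
            r, c = r + dr, c + dc
        # ...which only counts if it is closed by one of our own pieces.
        if line and 0 <= r < size and 0 <= c < size and grid[r][c] == player:
            flipped.extend(line)
    return flipped


def all_moves(grid, player):
    """Return every legal (row, col) for player, in row-major order."""
    size = len(grid)
    return [(r, c) for r in range(size) for c in range(size)
            if flips_for(grid, r, c, player)]


def start_position():
    """Build the usual 8x8 opening position."""
    grid = [[EMPTY] * 8 for _ in range(8)]
    grid[3][3], grid[4][4] = WHITE, WHITE
    grid[3][4], grid[4][3] = BLACK, BLACK
    return grid


moves = all_moves(start_position(), BLACK)
print(moves)
```

A move is legal only if it flips at least one piece, so the same helper that finds the flips can also be reused when the move is applied.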

Step 2: apply the move and start the recursion. Assume that the function has returned three moves (M1, M2, M3).

  1. Take the first move M1 and apply it to obtain a new board configuration B11.
  2. Recursively apply the algorithm to the new configuration (find all the moves applicable in B11, apply them, recurse on the result, ...).
  3. Undo the move to restore the B1 configuration.
  4. Take the next move M2 and apply it to obtain a new board configuration B12.
  5. And so on.

NOTE: Step 3 can be done only if all the moves are reversible. Otherwise you have to find another solution, such as allocating a new board for each move.

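In code:

for mov in moves:
    # Try the move on the shared board, search one level deeper
    # for the other player, then take the move back.
    self.board.apply_action(mov)
    v = max(v, self.alphabeta(alpha, beta, level - 1, self._switch_player(player), weights))
    self.board.undo_last()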
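The apply_action / undo_last pair in that loop only works if the board remembers what each move changed. One possible way to do that, sketched here with a hypothetical UndoableBoard class (not the class from my checkers code), is to push the old cell values on a history stack and pop them on undo. In Reversi a single move changes several cells (the placed piece plus every flipped one), so each history entry is a list.

```python
class UndoableBoard:
    """Grid plus a history stack, so search can try a move and take it back."""

    def __init__(self, grid):
        self.grid = [row[:] for row in grid]
        self._history = []

    def apply_action(self, changes):
        """Apply (row, col, new_value) changes and remember the old values."""
        undo_record = []
        for row, col, new_value in changes:
            undo_record.append((row, col, self.grid[row][col]))
            self.grid[row][col] = new_value
        self._history.append(undo_record)

    def undo_last(self):
        """Restore the cells touched by the most recent apply_action."""
        # Restore in reverse order so repeated cells end at their oldest value.
        for row, col, old_value in reversed(self._history.pop()):
            self.grid[row][col] = old_value


board = UndoableBoard([[0, 1, 2, 0]])
before = [row[:] for row in board.grid]
# Player 1 places a piece at column 3 and flips the piece at column 2.
board.apply_action([(0, 3, 1), (0, 2, 1)])
after = [row[:] for row in board.grid]
board.undo_last()
print(before, after, board.grid)
```

If this bookkeeping feels too fiddly, copying the whole 8x8 array for each move is the simpler alternative mentioned in the note above, at the cost of more allocations.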

Step 3: stop the recursion. This tree is very deep, so you have to put a search limit on the algorithm. A simple way is to stop the recursion after n levels. For example, I start with B1, max_level=2 and current_level=max_level.

  1. From B1 (current_level 2) I apply, for example, the M1 move to obtain B11.
  2. From B11 (current_level 1) I apply, for example, the M2 move to obtain B112.
  3. B112 is a "current_level 0" board configuration, so I stop the recursion. I return the evaluation function value applied to B112 and go back to level 1.

In code:

if level == 0:
    # Search limit reached: score the current board with the evaluation function.
    value = self.board.board_score(weights)
    return value
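To see how the level counter drives the recursion, here is a small self-contained sketch of plain minimax (no pruning yet) on a hand-built tree. The dict layout with "score" and "children" keys and all the numbers are invented for this example. In a real game the score would come from the evaluation function and the children from applying moves.

```python
def minimax(node, level, maximizing):
    """Depth-limited minimax over a hand-built tree of dict nodes."""
    children = node.get("children", [])
    # Stop at the search limit or when no moves remain, and score the node.
    if level == 0 or not children:
        return node["score"]
    values = [minimax(child, level - 1, not maximizing) for child in children]
    return max(values) if maximizing else min(values)


# B1 has two moves; each resulting board has two replies.
B11 = {"score": 4, "children": [{"score": 3}, {"score": 2}]}
B12 = {"score": 0, "children": [{"score": -1}, {"score": 5}]}
B1 = {"score": 1, "children": [B11, B12]}

value_depth_2 = minimax(B1, 2, True)   # looks at the four leaves
value_depth_1 = minimax(B1, 1, True)   # stops at B11 and B12
print(value_depth_2, value_depth_1)
```

Note how the result changes with the search limit: with one level B11 looks best because of its own score, while with two levels the opponent's best reply is taken into account.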

Now... the standard algorithm pseudocode returns only the value of the best leaf. But I want to know which move brings me to the best leaf! To do this you have to find a way to map leaf values to moves. For example, you can save move sequences: starting from B1, the sequence (M1 M2 M3) brings the player to board B123 with value -1; the sequence (M1 M2 M2) brings the player to board B122 with value 2; and so on... Then you can simply select the move that brings the AI to the best position.
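One way to do this, as a variation on saving every sequence, is to let each recursive call return the value together with the move sequence that produced it. The sketch below assumes a toy tree written as nested dicts that map a move name to either a sub-tree or a leaf value. The function name best_line and the numbers are made up for the example.

```python
def best_line(node, maximizing):
    """Return (value, move sequence) for the best line from this node."""
    if not isinstance(node, dict):      # leaf: already an evaluation value
        return node, []
    best_value, best_moves = None, []
    for move, child in node.items():
        value, line = best_line(child, not maximizing)
        better = (best_value is None
                  or (maximizing and value > best_value)
                  or (not maximizing and value < best_value))
        if better:
            best_value, best_moves = value, [move] + line
    return best_value, best_moves


tree = {
    "M1": {"M1": 3, "M2": 2},
    "M2": {"M1": -1, "M2": 5},
    "M3": {"M1": 4, "M2": 6},
}
value, line = best_line(tree, maximizing=True)
best_move = line[0]     # the move the AI should actually play now
print(value, line, best_move)
```

Only the first move of the sequence is played. The rest is what the search expects to happen and is mostly useful for debugging.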

I hope this can be helpful.

EDIT: Some notes on alpha-beta. The alpha-beta algorithm is hard to explain without graphical examples. For this reason I want to link one of the most detailed alpha-beta pruning explanations I've ever found: this one. I think I cannot really do better than that. :)

The key point is: alpha-beta pruning adds two bounds to the nodes of MIN-MAX. These bounds can be used to decide whether a sub-tree should be expanded or not.

These bounds are:

  • Alpha: the maximum lower bound of possible solutions.
  • Beta: the minimum upper bound of possible solutions.

If, during the computation, we find a situation in which Beta < Alpha, we can stop the computation for that sub-tree.

Obviously, check the previous link to understand how it works. ;)
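For a feeling of what the cut-off does in practice, here is a small sketch on a toy tree of nested lists, where the leaves are already evaluation values. The tree and the visited list are invented for the example. The test used is beta <= alpha, which is the common variant. The strict Beta < Alpha from the text gives the same result but prunes slightly less.

```python
import math


def alphabeta(node, alpha, beta, maximizing, visited):
    """Alpha-beta search over a tree of nested lists; leaves are numbers."""
    if not isinstance(node, list):
        visited.append(node)            # record every leaf actually evaluated
        return node
    if maximizing:
        value = -math.inf
        for child in node:
            value = max(value, alphabeta(child, alpha, beta, False, visited))
            alpha = max(alpha, value)
            if beta <= alpha:           # the rest of this sub-tree cannot matter
                break
        return value
    value = math.inf
    for child in node:
        value = min(value, alphabeta(child, alpha, beta, True, visited))
        beta = min(beta, value)
        if beta <= alpha:
            break
    return value


tree = [[3, 5], [2, 9], [0, 1]]
visited = []
best = alphabeta(tree, -math.inf, math.inf, True, visited)
print(best, visited)
```

After the first sub-tree guarantees the maximizing player a value of 3, the second and third sub-trees are abandoned as soon as they show a leaf below 3, so the leaves 9 and 1 are never evaluated.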
