
How to compute blot exposure in backgammon efficiently

I am trying to implement an algorithm for backgammon similar to TD-Gammon, as described here.

As described in the paper, the initial version of TD-Gammon used only the raw board encoding in the feature space, which produced a good playing agent, but to get a world-class agent you need to add some pre-computed features associated with good play. One of the most important features turns out to be the blot exposure.

Blot exposure is defined here as:

For a given blot, the number of rolls out of 36 which would allow the opponent to hit the blot. The total blot exposure is the number of rolls out of 36 which would allow the opponent to hit any blot. Blot exposure depends on: (a) the locations of all enemy men in front of the blot; (b) the number and location of blocking points between the blot and the enemy men and (c) the number of enemy men on the bar, and the rolls which allow them to re-enter the board, since men on the bar must re-enter before blots can be hit.

I have tried various approaches to compute this feature efficiently but my computation is still too slow and I am not sure how to speed it up.

Keep in mind that the TD-Gammon approach evaluates every possible board position for a given dice roll, so on each turn, for every player's dice roll, you need to calculate this feature for every possible board position.

Some rough numbers: assuming there are approximately 30 board positions per turn and an average game lasts 50 turns, running 1,000,000 game simulations takes (x * 30 * 50 * 1,000,000) / (1000 * 60 * 60 * 24) days, where x is the number of milliseconds needed to compute the feature. Putting x = 0.7 we get approximately 12 days to simulate 1,000,000 games.
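The estimate above can be written as a small helper to make it easy to plug in other timings. This is only a sketch of the arithmetic in the previous paragraph; the class and method names (TrainingTimeEstimate, Days) are made up for illustration.

```csharp
using System;

public static class TrainingTimeEstimate
{
    // Days needed to compute the feature for every candidate position of every
    // turn of every game, ignoring all other costs (network evaluation, move
    // generation, and so on).
    public static double Days(double msPerFeature, int positionsPerTurn,
                              int turnsPerGame, long games)
    {
        const double MsPerDay = 1000.0 * 60 * 60 * 24; // 86,400,000 ms
        return msPerFeature * positionsPerTurn * turnsPerGame * games / MsPerDay;
    }
}
```

With the numbers from the question, TrainingTimeEstimate.Days(0.7, 30, 50, 1000000) gives about 12.15 days. Under the same assumptions, getting down to one day would require the feature to cost roughly 0.058 ms per position.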

I don't really know if that's reasonable timing but I feel there must be a significantly faster approach.

So here's what I've tried:

Approach 1 (By dice roll)

For each of the 21 possible dice rolls, recursively check whether a hit occurs. Here's the main workhorse for this procedure:

    private bool HitBlot(int[] dieValues, Checker.Color checkerColor, ref int depth)
    {
        // Every die has been played without hitting a blot.
        if (depth >= dieValues.Length)
        {
            return false;
        }

        Moves legalMovesOfDie = LegalMovesOfDie(dieValues[depth], checkerColor);

        foreach (Move m in legalMovesOfDie.List)
        {
            if (m.HitChecker)
            {
                return true;
            }

            // Play the move, search the remaining dice, then restore the board.
            board.ApplyMove(m);
            depth++;
            bool hitBlot = HitBlot(dieValues, checkerColor, ref depth);
            depth--;
            board.UnapplyMove(m);

            if (hitBlot)
            {
                return true;
            }
        }

        // No legal move for this die (or none of them) leads to a hit.
        return false;
    }

This function takes as input an array of dice values (i.e. if the player rolls 1,1 the array would be [1,1,1,1]). It then recursively checks whether there is a hit and, if so, exits with true. The function LegalMovesOfDie computes the legal moves for that particular die value.
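For reference, the roll enumeration that would drive a function like HitBlot can be sketched as follows. The names here (RollEnumeration, DistinctRolls) are hypothetical and not part of the code above. The sketch also assumes that a non-double roll is tried in both die orders, since HitBlot plays the dice in array order; the text above does not say how the caller handles this.

```csharp
using System;
using System.Collections.Generic;

public static class RollEnumeration
{
    // Yields the 21 distinct rolls. Weight is the number of the 36 ordered
    // rolls each one stands for (1 for a double, 2 otherwise). Orders holds
    // the die-value arrays to try for that roll.
    public static IEnumerable<(int Weight, int[][] Orders)> DistinctRolls()
    {
        for (int hi = 1; hi <= 6; hi++)
        {
            for (int lo = 1; lo <= hi; lo++)
            {
                if (hi == lo)
                {
                    // A double is played four times, e.g. 1,1 -> [1,1,1,1].
                    yield return (1, new[] { new[] { hi, hi, hi, hi } });
                }
                else
                {
                    // Assumption: both orders are searched, because the
                    // order of play can change which moves are legal.
                    yield return (2, new[] { new[] { hi, lo }, new[] { lo, hi } });
                }
            }
        }
    }
}
```

Summing Weight over every roll for which any of its Orders produces a hit gives the exposure out of 36. The weights of all 21 rolls add up to 36.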

Approach 2 (By blot)

With this approach I first find all the blots and then, for each blot, I loop through every possible dice roll and see if a hit occurs. The function is optimized so that once a dice roll registers a hit I don't use it again for the next blot. It is also optimized to only consider moves that are in front of the blot. My code:

    public int BlotExposure2(Checker.Color checkerColor)
    {
        // No contact or no blots means nothing can be hit.
        if (DegreeOfContact() == 0 || CountBlots(checkerColor) == 0)
        {
            return 0;
        }

        // Rolls that have not yet been found to hit any blot.
        List<Dice> unusedDice = Dice.GetAllDice();
        List<int> blotPositions = BlotPositions(checkerColor);

        int count = 0;

        foreach (int blotPosition in blotPositions)
        {
            // Iterate backwards so that removing an entry does not skip the next one.
            for (int j = unusedDice.Count - 1; j >= 0; j--)
            {
                Dice dice = unusedDice[j];
                Transitions transitions = new Transitions(this, dice);

                if (transitions.HitBlot2(checkerColor, blotPosition))
                {
                    // Count each hitting roll once, even if it hits several blots.
                    unusedDice.RemoveAt(j);

                    // A double is 1 of the 36 rolls; any other roll is 2 of them.
                    count += dice.ValuesEqual() ? 1 : 2;
                }
            }
        }

        return count;
    }

The method transitions.HitBlot2 takes a blotPosition parameter, which ensures that the only moves considered are those in front of the blot.

Both of these implementations were very slow and when I used a profiler I discovered that the recursion was the cause, so I then tried refactoring these as follows:

  1. To use for loops instead of recursion (ugly code, but it's much faster)
  2. To use Parallel.ForEach so that instead of checking one dice roll at a time I check them in parallel.

Here are the average timings of my runs for 50,000 computations of the feature (note that the timings for each approach were measured on the same data):

  1. Approach 1 using recursion: 2.28 ms per computation
  2. Approach 2 using recursion: 1.1 ms per computation
  3. Approach 1 using for loops: 1.02 ms per computation
  4. Approach 2 using for loops: 0.57 ms per computation
  5. Approach 1 using Parallel.ForEach: 0.75 ms per computation
  6. Approach 2 using Parallel.ForEach: 0.75 ms per computation

I've found the timings to be quite volatile (maybe dependent on the random initialization of the neural network weights), but around 0.7 ms seems achievable, which, if you recall, leads to 12 days of training for 1,000,000 games.

My questions are: does anyone know if this is reasonable? Is there a faster algorithm I am not aware of that can reduce training time?

One last piece of info: I'm running on a fairly new machine, an Intel Core(TM) i7-5500U CPU @ 2.40 GHz.

If any more info is required, please let me know and I will provide it.

Thanks, Ofir

Yes, calculating these features makes for really hairy code. Look at the GNU Backgammon code: find eval.c and look at lines 1008 to 1267. Yes, it's 260 lines of code. That code calculates the number of rolls that hit at least one checker, and also the number of rolls that hit at least two checkers. As you can see, the code is hairy.

If you find a better way to calculate this, please post your results. To improve, I think you have to look at the board representation: can you represent the board in a different way that makes this calculation faster?
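As one illustration of what a different representation might buy, here is a minimal sketch that counts hitting rolls directly from two integer arrays, without generating or applying moves. It is not the GNU Backgammon algorithm, and it is deliberately simplified. It assumes the hitting side has no checkers on the bar, it ignores the rule that both dice must be played when possible, and it only counts rolls that hit at least one blot. The array layout and all names (BlotExposureSketch, CountHittingRolls, hitter, defender) are assumptions made for this example.

```csharp
using System;

public static class BlotExposureSketch
{
    // Assumed layout: index 1..24 are the points, numbered so that the hitting
    // side moves from lower to higher indices. Index 0 is unused.
    // hitter[p]   = number of hitting-side checkers on point p
    // defender[p] = number of defending-side checkers on point p
    public static int CountHittingRolls(int[] hitter, int[] defender)
    {
        int count = 0;
        for (int a = 1; a <= 6; a++)
            for (int b = 1; b <= 6; b++)
                if (RollHits(hitter, defender, a, b)) count++;
        return count; // out of the 36 ordered rolls
    }

    private static bool Has(int[] hitter, int p) => p >= 1 && hitter[p] > 0;

    private static bool RollHits(int[] hitter, int[] defender, int a, int b)
    {
        for (int t = 1; t <= 24; t++)
        {
            if (defender[t] != 1) continue; // only blots can be hit

            if (a == b)
            {
                // Doubles: walk back from the blot in steps of a, up to four
                // steps, stopping as soon as a landing point is blocked.
                for (int k = 1; k <= 4; k++)
                {
                    int s = t - k * a;
                    if (s < 1) break;
                    if (hitter[s] > 0) return true;
                    if (defender[s] >= 2) break;
                }
            }
            else
            {
                // Direct shot with either die.
                if (Has(hitter, t - a) || Has(hitter, t - b)) return true;

                // Combination shot: one checker plays both dice and needs at
                // least one of the two intermediate points to be open.
                int s = t - a - b;
                if (Has(hitter, s) && (defender[t - b] < 2 || defender[t - a] < 2))
                    return true;
            }
        }
        return false;
    }
}
```

For a single hitting checker on point 1 and a lone blot on point 7 (six pips away, nothing in between), CountHittingRolls returns 17, which matches the usual figure for a six-away shot. The work per position is at most 36 rolls times 24 points with no allocations, but whether this is fast enough, and how best to add bar entry and forced-move handling, would need to be measured and checked against a full move generator.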
