Attempt #3 to send this! Argh!
To give you an idea of the effectiveness of the *-minimax algorithm, here
are some results from experiments I have been running tonight (and my
testing script will continue to run for another six hours or so) of my
Dice game. My Dice game is basically a glorified tic-tac-toe game with
arbitrary-sized (square) board. The only catch is, before moving, an
N-sided die is rolled. For player X, the die roll determines the row X can
move into, and for player Y, the column. Full rows or columns result in a
turn that must be passed.
This domain is very simple and a good one for testing these algorithms. It
is not backgammon, but to give us an idea on *-minimax's success for
backgammon, I've been running experiments with a 20x20 board (to simulate
the branching factor of a backgammon game tree with its 21 distinct rolls
and ~20 moves per roll).
My evaluation function consists of nothing more than a difference of
pairs. The evaluation function goes through the board square by square
and, when it sees a consecutive pair of Xs or Os, increases the pair count
for X (or O) by two if there is an empty square on either size of the
pair, by one if only a single empty square is present, or by zero if the
pair is surrounded by the edge of the board and/or opponent pieces. It
also counts "split" pairs, like X.X .
In terms of profiling, my evaluation function is very "heavy". (As
relatively heavy as your NN is with your game? I'm not sure). Most of the
CPU time used (>90%) goes strictly to the evaluation function. In
comparision, my term_board() function probably uses ~1% of total time, and
the various search routines ~3% of total time. Even my routines for
storing and looking up entries in my transposition table are in the single
digits.
So here is a sample batch of results (chosen randomly from 16 completed
tests already). I will just show some statistics for expecitmax and star2.
Both searched to depth=5 (that is, where the root is at depth=0 and each
level of nodes, chance or minimax, is considered an increase in depth). So
the tree has the form Root(Max)-Chance-Min-Chance-Max-Chance-Min. This is
roughly 2.5-ply, I believe...
Search method: expectimax
Score: -121
Internal nodes: 2545991
Leaf nodes: 48018977
Total nodes: 50564968
TOTAL NODES: 50564968
eval_board calls: 47979820
term_board calls: 2452328
Time used (s): 439
nodes/s: 115182
Search method: star2
Score: -121
Internal nodes: 13548
Leaf nodes: 6029
Total nodes: 19577
Probe internal nodes: 121481
Probe leaf nodes: 1841876
Total probe nodes: 1963357
TOTAL NODES: 1982934
Cutoffs: 35695
eval_board calls: 1841987
term_board calls: 475912
Average CutBF: 4.15
Time used (s): 17
nodes/s: 116643
Okay, so let's look at the data and see what we have.
Node expansions: the difference between the number of node expansions
is 50564968 vs. 1982934, or a factor of 25.50.
eval_board() calls:
47979820/1841987 (factor of 26.04)
Not surprisingly, the reduction in # of node expansions is nearly the same
as the reduction in calls to my evaluation function. That is great, since
we have a heavy evaluation function, and don't want to call it often.
term_board() calls:
2452328/475912 (factor of 5.15)
Small note, but can still be important. I will point out that my terminal
board function is far more heavier than a backgammon term_board()
function, because for BG all you have to do is see how many checkers are
off the board (if either player has 15, game is over). Still, there is
improvement.
Time:
439s/17s (factor of 25.82)
As I mentioned before, reducing node expansions is good, but not as good
if it comes at an increase in real CPU time, because we're really
interested in real-world performance. (For an example of an algorithm that
reduces node expansions with increased time overhead, see the learning A*
algorithms, like LRTA*). Note that the factor is also nearly identical to
the node expansions. This is great!
To reiterate what was mentioned before, star2 (the 2nd version of the
*-minimax algorithm) is dependent on a good, FAST method for ordering
moves (and used for choosing nodes to "Probe"... what "probing" means in
the *-minimax context, I will leave aside for now, because you'll need a
copy of the paper to follow along). However, I am confident I can achieve
a good method for doing so without resorting to the NN for ordering. I
will probably start with using the NN anyway, just to use as another data
point.
Hopefully the data structure you are using to represent the current game
"state" is good/easy to understand, and the calls to the evaluation
function relatively straightforward. That way I can just "rip out" the
search routine, plop in my implementation of star1 and star2, and just
call your NN to evaluate a position when required. Is there also a way to
generate moves, and apply those moves to a game state? That's how my Dice
game is programmed right now -- I use macros to abstract the game design
from the search, so all the search routine does is generate "moves" and
"apply" them to a "state", passing that state down in a function call.
"Man" I've sure "used" a "lot" of "quotation marks". I'll chalk that up to
lack of "sleep".