Re: [Bug-gnubg] Measuring performance levels


From: Joseph Heled
Subject: Re: [Bug-gnubg] Measuring performance levels
Date: Wed, 23 Oct 2002 22:42:28 +1300

Douglas Zare wrote:
> 
> In addition to measuring the correctness of absolute evaluations or the
> correctness of individual decisions, I think it would be nice to measure the
> ability to execute a game plan. This is hard to measure objectively in many
> important situations, but not in positions where only one side can err. I
> described this in more detail in example 3 of my latest (October 22nd) column
> in GammonVillage.
> 
> Here is the position (rolled out 10 times with different settings in the
> column).
> 
> --------------------------------------------------------------------
> |                     zare (X) vs. Snowie (O)                      |
> --------------------------------------------------------------------
> 
>  Money session. Score X-O: 0-0
> 
>            X on roll, cube action
>            +24-23-22-21-20-19-------18-17-16-15-14-13-+
>            | O  O  O     X    |   |                 X |
>            | O  O  O     X    |   |                   |
>            | O  O  O          |   |                   | S
>            | O  O             |   |                   | n
>            | 6  O             |   |                   | o
>            |                  |BAR|                   | w
>            |                  |   |                   | i
>            |                  |   |                   | e
>            |          X  X  X |   |  X                |
>            |          X  X  X |   |  X                |
>            |       O  X  X  X |   |  X                |
>            +-1--2--3--4--5--6--------7--8--9-10-11-12-+
>            Pipcount  X: 119  O:  47  X-O: 0-0/Money (1)
>            CubeValue:  1
> 
>           Rollout      Money equity: 0.505
>                0.1%   3.6%  77.0%    23.0%   7.2%   0.0%
>                95% confidence interval:
>                   - money cubeless eq.: 0.505 ±0.013.
>                Rollout settings:
>                   Full rollout,
>                   21600 games (equiv. 24650 games),
>                   played 1-ply,
>                   seed 11, with race database.
>                 1.  Double, take      0.846
>                 2.  No double         0.721  (-0.126)
>                 3.  Double, pass      1.000  (+0.154)
>           Proper cube action: Double, take
> 
>  ------------------------------ End ----------------------------------
> 
> Since O can very rarely make an error in cubeless money play, the result of
> the rollout is a good indication of the strength of the bot. A higher equity
> for X means that the bot plays this position better. The rollouts indicated
> that for this position, Jellyfish Level 6 plays worse than Snowie 3 1-ply, and
> that Snowie 4 2-ply (medium) plays worse than Snowie 3 3-ply. So, how does
> gnubg fare on different settings? (It is important that the rollout be
> cubeless and untruncated, with checker play according to the usual cubeless
> money gammon price.)
> 
> Douglas Zare
> 

At 0-ply, with 12960 games, I got:

  0.07% 3.6% 77.06% 23% 7.08% 0.0%

So this is essentially the same as the result above (Snowie 3?). I leave the
higher plies to someone with a stronger machine.
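
For anyone who wants to compare such result lines numerically, here is a minimal sketch that turns the six percentages into a cubeless money equity. It assumes the columns are ordered X backgammon, X gammon, X win, O win, O gammon, O backgammon, and that the gammon figures already include backgammons; that reading is an assumption, but it reproduces the 0.505 quoted above. The function name is made up for illustration.

```python
def cubeless_money_equity(x_bg, x_g, x_win, o_win, o_g, o_bg):
    """Cubeless money equity for X from six percentages.

    Assumed column order: X backgammon, X gammon, X win,
    O win, O gammon, O backgammon (gammons include backgammons).
    Each win counts 1 point, each gammon 1 more, each backgammon 1 more.
    """
    return ((x_win - o_win) + (x_g - o_g) + (x_bg - o_bg)) / 100.0


# Figures quoted in the column (Snowie, 1-ply, 21600 games).
zare = cubeless_money_equity(0.1, 3.6, 77.0, 23.0, 7.2, 0.0)

# Figures from the gnubg 0-ply rollout (12960 games).
heled = cubeless_money_equity(0.07, 3.6, 77.06, 23.0, 7.08, 0.0)

print(f"Snowie 1-ply: {zare:.4f}")
print(f"gnubg 0-ply:  {heled:.4f}")

# The column reports a 95% confidence interval of +/-0.013 for its figure.
print("within +/-0.013:", abs(heled - zare) <= 0.013)
```

With these inputs the two equities come out at about 0.505 and 0.5065, a difference well inside the ±0.013 interval quoted for the first rollout, so the two results cannot be told apart at this sample size.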

-Joseph



