Re: [Bug-gnubg] Measuring performance levels 2

From:	Douglas Zare
Subject:	Re: [Bug-gnubg] Measuring performance levels 2
Date:	Mon, 28 Oct 2002 02:06:28 -0500
User-agent:	Internet Messaging Program (IMP) 3.1

Quoting Joern Thyssen <address@hidden>:

> On Sat, Oct 26, 2002 at 07:17:11PM +0000, Joern Thyssen wrote
> > On Sat, Oct 26, 2002 at 06:05:13AM -0400, Douglas Zare wrote
> > > 
> > > > How about 1 closed out, extras on the 23, 10, and 7, versus 13 off with a
> > > > blot on the 1? 
> > 
> > Just to make sure:
> > 
> >  GNU Backgammon  Position ID: AQAA2rYtAoAAAA
> >                  Match ID   : cAkAAAAAAAAA
> >  +13-14-15-16-17-18------19-20-21-22-23-24-+     O: gnubg
> >  |                  | O |             X  O | OOO 0 points
> >  |                  |   |                  | OOO
> >  |                  |   |                  | OOO
> >  |                  |   |                  | OO
> >  |                  |   |                  | OO
> > v|                  |BAR|                  |     (Cube: 1)
> >  |                  |   |                  |
> >  |                  |   |                  |
> >  |                  |   |                  |
> >  |                  |   | X  X  X  X  X  X |     On roll
> >  |       X        X |   | X  X  X  X  X  X |     0 points
> >  +12-11-10--9--8--7-------6--5--4--3--2--1-+     X: jth
> > 
> > > 
> > > In 90,000 trials, Snowie 3 1-ply wins 15.5% (-0.691 +-0.004). 
> > 
> gnubg 0-ply 93312 trials:
> 19.1% (+- 0.0005) or equity -0.6184 +- 0.001
> 
> gnubg 1-ply (16 cand, 0.32 tol.) 9076 trials:
> 19.11% (+- 0.00015) or equity -0.6183 +- 0.003

Snowie 3 3-ply Huge 33% 4500 trials:
19.8% (+- 0.3%) or equity -0.603 +- 0.006 (confidence interval).

I'm surprised by how well gnu's variance reduction works. If those figures are 
accurate, it means that the luck estimate is off by about 1.5% for a single 
trial! Snowie's is off by 10% (sqrt(4500)*0.3%/2). Perhaps it is because gnu 
understands the absolute evaluations much better on 0-ply. Do the 0-ply 
evaluations match the 0-ply rollouts to within a percent?
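
The arithmetic behind those per-trial figures can be sketched as below. This is 
only an illustration under stated assumptions: trials are treated as 
independent, so the standard error of the mean is sd/sqrt(n); Snowie's +- 0.3% 
is read as two standard errors (the /2 above); and the gnubg number is taken 
from the 1-ply line (9076 trials, +- 0.00015), read as one standard error in 
probability units. The helper name per_trial_sd is hypothetical, and the units 
of the quoted gnubg figures are an assumption.

```python
import math


def per_trial_sd(n_trials, half_width, sigmas=1.0):
    """Back out the per-trial standard deviation from a reported +- figure.

    Assumes independent trials, so standard error = sd / sqrt(n), and that
    the reported half-width equals `sigmas` standard errors.
    """
    return math.sqrt(n_trials) * half_width / sigmas


# Snowie 3 3-ply: 4500 trials, +- 0.3%, read as a two-sigma interval.
snowie_3ply = per_trial_sd(4500, 0.003, sigmas=2.0)

# gnubg 1-ply: 9076 trials, +- 0.00015, read as one standard error
# (an assumption about how the quoted figure is expressed).
gnubg_1ply = per_trial_sd(9076, 0.00015)

print(f"Snowie 3 3-ply per-trial sd: {snowie_3ply:.1%}")
print(f"gnubg 1-ply per-trial sd:    {gnubg_1ply:.1%}")
```

Under these readings the Snowie figure comes out near 10% and the gnubg figure 
near 1.5%, which is the comparison made above.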

I look forward to seeing gnu 2-ply rollout data, to see if gnu 2-ply plays this 
one better than Snowie 3. It's impressive that gnu 0-ply does better than 
Snowie 3 2-ply here, and almost as well as Snowie 3 3-ply, but it is strange 
that 1-ply plays at the same level.

Douglas Zare
