RE: [Bug-gnubg] Problems with latest Windows Build


From: Ned Cross
Subject: RE: [Bug-gnubg] Problems with latest Windows Build
Date: Thu, 4 Nov 2004 19:19:14 -0800

While I do not have convincing proof, I just reported one specific example
to this list on 10/26.  I had started some other rollout comparisons, but I
will wait until I can get a copy of the latest version installed with the
newest updates before posting any more.

From a user perspective, the most appropriate tests are 2-ply 50% reduced
checkerplay (cubeful, according to score) vs 2-ply pruning, and 2-ply 33%
reduced cube vs 2-ply pruning.  The reason is that these settings have been
shown by the power users of the bg community to be the best combination of
speed and playing strength.

2-ply 33% reduced checkerplay was shown to have enough of a loss of playing
strength to be considered unreliable by most for rollouts of complex
positions, and 0-ply rollouts have been shown to be pretty good for simpler
positions.

--Ned

-----Original Message-----
From: Joseph Heled [mailto:address@hidden]
Sent: Tuesday, November 02, 2004 11:26 PM
To: Ned Cross
Cc: gnubg (E-mail)
Subject: Re: [Bug-gnubg] Problems with latest Windows Build




Ned Cross wrote:
> else is having the trouble.
>
> 5) Of course the biggest problem of all is the sometimes poor results of the
> pruning net in rollouts combined with the inability to use the old
> evaluations at reduced speed.  2-ply rollouts performed at 50% checker and
> 33% cube speed have proven quite accurate and considerably faster than 2-ply
> 100%.  2-ply prune rollouts, while faster per number of trials, are not
> proving as accurate, and have very high standard error rates, making them
> possibly unusable.

Can you provide convincing proof of that? Of course you are free to use
whatever setting you like, but the reduced evaluations are worse in every
test I did. For example, they have about 10 times the error rate when tested
over the contact benchmark. (I just did several thousand positions. I will do
the full test when I get more computing power.)

Of course those tests were done using my reduced code (only 33% for moves),
and GNUbg's implementation might be different. I have not looked at this for
a long time.

But I simply can't believe there are large differences. And you want to
compare full 2-ply with pruning and full 2-ply vs. reduced, not one against
the other.
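
To make the comparison above concrete, here is a minimal sketch of the idea:
score each faster setting against the same full 2-ply baseline rather than
against the other faster setting. The equities and the mean_abs_error helper
are hypothetical illustrations, not GNUbg output or GNUbg code.

```python
from statistics import mean


def mean_abs_error(reference, candidate):
    """Average absolute equity difference over the same list of positions."""
    if len(reference) != len(candidate):
        raise ValueError("evaluations must cover the same positions")
    return mean(abs(r - c) for r, c in zip(reference, candidate))


# Hypothetical equities for four positions (assumed values for illustration).
full_2ply = [0.120, -0.340, 0.560, 0.010]     # baseline: full 2-ply
pruned_2ply = [0.118, -0.335, 0.565, 0.012]   # 2-ply with pruning
reduced_2ply = [0.100, -0.300, 0.600, 0.050]  # 2-ply reduced

# Both candidates are measured against the same baseline.
pruning_error = mean_abs_error(full_2ply, pruned_2ply)
reduced_error = mean_abs_error(full_2ply, reduced_2ply)

print(f"pruning vs full 2-ply: {pruning_error:.4f}")
print(f"reduced vs full 2-ply: {reduced_error:.4f}")
```

With real data, the two numbers can be compared directly because they share
a baseline; comparing pruned against reduced would not show which one is
closer to full 2-ply.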

-Joseph




