RE: [Bug-gnubg] bug in 3 ply equities


From: Misja Alma
Subject: RE: [Bug-gnubg] bug in 3 ply equities
Date: Wed, 03 Dec 2003 20:08:40 +0100

Hi,

First of all, I apologize for reopening such an old thread. But I just
noticed that it is about the same 'problem' that I have noticed (and posted
in this newsgroup as well).

I agree with Joseph that these positions, which gnubg does not seem to
understand, don't come up much and thus have little effect on gnubg's
total playing strength. But it would still be nice to find a solution for
them, wouldn't it?
Joseph mentioned the possibility of having a separate net for these
positions, a sub-class of the crashed. But the position I noticed,
ObsdAAhi2zYQEA, is not crashed yet. It is just about containing one checker
with an outer prime.

I guess the advantage of a separate 'containment' net is that only
this net needs to be trained, because training the whole net would be
impossible without increasing alpha, I suppose? I mean, gnubg understands
those positions so poorly that training the net on them with a small
alpha would not converge to correct play of those positions, I
think. But when you increase alpha and start training the whole net, the
other areas of the game, which gnubg plays very well, get messed up. Or do
I misunderstand this?
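
To make this trade-off concrete, here is a minimal sketch on a toy linear
model, not gnubg's network. The feature vectors, targets, and class names
are all invented for illustration. It trains on one poorly understood
position with a large alpha, once with weights shared with a well-played
position and once with separate weights per class, and then repeats the
shared case with a small alpha.

```python
# Toy illustration only: a two-weight linear "net", not gnubg's evaluator.
# Feature vectors, targets, and the class names are invented assumptions.

def predict(w, x):
    return sum(wi * xi for wi, xi in zip(w, x))

def train(w, x, target, alpha, steps):
    """Repeated gradient steps on squared error for a single position."""
    w = list(w)
    for _ in range(steps):
        err = target - predict(w, x)
        w = [wi + alpha * err * xi for wi, xi in zip(w, x)]
    return w

known_x, known_target = [1.0, 0.5], 0.30        # position already played well
contain_x, contain_target = [0.8, 1.0], -0.40   # poorly understood position
start = [0.2, 0.2]                              # evaluates known_x at 0.30

# Shared weights: a large alpha fixes the containment position...
shared = train(start, contain_x, contain_target, alpha=0.5, steps=20)
# ...but moves the evaluation of the known position as a side effect.
drift_shared = abs(predict(shared, known_x) - known_target)

# Separate weights per class: only the "contain" weights are touched.
nets = {"contact": list(start), "contain": list(start)}
nets["contain"] = train(nets["contain"], contain_x, contain_target, 0.5, 20)
drift_separate = abs(predict(nets["contact"], known_x) - known_target)

# Small alpha on the shared weights: little damage, but slow progress.
slow = train(start, contain_x, contain_target, alpha=0.01, steps=20)
remaining_error = abs(contain_target - predict(slow, contain_x))
```

In this toy setting the shared weights drift far from the known target,
the separate weights do not move at all, and the small-alpha run leaves
most of the containment error in place. A real multi-layer net will behave
less cleanly, so this only illustrates the direction of the effect.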

The reason I'm asking all this is that I'm thinking of building and training
a (sub-)net on my own. I would train it with rollouts obtained from Snowie
until it is strong enough to train itself from its own
rollouts.
It would be easiest to train a small subnet specialized in containment
positions, but I am not sure whether gnubg can recognize such
positions.
Also, I'm not familiar with the gnubg code, so I don't know how easy or
difficult it is to 'plug in' a new subnet. If any of you could give me
tips or suggestions on that subject, I would very much appreciate
it!
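
As a rough picture of what 'plugging in' a subnet could mean, here is a
minimal sketch that assumes evaluation is routed by position class, as the
mention of a separate crashed net in this thread suggests. Nothing below is
gnubg code: the board encoding, the `classify` predicate, and the evaluator
functions are hypothetical stand-ins, and how to actually recognize a
containment position is exactly the open question.

```python
from typing import Callable, Dict, Sequence

Board = Sequence[int]                  # hypothetical board encoding
Evaluator = Callable[[Board], float]   # returns an equity estimate

def make_dispatcher(classify: Callable[[Board], str],
                    nets: Dict[str, Evaluator],
                    fallback: str) -> Evaluator:
    """Route each board to the net registered for its class.

    Classes without a dedicated net fall back to `fallback`, so a new
    subnet can be added without touching the existing ones.
    """
    def evaluate(board: Board) -> float:
        cls = classify(board)
        return nets.get(cls, nets[fallback])(board)
    return evaluate

# Stand-in classifier: pretends the first entry flags a containment position.
def toy_classify(board: Board) -> str:
    return "contain" if board[0] == 1 else "contact"

# Stand-in evaluators returning fixed, made-up equities.
nets = {
    "contact": lambda board: 0.10,
    "contain": lambda board: -0.25,
}

evaluate = make_dispatcher(toy_classify, nets, fallback="contact")
contain_equity = evaluate([1, 0, 0])
contact_equity = evaluate([0, 0, 0])
```

With this shape, the hard part is the classifier, not the wiring: once a
reliable test for containment positions exists, adding the net is one more
entry in the table.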

Ciao,
Misja

-----Original Message-----
From: address@hidden
[mailto:address@hidden Øystein
Johansen
Sent: Tuesday, November 18, 2003 12:13 AM
To: address@hidden
CC: Hugh Sconyers; gnubg (E-mail)
Subject: Re: [Bug-gnubg] bug in 3 ply equities


Joseph Heled wrote:
>
> I agree we need a better net for containment cases, a sub-class of
> crashed. The bad news is I am still unable to train such a beast,
> perhaps due to a chicken-and-egg problem, perhaps because of other
> problems as well. I need fresh ideas, which might come over time, or if
> someone else joins me in working on the nets.

Aha! Split the crashed net into "crashed" and "contain"! How many
positions do we need in a contain training database? Is it possible to
do _very_ simple manual rollouts of the positions, say 108 games with
variance reduction? Possible? Feasible? Distributed to volunteers, of
course.
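
One reason 108 is a convenient number: it is 3 x 36, so every one of the 36
ordered first rolls can be played exactly three times. The sketch below
shows that stratification of the first roll. This is one simple way to
reduce variance and is an assumption about what a short rollout might use;
it is not necessarily the variance reduction meant above, and the function
name is made up.

```python
import random
from itertools import product

def stratified_first_rolls(n_games, rng):
    """Return a shuffled list of first rolls, each ordered roll equally often.

    n_games must be a multiple of 36 so the first roll contributes no
    sampling noise of its own.
    """
    rolls = list(product(range(1, 7), repeat=2))   # 36 ordered dice pairs
    if n_games % len(rolls) != 0:
        raise ValueError("n_games must be a multiple of 36")
    plan = rolls * (n_games // len(rolls))
    rng.shuffle(plan)
    return plan

plan = stratified_first_rolls(108, random.Random(2003))
```

Each game in the rollout would then start with its assigned roll from
`plan` and use ordinary random dice afterwards.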

My personal brainstorming continues:
What if we bootstrap a contain net with TD training? Instead of starting
from the starting position, we start from a contain position,
or from several contain positions taken in random sequence. Does this seem
possible? I guess the weights might not converge to good values, but it
might be worth a try.
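
The idea of starting TD episodes from a pool of chosen positions rather
than from the opening can be sketched on a toy problem. The code below is
plain tabular TD(0) on a seven-state random walk, not backgammon and not
gnubg's trainer; the chain, the start pool, and all parameter values are
assumptions made for illustration.

```python
import random

def td0_from_start_pool(start_pool, n_states=7, episodes=20000,
                        alpha=0.02, seed=0):
    """Tabular TD(0) on a random walk, episodes begin at pooled states.

    States 0 and n_states - 1 are terminal with outcomes 0 and 1. Every
    episode starts from a state drawn at random from start_pool.
    """
    rng = random.Random(seed)
    last = n_states - 1
    values = [0.5] * n_states
    values[0], values[last] = 0.0, 1.0     # terminal values are the outcomes
    for _ in range(episodes):
        s = rng.choice(start_pool)
        while 0 < s < last:
            s_next = s + rng.choice((-1, 1))
            # TD(0) update: move V(s) toward the value of the next state.
            values[s] += alpha * (values[s_next] - values[s])
            s = s_next
    return values

values = td0_from_start_pool(start_pool=[2, 3, 4])
```

On this chain the true value of state s is s/6, and the estimates end up
close to that even for states that are never used as a start, because the
walks pass through them. Whether the same holds for a contain net depends
on whether self-play from contain positions visits the right kinds of
follow-up positions, which the toy cannot tell us.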

> However, positions such as the above are not a big concern at this stage. I
> know it is an eyesore to see such evaluations. I know your confidence
> might be shaken each time you see it, but my main concern is playing
> strength. GNUbg checkers play is reasonable here, and will not make
> wrong cube decisions at most scores. Later, when we get play problems
> out of the way, we can aim higher and see if we can get more absolute
> equities right.

I see your point. If we can just get slightly better checker play, we can
build a database of rollouts for this position type, which can be used
for supervised training.

-Øystein



