[gnugo-devel] Scoring of professional games.
From: Gunnar Farneback
Subject: [gnugo-devel] Scoring of professional games.
Date: Fri, 01 Aug 2003 12:45:46 +0200
At http://www.andromeda.com/people/ddyer/go/scoring-games.html, Dave
Dyer discusses the problem of scoring completed games and evaluates
the performance of a program he has written. This is done using a test
set of 623 scored professional games. Although he doesn't allow
redistribution of those games, he gives them away to anyone who asks
for them (or at least he used to).
So, how does GNU Go do on these? First, one should be aware that
professional game records generally end when there are no more points
to gain, i.e. before filling dame and without playing out threats to
make more points. To make things even more complex they sometimes end
with one or more endgame kos unresolved. Thus these tests are actually
more about endgame and small ko fighting skills than actual scoring
skills. I've done a lot of tuning based on this set, which is reflected
in the following table of the distribution of scoring errors (number of
games per error size, for each GNU Go version):
Error    3.0    3.2    3.4
--------------------------
0        404    448    512
1        146    112     99
2         37     31      7
3-5       18     11      2
6-10      13      9      2
11-        5     12      1
Thus GNU Go 3.4 gets the correct score for 82.2% of the games, is off
by at most one for 98.1% of the games and is off by at most two for
99.2% of the games.
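
The percentages above follow directly from the table. As a quick check, here is a minimal Python sketch that recomputes them. The names BUCKETS, COUNTS and cumulative_percentages are made up for this illustration and are not part of GNU Go or of Dave Dyer's program.

```python
# Error-size buckets and per-version game counts, copied from the table above.
BUCKETS = ["0", "1", "2", "3-5", "6-10", "11-"]
COUNTS = {
    "3.0": [404, 146, 37, 18, 13, 5],
    "3.2": [448, 112, 31, 11, 9, 12],
    "3.4": [512, 99, 7, 2, 2, 1],
}


def cumulative_percentages(counts):
    """Return, for each bucket, the percentage of games whose scoring
    error falls in that bucket or a smaller one (rounded to 0.1)."""
    total = sum(counts)
    running = 0
    result = []
    for count in counts:
        running += count
        result.append(round(100.0 * running / total, 1))
    return result


if __name__ == "__main__":
    for version, counts in COUNTS.items():
        shares = cumulative_percentages(counts)
        line = ", ".join(f"<={b}: {p}%" for b, p in zip(BUCKETS, shares))
        print(f"GNU Go {version} ({sum(counts)} games): {line}")
```

Each column sums to 623, the size of the test set, and the first three cumulative values for 3.4 reproduce the 82.2%, 98.1% and 99.2% quoted above.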
Obviously GNU Go would fare worse on an unbiased test set than on this
one, since I tuned against these games, but I don't think there's any
question that 3.4 is significantly stronger in this regard than
previous versions.
/Gunnar