This is a very interesting subject and one that I've
always wondered about with gnubg. More generally, it's easy to notice
an even/odd-ply "oscillation" effect with gnubg when you compare the
equities of, e.g., 0-, 1-, 2-, 3- and 4-ply evaluations. This effect is much
less pronounced in, e.g., Snowie 4.
It is important to start with a distinction between cube and
checker play evaluations, since they are effectively 1-ply shifted
(rolling the dice after you consider the cube decision is equivalent
to one ply). So odd/even effects are sort of mirrored between cube and
checker play evaluations.
For cube decisions, in by far most positions you come across in
backgammon games, gnubg odd-ply gives a higher equity than even-ply.
Again for cube decisions, many rollouts over the years suggest that
for most positions, even-ply cube evaluations are better (i.e. closer
to the rollout, as well as believed by expert players to be more
accurate). For that reason, I'd sure be willing to back gnubg's
even-ply cube evaluations against its odd-ply cube evaluations, when it
comes to playing complete backgammon games. I think you can say there's
consensus among experienced gnubg users and expert backgammon players
that in general, 2-ply CUBE is better than 3-ply CUBE.
Ian Shaw's example of high anchor holding games is pretty much the exception to this rule.
For CHECKER play, things are different. First of all, when you look at the
equities for checker play evaluations, the effect is reversed because
of the one ply shift. So now even-ply nearly always gives a higher
absolute equity than odd-ply, and this time, it's mostly the odd-ply
equities that are closer to rollout figures.
However, for making move decisions, the absolute equities are not
of direct importance; it's all about the relative ordering of candidate
moves, or even simpler, just having the best move on top suffices for
checker play, regardless of whether the absolute equity estimate is
accurate. It turns out that the general overestimation of equities at
even-ply doesn't seem to harm gnubg's checker play; quite the contrary.
Gnubg seems to be relatively better at checker play at even-ply; however,
an extra ply of lookahead seems useful enough for checker play that
going to odd-ply is often still just worth it.
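To make the "ordering matters, absolute equity doesn't" point concrete, here is a minimal Python sketch. The move names and equities are invented for illustration (they are not gnubg output), and it assumes the simplest possible case, where even-ply overestimates every candidate by the same amount:

```python
# Hypothetical odd-ply equities for three candidate moves.
# The numbers are made up for this sketch, not taken from gnubg.
odd_ply = {"move A": 0.112, "move B": 0.095, "move C": 0.081}

# Assumption: even-ply overestimates every candidate by the same amount.
BIAS = 0.030
even_ply = {move: equity + BIAS for move, equity in odd_ply.items()}


def ranking(equities):
    """Return the moves sorted from highest to lowest equity."""
    return sorted(equities, key=equities.get, reverse=True)


best_odd = ranking(odd_ply)[0]
best_even = ranking(even_ply)[0]
print("odd-ply picks: ", best_odd)
print("even-ply picks:", best_even)
```

Both rankings come out identical, so the same move ends up on top. In real evaluations the overestimation will not be exactly uniform across candidates, which is where 2-ply and 3-ply can start to disagree about the best move.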
As far as I know, no one has ever done any serious statistical
study of 3-ply checker play (the time involved is a problem). So it's
mostly observational evidence from experienced users that 3-ply overall
is hardly better than 2-ply, if at all, for checker play. It certainly
is different though. The problem is that whereas 3-ply gets some
problems right that 2-ply gets wrong, the reverse also happens quite a
lot: 3-ply gets problems wrong that 2-ply got right.
There have been serious statistical tests of 0-ply, 1-ply and 2-ply
checker play though, and if I remember the figures correctly, the
result is that 1-ply gains 0.012ppg over 0-ply checker play, and that
2-ply gains 0.060ppg. This shows how relatively small the gain of 1-ply
over 0-ply is. It's interesting to note that Snowie (4) users report
something quite different; Snowie without lookahead plays relatively
badly (much worse than gnubg 0-ply), but Snowie gains a lot when looking
a ply ahead.
I'll put some examples in another post to make things more clear.
To experiment with this phenomenon yourself, a quick way is to evaluate
a position at 0-, 1-, 2-, 3- and 4-ply and copy/paste all the results into
the annotation section, then do a good rollout and compare the results
to these 5 evaluations.
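If you'd rather tabulate the comparison than eyeball it, something like the following Python sketch would do. The equities below are placeholders I made up to show the shape of the oscillation; replace them with the numbers you copied from gnubg for your own position. The function names are just mine, not anything from gnubg.

```python
# Placeholder numbers only: substitute the equities from your own position.
rollout_equity = 0.250
evaluations = {0: 0.310, 1: 0.240, 2: 0.295, 3: 0.255, 4: 0.280}


def errors_vs_rollout(evals, rollout):
    """Signed difference (evaluation minus rollout) for each ply."""
    return {ply: round(equity - rollout, 3) for ply, equity in evals.items()}


def mean_abs_error(errors, parity):
    """Average absolute error over plies of one parity (0 = even, 1 = odd)."""
    picked = [abs(err) for ply, err in errors.items() if ply % 2 == parity]
    return sum(picked) / len(picked)


errors = errors_vs_rollout(evaluations, rollout_equity)
for ply, err in sorted(errors.items()):
    print(f"{ply}-ply: {err:+.3f}")
print("even-ply mean abs error:", round(mean_abs_error(errors, 0), 3))
print("odd-ply mean abs error: ", round(mean_abs_error(errors, 1), 3))
```

A positive sign means the evaluation is more optimistic than the rollout. One position proves nothing, of course; the pattern only becomes meaningful once you have repeated this over a decent number of positions, and separately for cube and checker play.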
If you're just interested in "strong" settings for gnubg that don't
take too long, the advice is simple: use even-ply both for cube and
checker play; preferably 2-ply for both (assuming you can't afford
4-ply or higher...).