[Bug-gnubg] TD(lambda) training for neural networks -- a question
From: boomslang
Subject: [Bug-gnubg] TD(lambda) training for neural networks -- a question
Date: Thu, 21 May 2009 00:12:46 +0000 (GMT)
Hi all,
I have a question regarding TD(lambda) training by Tesauro (see
http://www.research.ibm.com/massive/tdl.html#h2:learning_methodology).
The formula for adapting the weights of the neural net is
w(t+1)-w(t) = a * [Y(t+1)-Y(t)] * sum(lambda^(t-k) * nabla(w)Y(k); k=1..t).
I would like to know whether nabla(w)Y(k) in the formula above is the gradient
of Y(k) with respect to the weights of the net at time t (i.e. the current net)
or with respect to the weights of the net at time k. I assume the former.
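For what it's worth, here is a minimal sketch of how the update is often
implemented incrementally with an eligibility trace, where the sum over k is
maintained as a running trace. Note that in this form each gradient term is
computed with the weights that are current at time k and then folded into the
trace. This is an illustrative example only, not gnubg's implementation: the
linear value function Y(s) = w.s (so that nabla(w)Y = s), gamma = 1, and the
function name are all assumptions made for the sketch.

```python
# Sketch of incremental TD(lambda) with an eligibility trace.
# Assumptions (not from the original post): linear value function
# Y(s) = w @ s, so grad_w Y(s) = s; no discounting (gamma = 1).
import numpy as np

def td_lambda_episode(states, rewards, w, alpha=0.1, lam=0.7):
    """Run one episode of TD(lambda) updates.

    states  : list of feature vectors s(0)..s(T)
    rewards : list of rewards r(0)..r(T-1)
    w       : initial weight vector
    """
    e = np.zeros_like(w)                # eligibility trace: sum of decayed gradients
    for t in range(len(states) - 1):
        y_t = w @ states[t]             # Y(t) under the current weights
        y_t1 = w @ states[t + 1]        # Y(t+1) under the current weights
        delta = rewards[t] + y_t1 - y_t # TD error, Y(t+1) - Y(t) plus reward
        e = lam * e + states[t]         # gradient at time t enters the trace here
        w = w + alpha * delta * e       # w(t+1) - w(t) = alpha * delta * trace
    return w
```

The trace `e` accumulates `lambda^(t-k) * nabla(w)Y(k)` without storing past
gradients explicitly, which is why the question of "which weights" matters:
each term was evaluated with the weights in effect when it was added.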
Thanks in advance!
greetings, boomslang