[Neurostat-develop] API proposal


From: Fabrice Rossi
Subject: [Neurostat-develop] API proposal
Date: Thu, 07 Mar 2002 17:58:02 +0100

Yes folks, the project is not entirely dead!

Joseph and I had a discussion off-list about the specification of the low
level API. Here is a basic summary of our conclusions:

------------------------------------------------------------

1) The goal is to implement layered architectures. An MLP architecture is
   described by l layers (we consider only active layers, that is, we don't
   include an "input layer"). We have "full" connections between successive
   layers (layer k receives the output of layer k-1). In addition, a layer
   may receive the outputs of all preceding layers, including the input
   itself (these are the skipping connections).
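
To fix the notation (the exact convention is of course open to discussion):
if x denotes the input, y_k the output of layer k, f_k its activation
function and (W_k, b_k) its weights, layer k computes

    y_k = f_k( W_k [x ; y_1 ; ... ; y_{k-1}] + b_k )

when all skipping connections towards layer k are present; otherwise only
the connected blocks appear inside the brackets (the brackets denote
concatenation).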

2) The architecture of the MLP (number of layers, number of neurons per
   layer, input size, activation functions and skipping connections) is
   described by a struct, sketched below.
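
Something along these lines, where the field names and types are only
placeholders of mine, nothing we have settled on:

typedef double (*activation_fun)(double);

typedef struct {
    int nb_layers;           /* number of active layers (l)              */
    int nb_inputs;           /* dimension of the input vector            */
    int *layer_size;         /* layer_size[k]: neurons in layer k        */
    activation_fun *act_fun; /* act_fun[k]: activation of layer k        */
    int **connected;         /* connected[k][j] != 0 if layer k receives
                                the output of layer j, index 0 standing
                                for the input                            */
} MLP;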

3) Forward calculation is performed by a function with this prototype:

int propagate(MLP *arch,double *weights,double *input,double *output,
              Activation *act,Activation *dact)

- arch is a pointer to the MLP struct
- weights is the weight vector
- input is the input vector
- output is the place to write the output of the MLP
- act is a pointer to an Activation struct which is used for the following
  tasks:
  - the output of each layer must be stored in order to prepare the
    calculation of the following layers
  - these outputs will be used for the back-propagation
- dact is also a pointer to an Activation struct, which is used to store the
  derivatives of the output of each layer; these will be used by the
  back-propagation. If the pointer is null, the corresponding information is
  not calculated.
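
To make this concrete, the Activation struct could simply hold one array of
values per layer, and a typical call would look like the sketch below (the
struct layout and the allocate_activation helper are purely hypothetical,
just to illustrate the intended usage):

typedef struct {
    int nb_layers;    /* same as in the MLP struct                        */
    double **values;  /* values[k]: layer_size[k] doubles holding, for
                         each neuron of layer k, its output (or the
                         derivative of its output when used as dact)      */
} Activation;

/* arch, weights, input and output are assumed to be already allocated */
Activation *act  = allocate_activation(arch);   /* hypothetical helper */
Activation *dact = allocate_activation(arch);

if (propagate(arch, weights, input, output, act, dact) != 0) {
    /* handle the error (return convention still to be decided) */
}
/* act->values[k] now holds the outputs of layer k and dact->values[k]
   the corresponding derivatives, ready for back_propagate and
   weights_derivative                                                  */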

4) Back propagation is performed by a similar function:

int back_propagate(MLP *arch,double *weights,double *error,double *jacobian,
                   Activation *dact)

The very important point here is that jacobian is a pointer to a matrix
which contains the partial derivatives of the "error" with respect to the
outputs of the neurons. It does not contain the Jacobian of the error
considered as a function of the weights.

error is a pointer to a matrix of values to back-propagate. Technically it
is the derivative of the "error" with respect to the outputs of the neurons
of the last layer. This makes it possible to back-propagate anything,
including for instance the output of the network itself, so as to calculate
the Jacobian matrix of the network.

It might be a good idea to provide an optimized version when error is a vector
rather than a matrix.

As in propagate, dact is a pointer to an Activation struct that has been
used to store derivatives during the propagation phase. act is not needed
during the back-propagation phase.
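
For instance, for a quadratic error on a single example, the error matrix
reduces to a vector and the call could look like this (a sketch only, using
the placeholder struct fields from 2); output comes from propagate, target
is the desired output, and error/jacobian are allocated by the caller with
one entry per output neuron and per neuron of the network respectively):

/* derivative of 0.5 * ||output - target||^2 with respect to the outputs
   of the last layer: simply output - target                             */
int n_out = arch->layer_size[arch->nb_layers - 1];
int i;

for (i = 0; i < n_out; i++)
    error[i] = output[i] - target[i];

back_propagate(arch, weights, error, jacobian, dact);
/* jacobian now holds the derivative of the quadratic error with respect
   to the output of every neuron of the network                          */

To get the Jacobian of the network output itself, error would instead be
the n_out x n_out identity matrix.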

5) Calculation of derivatives with respect to network weights is done by
   another function:

int weights_derivative(MLP *arch,Activation *act,Activation *dact,
                       double *jacobian,double *w_jacobian)

act and dact come from propagate, whereas jacobian comes from
back_propagate.

w_jacobian is a matrix in which the Jacobian of what has been back-propagated
is stored after the calculation. The weights are not needed during this
calculation.
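
Continuing the quadratic error sketch from 4), with w_jacobian allocated by
the caller (one entry per weight in the vector error case), the call would
simply be:

weights_derivative(arch, act, dact, jacobian, w_jacobian);
/* w_jacobian now holds the derivative of the quadratic error with respect
   to every weight, i.e. the gradient used by the training algorithms     */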

6) Calculation of derivatives with respect to network inputs is done by yet
   another function:

int inputs_derivative(MLP *arch,double *weights,Activation *dact,
                      double *jacobian,double *in_jacobian)

The rationale of this function is exactly the same as for the previous one.
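
The corresponding call, with in_jacobian allocated by the caller (one entry
per input component in the vector error case), would be:

inputs_derivative(arch, weights, dact, jacobian, in_jacobian);
/* in_jacobian now holds the derivative of what has been back-propagated
   with respect to each component of the input vector                    */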

------------------------------------------------------------

I think the next step is to write very high level algorithms for these
functions, so as to check that everything is correct. I don't think we need
to provide hooks for second order derivative calculations right now, but we
still have to think about them in order to be sure that no major revision of
the low level architecture will be needed.

Comments are welcome.

Fabrice


