
Re: [Neurostat-develop] first ideas...


From: Fabrice Rossi
Subject: Re: [Neurostat-develop] first ideas...
Date: Tue, 11 Dec 2001 15:29:38 +0100

Joseph Rynkiewicz wrote:
> 
> The first (maybe not the last) neural object to implement is the famous
> Multilayer Perceptron (MLP).
> 
> The idea is to build an "R" library because:
> 
> 1) It's the best free statistical software.
> 2) We don't reinvent the wheel for the pre- and post-processing of data.

I would also like our library to be usable without R, for instance as an
add-on component to GSL. That's why I think we should focus on very low
level primitives: computation of the output and of the differentials of an
MLP. I don't think it's a good idea to provide more, because everything else
needed, especially optimization algorithms, should be provided by the
environment (for instance R or GSL). It would be better for us to contribute
missing bits to GSL and to R than to integrate them in our library.
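
To make this concrete, here is a very rough sketch of what such primitives
could look like (all names here are invented, just to support the
discussion):

  /* Hypothetical API sketch: an opaque descriptor plus two entry
     points, one for the forward pass and one for back-propagation. */

  typedef struct mlp mlp;   /* opaque architecture descriptor */

  /* Compute the network outputs for one input vector. */
  int mlp_output(const mlp *net, const double *weights,
                 const double *input, double *output);

  /* Compute the gradient of a scalar error with respect to the
     weights by back-propagation; d_error holds dE/d(output). */
  int mlp_gradient(const mlp *net, const double *weights,
                   const double *input, const double *d_error,
                   double *grad);

The environment (R or GSL) would then drive its own optimizer on top of
these two calls.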

> I think that we can build a more ambitious library, especially by relaxing
> the constraint on the number of layers and by allowing the MLP to be pruned.

I strongly support this point. Pruning is very important in practice and is
generally not included in available libraries.

> This goal has two consequences :
> 
> (1) We have to carefully consider the implementation of the architecture
> of the MLP, especially the possibility of shortcut connections jumping
> over layers. Although we use "C", it can be a good idea to adopt an
> object-oriented philosophy and make heavy use of "typedef struct...".

Of course.
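
For instance, something along these lines (purely illustrative, nothing
definitive) could describe an architecture where shortcut connections are
just additional connection blocks:

  /* Illustrative sketch: one connection block per pair of linked
     layers, so a shortcut jumping over layers is simply a block
     with from_layer and to_layer more than one level apart. */

  typedef struct {
      int from_layer;    /* source layer index           */
      int to_layer;      /* destination layer index      */
      double *weights;   /* weight matrix for this block */
  } mlp_connection;

  typedef struct {
      int n_layers;
      int *layer_sizes;              /* units per layer           */
      int n_connections;
      mlp_connection *connections;   /* includes shortcut blocks  */
      double **biases;               /* one bias vector per layer */
  } mlp;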

> (2) It's more elegant to use matrices with holes (sparse matrices) to
> implement the connections between the layers, since our MLP has to be
> pruned.
> 
> So, I propose to use sparse matrices for the connections. Moreover, I
> think it's a good idea to use a separate "sparse vector" for the
> bias connections, since their role is very different in the
> back-propagation algorithm.

We have a problem here, for several reasons:

1) With dense matrices, we can use efficient BLAS implementations (ATLAS
seems to give quite good results compared to vendor-optimized BLAS). There is
a sparse BLAS on netlib (in Fortran), but I don't think it is close to the
optimization level of ATLAS (the ATLAS authors plan to work on sparse
matrices, but I think that is not the case right now).

2) When the number of holes is small, the simplest way to store such matrices
and to compute with them is to use a dense representation. The main problem
is that we then have to force the learning algorithm to keep the missing
weights at zero. It is easy to replace the actual partial derivative of the
error with respect to a pruned weight by zero. Conjugate-gradient-like
algorithms are linear (i.e., the descent direction is a linear function of
the previous direction and of the gradient), and therefore they should work
with this scheme quite easily. But other algorithms will need specific
support.
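
For instance, with a 0/1 mask recording which weights survived pruning,
zeroing the relevant gradient components is trivial (sketch):

  /* Sketch: after back-propagation, clear the gradient components
     of pruned weights, so a linear descent method never moves them
     away from zero. */
  void mask_gradient(double *grad, const unsigned char *mask, int n)
  {
      int i;
      for (i = 0; i < n; i++)
          if (!mask[i])
              grad[i] = 0.0;
  }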

3) In general, if we want to use optimization algorithms, we need to rely on
a flat representation of the numerical parameters of the MLP (a "vector"). I
don't know much about sparse matrices, so I wonder how this requirement can
be achieved.
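
At least with the dense representation, one possible answer is to let the
connection matrices be views into a single contiguous array (again a
hypothetical sketch):

  /* Sketch: all numerical parameters live in one flat array; each
     connection matrix is an offset into it, so the optimizer sees a
     plain vector of length n_params. */

  typedef struct {
      int n_params;
      double *theta;     /* the flat parameter vector             */
      double **blocks;   /* blocks[k] = theta + offset of block k */
  } mlp_params;

With a sparse representation this is less obvious, since only the non-zero
entries should appear in the vector.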

I do agree that mixing bias vectors and connection matrices is not a good
idea.

> The code is not optimized, but we can wait until we have a working MLP
> before thinking about optimization.

That's right, but the design must be compatible with future optimization.

I propose to start working on a prototype C API as a basis for discussion. I
will try to post something before Xmas.

Fabrice


