
Re: [Neurostat-develop] first ideas...


From: Joseph Rynkiewicz
Subject: Re: [Neurostat-develop] first ideas...
Date: Thu, 13 Dec 2001 23:44:29 +0100 (MET)

>I agree that the speed-up is not so important, but still, for a massive
>search such as genetic optimization of the network architecture, even a 20%
>speed gain is important. Nevertheless, I don't think there is a problem
>working both with a dense storage and with a sparse one, mainly thanks to
>the "object" approach.

I have used this approach in my C++ program (hence with an "object"
approach). I wrote an object layer with both a sparse representation and a
dense representation. The dense representation was used for the propagation
(and the back-propagation) when the proportion of zeros in the matrix was
less than 20%.

I can say that 99% of the vicious bugs were caused by this ambiguous
representation.

Namely, the BIG problem is the synchronization of the two representations.
If you update the weights of the MLP, you have to copy the new parameters
into the dense representation (for the sparse representation this is
automatic, since it points directly to the weights).

I know it seems obvious that you should never forget to call a
synchronization routine as soon as you update the weights, but one day, in a
complex optimization algorithm in a mixture-of-experts model, you will
forget to do it, and you will spend one month tracking down the bug...

Because I have already used this optimization "technique", I really think
that it is a not-so-small overhead for a very small gain.

To reinforce my opinion, I will cite God:

"We should forget about small efficiencies, say about 97% of the time:
premature optimization is the root of all evil." -- Donald Knuth

At the least, we can keep this double representation for later.

(Yes, I know I seem to have an obsession with sparse matrices, but I have
already written a prunable MLP; it is really not so obvious to manage the
holes in the architecture, and I think that a sparse notation like
compressed row or column storage helps a lot.)
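As an illustration, here is a minimal sketch (hypothetical field names) of
one pruned layer in compressed row storage: only the surviving weights are
stored, so the forward pass never has to test for holes.

typedef struct {
    int     n_out, n_in;
    int    *row_ptr;   /* size n_out+1: weights of output unit i live in   */
                       /* w[row_ptr[i] .. row_ptr[i+1]-1]                  */
    int    *col;       /* input index of each stored weight                */
    double *w;         /* the surviving (non-pruned) weights               */
    double *bias;      /* size n_out                                       */
} SparseLayer;

void sparse_forward(const SparseLayer *l, const double *in, double *out)
{
    int i, k;
    for (i = 0; i < l->n_out; i++) {
        double s = l->bias[i];
        for (k = l->row_ptr[i]; k < l->row_ptr[i + 1]; k++)
            s += l->w[k] * in[l->col[k]];
        out[i] = s;    /* activation function left to the caller */
    }
}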

>Ok, so here is how I see a possible header:

>int propagate(MLP *m,double *param,int param_size,double *input,int in_size,
>              double *result,int out_size)

>The return value is an error code or something similar (maybe it is not
>needed). (double *,int) pairs are "vectors". Everything else is embedded in
>the MLP struct. It contains of course a description of the architecture,
>which is basically a way to interpret the parameter vector (i.e. param),
>which can contain dense matrices and vectors, or sparse ones. We also embed
>in the struct all the needed cache (pre-output of each layer, etc.).

>I don't know if we really need to have all the sizes because they are
>already in the MLP struct. But it allows at least to test for adequacy.
>Maybe it is not needed in the low level API.

>Comments?

I think that we can embed the param in the MLP too, essentially because a
lot of the decisions about this parameter vector are made inside the MLP.
For example, when you decide to prune one weight, you have to inspect the
architecture of the MLP (to look for unused hidden units) and recompute the
architecture, and hence the parameter dimension. Then you have to
reconstruct a new parameter vector, because some coefficients may have
become useless. This interaction between the architecture and the parameters
argues for integrating the parameters inside the MLP.
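Here is a minimal sketch (hypothetical struct, not a proposed API) of the
bookkeeping a single pruning step forces: after removing a weight, hidden
units that have lost all their inputs or all their outputs have to be
dropped, and the parameter dimension changes as a side effect.

typedef struct {
    int  n_in, n_hid, n_out;
    int *mask_ih;   /* n_hid x n_in  : 1 if connection hidden<-input exists  */
    int *mask_ho;   /* n_out x n_hid : 1 if connection output<-hidden exists */
} Arch;

/* Drops hidden units with no remaining fan-in or fan-out and returns the
 * new number of weights. */
int arch_prune_dead_units(Arch *a)
{
    int i, j, o, n_params = 0;
    for (i = 0; i < a->n_hid; i++) {
        int fan_in = 0, fan_out = 0;
        for (j = 0; j < a->n_in;  j++) fan_in  += a->mask_ih[i * a->n_in + j];
        for (o = 0; o < a->n_out; o++) fan_out += a->mask_ho[o * a->n_hid + i];
        if (fan_in == 0 || fan_out == 0) {         /* dead unit: cut it off */
            for (j = 0; j < a->n_in;  j++) a->mask_ih[i * a->n_in + j] = 0;
            for (o = 0; o < a->n_out; o++) a->mask_ho[o * a->n_hid + i] = 0;
        }
    }
    for (i = 0; i < a->n_hid * a->n_in;  i++) n_params += a->mask_ih[i];
    for (i = 0; i < a->n_out * a->n_hid; i++) n_params += a->mask_ho[i];
    return n_params;
}

The caller then has to reallocate the parameter vector to the returned size
and copy the surviving coefficients, which is much easier to keep consistent
when both the architecture and the parameters live inside the MLP struct.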

So the header could be

int propagate(MLP *m,double *input,int in_size,double *result,int out_size)

with a lot of things in the MLP struct (parameters, architecture, ...).

Finally, the sizes can be kept, and we can test the adequacy of the sizes
for debugging purposes (with #ifdef DEBUG ... #endif).
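For example, a sketch of how the checks could look (the MLP fields shown
here are assumptions, just for illustration):

/* Assumed minimal fields; the real struct would also hold the parameters,
 * the architecture and the forward-pass caches. */
typedef struct {
    int n_in;
    int n_out;
} MLP;

int propagate(MLP *m, double *input, int in_size, double *result, int out_size)
{
#ifdef DEBUG
    if (in_size != m->n_in || out_size != m->n_out)
        return -1;                /* size mismatch: return an error code */
#endif
    /* ... forward pass using the parameters and caches stored in *m ... */
    return 0;
}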

Joseph